HughesTech engineers discover the root cause of hard drive crashes on a DELL server running Linux. One of our DELL servers would have a major hard drive crash about every seven (7) days.
The server would reboot, scan the hard drive, wait a few seconds for a root login at the console, then repeat the cycle. Our engineers would log in at the console, run the ext2 fsck, and repair the hard drive.
One engineer, while logged in as root, noticed that the system time under Linux was off by about 4 hours. He reset the time in Linux, rebooted the server, and set the time in the BIOS setup (the clock kept alive by the CMOS battery). Everything was fine for another seven (7) days.
After those seven (7) days we had another hard drive crash on the same DELL server. The engineers repeated the same steps and again noticed that the time was off by 4 hours. This time they replaced the CMOS battery (CR2032) for the real-time clock and reset the time, and that was the fix!
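For anyone who wants to watch for the same symptom, here is a minimal sketch of comparing the hardware real-time clock against the Linux system time. It is not the procedure our engineers used; the sysfs path, the function names, and the 60-second threshold are all assumptions chosen for illustration, and the comparison only makes sense if the hardware clock is kept in UTC.

```python
import time
from pathlib import Path

# Assumed sysfs location of the first RTC device. It may be missing or
# named differently on a given machine.
RTC_EPOCH_PATH = Path("/sys/class/rtc/rtc0/since_epoch")


def clock_drift_seconds(rtc_epoch, system_epoch):
    """Return how far the system clock is ahead of the RTC, in seconds."""
    return system_epoch - rtc_epoch


def is_suspicious(drift_seconds, threshold_seconds=60.0):
    """Flag a drift larger than the (arbitrary, illustrative) threshold."""
    return abs(drift_seconds) > threshold_seconds


def read_rtc_epoch(path=RTC_EPOCH_PATH):
    """Read the RTC time as seconds since the epoch, or None if unavailable.

    The value is only comparable to time.time() if the RTC holds UTC.
    """
    try:
        return float(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    rtc = read_rtc_epoch()
    if rtc is None:
        print("Could not read the RTC; check the path for this machine.")
    else:
        drift = clock_drift_seconds(rtc, time.time())
        print(f"System clock minus RTC: {drift:.0f} seconds")
        if is_suspicious(drift):
            print("Large drift; the CMOS battery may be worth checking.")
```

Run from cron, a check along these lines might have flagged the 4-hour offset before the next crash, although a large drift can have causes other than a weak battery.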
Imagine that: a CMOS battery that costs about $1.00 taking down an entire server!
We thought we would share this for others who may be having the same issue.