Tuesday, July 17, 2012

The Long Road

When a man sets forth to carry a cat by its tail, he learns something valuable that will never grow dim or doubtful - Mark Twain

For the last few months I have been trying to repair the RAID-5 system on Gemini. It did not fail completely, but a couple of disks developed unrecoverable media errors at the same time. Much to my surprise, this means that even if I replace a failing disk with a perfectly good new disk, the RAID system cannot rebuild it, because the rebuild has to read every other disk in the array and one of those has media errors too.
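To see why a second disk with media errors blocks the rebuild, here is a minimal Python sketch of RAID-5 style XOR parity. It is only an illustration, not how the Intel or LSI firmware is actually written: the 4-disk stripe, the block contents, and the function names are all made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(surviving_blocks):
    """Rebuild the missing block of a stripe from every surviving block.

    A None entry stands for an unreadable sector (media error). The XOR
    needs every surviving block, so a single None makes the rebuild fail.
    """
    if any(block is None for block in surviving_blocks):
        raise IOError("media error on a surviving disk: stripe cannot be rebuilt")
    return xor_blocks(surviving_blocks)


# One stripe on a hypothetical 4-disk array: three data blocks plus parity.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(data)

# Disk 0 is replaced: its block comes back from the other data blocks and parity.
recovered = rebuild_block([data[1], data[2], parity])

# Same rebuild, but disk 2 has an unreadable sector in this stripe.
try:
    rebuild_block([data[1], None, parity])
    rebuild_failed = False
except IOError:
    rebuild_failed = True
```

With single parity there is exactly one equation per stripe, so it can only solve for one unknown block. Two bad blocks in the same stripe leave nothing to solve with.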

Obviously the people who wrote the RAID software are complete morons.

It took me several tries and ultimately a phone call to Intel Technical Support to figure this out. All they could suggest was to back up my files and reinitialize the RAID configuration. Thanks for nothing.

As I was going to have to reinitialize my RAID configuration I decided to get a real RAID Controller rather than rely on the Intel Software RAID implemented in the BMC (Baseboard Management Controller) of the system board. I have generally had nothing but problems with it from day one.

I ultimately decided on the LSI 9266-8i with cache backup. This is a high end SAS (Serial Attached SCSI) controller - so it can support both SAS and SATA (Serial ATA) storage devices. It has a 1 GB RAM cache to speed up operations, and an onboard flash memory with a bank of capacitors. In the event of a power failure, the capacitors will power the controller long enough to copy the cache to persistent flash memory. The next time the system powers on it will automatically restore the cache so you don't lose any data. The advantage of capacitors over batteries is they are more robust, last longer, and require less maintenance.

While the 9266 is an 8-lane PCI Express 2.0 device, sadly my system only has a 4-lane PCI Express 1.0 slot available. The card seems to run fine in this slot, but it just means there will not be as much I/O bandwidth available. While there will likely not be that kind of bandwidth to disk, the 9266 has a 1 GB RAM cache that is very fast.
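As a rough back-of-the-envelope check on how much bandwidth the slot costs me, here is the arithmetic in Python. The per-lane figures are the usual theoretical rates per direction after 8b/10b encoding overhead (about 250 MB/s for PCI Express 1.0 and 500 MB/s for 2.0). Real throughput will be lower, and I am assuming the card simply negotiates down to a 1.0 x4 link.

```python
# Approximate usable bandwidth per lane, per direction, in MB/s,
# after 8b/10b encoding overhead (theoretical figures, not measurements).
PER_LANE_MB_S = {"1.0": 250, "2.0": 500}


def link_bandwidth_mb_s(generation, lanes):
    """Theoretical one-direction bandwidth of a PCI Express link in MB/s."""
    return PER_LANE_MB_S[generation] * lanes


card_native = link_bandwidth_mb_s("2.0", 8)  # what the 9266 is designed for
my_slot = link_bandwidth_mb_s("1.0", 4)      # what my system board offers

print(f"9266 native link: {card_native} MB/s")
print(f"My slot:          {my_slot} MB/s")
print(f"Ratio:            {card_native // my_slot}x")
```

So on paper the slot offers about a quarter of what the card could use.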

Because the 9266 can support 8 disks I decided to purchase a couple of extra mounting brackets and two more enterprise class 2 TB disks. Consequently I will have an 8 disk array, instead of a 5 disk array, and a spare replacement disk. The 9266 can actually support up to 256 SAS storage devices if you daisy chain them together, but I have nowhere near that much room in my computer case.

I think I have learned my lesson with RAID 5 so I will use RAID 6 instead. While the write performance will be less than RAID 5, the cache on the 9266 seems to make a big difference so my performance will be better overall than using the on-board RAID 5. With RAID 6 you can lose two storage devices in the array without failure. I now know from first hand experience how likely it is to lose two devices at the same time.
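The price of the extra protection is one more disk's worth of capacity. A small Python sketch of the trade-off, assuming all eight disks in the array are 2 TB (the function and table names are just for illustration):

```python
# Number of disks' worth of capacity each RAID level spends on parity,
# which is also the number of disk failures it can survive.
PARITY_DISKS = {"RAID 5": 1, "RAID 6": 2}


def usable_tb(level, disks, disk_tb):
    """Usable capacity in TB for an array of identical disks."""
    parity = PARITY_DISKS[level]
    if disks <= parity:
        raise ValueError("not enough disks for " + level)
    return (disks - parity) * disk_tb


raid5_tb = usable_tb("RAID 5", 8, 2)
raid6_tb = usable_tb("RAID 6", 8, 2)

print(f"RAID 5: {raid5_tb} TB usable, survives {PARITY_DISKS['RAID 5']} failed disk")
print(f"RAID 6: {raid6_tb} TB usable, survives {PARITY_DISKS['RAID 6']} failed disks")
```

Giving up 2 TB out of 14 to survive a second failure seems like a cheap trade after the last few months.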

It is interesting to note that the Intel Embedded Server RAID Technology II (ESRT2) used on the S5520SC is based on LSI technology, so the LSI RAID Management Software can manage both systems at the same time. Sadly, the 9266 has the same frustrating limitation that the ESRT2 does - namely that you cannot create virtual disks on the same array (disk group) with different RAID schemes. You can do this on other RAID systems, however, such as those with Intel's Matrix Storage Manager. For example with MSM, you can create a RAID 0 virtual disk (for a fast system drive) and a RAID 5 virtual disk (for a reliable data drive).

When I finally had all the pieces together to migrate everything from the old RAID to the new RAID, I then began the week from hell. Generally this should be simple.
  1. Back up all the stuff you want to keep.
  2. Reconfigure the RAID.
  3. Restore all the stuff you saved.
Sadly I had nothing but problems doing this. Part of the problem was I needed to move the disks from the old RAID to the new RAID, so I could not do a direct transfer. Also, I needed a way to boot my system independent of any of the RAID systems.
  • The first solution I thought of was to install Windows 8 to a USB Thumb Drive. This is a new feature Microsoft has added to Windows 8, and I found a nice article on a simple way to do this.
  • The first time I tried this it worked fairly well and I was able to boot Windows 8 from the new high-speed ADATA thumb drive I just purchased for this purpose. While running Windows 8 from here was fairly effective, it was still slow as thumb drives are not all that fast. Everything was going along fine: I was able to install the RAID management software and Acronis True Image there. I was almost finished doing all the backups when my system rebooted for mysterious reasons, before my backup was finished.
  • After restarting my system I was no longer able to boot from the ADATA drive; it complained there were no boot records. I tried to repair the boot records, but still no luck. I tried reinstalling Windows, but I still could not boot from it - boot-block problems still. Also, half the time or more, my system did not recognize the ADATA as a boot device. After an insane number of hours playing this stupid game I decided that because the ADATA is actually a USB 3.0 thumb drive, my Intel S5520SC does not play well with it. Consequently, this is not a reliable solution.
  • Then I tried installing Windows 8 to my Patriot thumb drive. While this is a USB 2.0 device, it is also much slower. When I tried to boot Windows 8 the setup took almost 10 times longer than with the ADATA, and many things did not work well because all sorts of Windows processes would time-out. Windows is exceptionally poor when running on/with slow or problematic file systems.
  • While I was still deciding on a boot solution, I tried backing up my old RAID volumes again, so I decided to install Acronis True Image on my main system. Before the installer finished my system 'blue screened' and after restarting it would not boot. I had to go into Windows system repair which started a 'system restore' - I forgot how insanely long a system restore is, over 3 hours. In the end there was a message that the system restore failed. I tried rebooting and it worked. However, my system was extremely unstable and I was having problems with all my storage devices.
  • I got on the phone with Acronis technical support explaining the problem. They had to install a special program to uninstall their software because the regular installer would not work. Then they had to muck around in the registry to remove all the crap their installer put there. Finally they downloaded a newer build of their installer and told me to try that after rebooting. When I tried the new installer, it got a little further, but again blue screened my system. Clearly a pattern here. I had to repeat the software removal experience with technical support yet again, but after that my system was still not stable.
  • After getting my system semi stable again, I tried a variety of other backup solutions, but had trouble with most of them for one reason or another.
  • etc, etc, etc
Here is the solution that ultimately worked, but it was not without its own problems.
  • I simply gave up expecting Windows to boot and run from any USB device; whether flash or rotating disk, none were reliable enough.
    • I wasted many days of time trying to create a stable, bootable USB drive.
    • I decided I could bootstrap the process by installing Windows 8 on the new array and then using that to restore the Gemini system images.
  • Ultimately I found that DriveImage XML was the best product for backing up drive images to external USB devices. In particular I backed up both the Windows System (C:) and System Reserved partitions using DriveImage XML. For the Data (D:) partition I simply used a normal Windows file copy to an external drive.
    • I wasted a lot of time because I failed to read the warning that you should run DriveImage XML as Administrator.
    • But this is what happens when you are tired and stressed out, you can easily miss subtle things like this.
    • While I have used imagex many times in the past, I could not get it to back up Gemini because it kept complaining about "a virus or other unwanted software." Typical Microsoft arrogance, they always believe they know how to do your tasks better than you do.
  • I tested reinstalling my system on a partial RAID I had constructed on the new controller. Once I was convinced I could restore a system I disassembled the old RAID system and moved all the disks to the LSI 9266 for a total of 8 disks in a RAID 6 configuration with 4 virtual disks.
    • Dress rehearsals are always important, and I am most proud of the fact I had the forethought to do this before committing myself to a radical system change. Measure twice - cut once ;-)
  • I installed Windows 8 on virtual disk #0. This is basically my (maintenance) system. I then used DriveImage XML from here to restore the system partitions to virtual disk #2, and normal Windows file copy to restore the data partition to virtual disk #3.
    • A good lesson is whenever you build a high end server like Gemini, keep a few extra virtual disks around for maintenance and experimentation.
    • While I had installed Windows 8 successfully 6 times or more, suddenly it refused to install with some stupid message about not finding or being able to create partitions. Finally I had to install Windows 7 first, and then install Windows 8 over it. The most important thing about Microsoft products is that you can NEVER expect them to behave the same way more than once under the same conditions.
  • Finally I used the Windows Setup DVD to copy the boot blocks to virtual disk #2, reconfigured the LSI 9266 to use virtual disk #2 as the boot drive, and booted the newly migrated system for Gemini.
Things seem to be running smoothly now, and Gemini is much faster than before with respect to file systems. Windows is now behaving much better and most of the annoying long pauses seem to have gone away.

Lessons Learned

  1. Windows behaves extremely poorly with a slow or failing file system. In fact, it is completely unbelievable how much a slow or failing file system can cause Windows and its desktop applications like Explorer to freeze for long indeterminate amounts of time.
  2. Always have a bootable backup disk on hand. Avoid Windows PE and RE, and have a full version of Windows available. Install really useful recovery and other maintenance tools.
  3. Modern computer systems have shockingly few ways to boot, especially from external devices. While you can generally use network boot, it is insanely difficult to set up and configure. Forget about booting from FireWire, or USB 3.0. Forget about booting a complex USB 2.0 device with a bridge or disk multiplexer - most, if not all, BIOSes are incredibly simple and stupid. EFI systems might be a bit better, but the EFI support on my S5520SC system board is really messed up. While you can install Windows on a USB Thumb Drive, most are too slow - see point 1.
  4. Windows 8 is better than Windows 7 at self configuring.
  5. Failure is not an option! If at first you don't succeed, take a deep breath, swear, scream, pull your hair out, stamp your feet, have a few drinks, but just keep going. Winston Churchill said "If you are going through hell, just keep going."

Thursday, July 5, 2012

Serious Upgrades

PCI Express Flash Drive

Years ago when I originally conceived Gemini my plan was to use one of the FusionIO devices for my main boot and system drive. While FusionIO kept promising they would release a bootable device, after a couple of years they finally gave up promising and declared that none of their customers really had a need for such a device.

While OCZ has had a similar device for some time that is bootable, when I studied how it operated it never sounded right.

Recently Intel released its Ramsdale series under the Intel 910 product line, and it seemed to be what I was waiting for so I ordered the 800 GB model for $4,000. It was easy enough to install in my computer, but when I finally got it working it appeared as four 200 GB devices. Try as I might, I could find no way to install an operating system on it either.

Finally I started searching the web again for more information and found a few good reviews that pointed out it indeed was not bootable, and it indeed looked like 4 separate devices, with no built-in RAID management. What is surprising is that while each review made these facts clear as day and easy to find, none of the Intel documentation makes any of that clear.

It was easy enough to configure the 4 devices into a single 800 GB Software RAID 0 drive using Windows, but still you cannot boot from a Software RAID. Also, what was really odd according to some of the reviews, the device does not perform any better in RAID 0 operation than in direct operation. Now that is really strange. I wonder how Intel screwed that up?

One good thing is that the new device is pretty fast. Using VMware Workstation 8 I created a Virtual Disk on the Flash Drive and installed Windows 8. It only takes 11 seconds for Windows 8 to boot from a cold start. By comparison, using a RAM Disk to host the Virtual Disk, I can get Windows 8 to boot in 9 seconds.

Real RAID

I have had so many problems with the built-in RAID on my Intel S5520SC system board that I decided to go buy a high end PCI Express RAID Controller - the LSI MegaRAID SAS 9266-8i.

I also got the Flash backup option. Basically this is a daughter card with Flash memory and a bank of capacitors. The controller has a huge 1 GB RAM cache to improve performance, but if the power fails your file system can be royally hosed. With the flash backup, the capacitors provide power to the controller long enough to copy the RAM cache to flash. When the power comes back, the RAM cache is restored from flash and all the outstanding I/O operations can be completed so that your file system stays sane.

Hierarchical Storage Management

To make the Intel 910 more convenient to use I decided to experiment with an HSM system called MoonWalk. The idea is that I create a 'source' directory on the RAID 0 Flash Drive and a 'destination' directory on the slower RAID 5 Disk file system. The HSM software will automatically migrate files from the source directory to the destination directory, and leave behind empty stub files in the source directory. If any program tries to access the stub files in the source directory, the HSM software will automatically de-migrate the files back from the destination.

In effect, you can pretend that your source directory is a lot bigger than it really is because files that are not used frequently are migrated to the slower, larger, less expensive disk array. It is convenient because when you want to access the files in the source directory again, they are automatically restored.
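The migrate and de-migrate cycle can be sketched in a few lines of Python. This is only a toy model of the idea, not MoonWalk's actual mechanism or API. A real HSM product hooks into the file system so the recall happens transparently, whereas here the hypothetical read_file function has to be called explicitly, and all names and paths are invented for the example.

```python
import os
import shutil
import tempfile


def migrate(source_dir, dest_dir, name):
    """Move a file to the slow tier and leave an empty stub behind."""
    src = os.path.join(source_dir, name)
    shutil.move(src, os.path.join(dest_dir, name))
    open(src, "wb").close()  # zero-length stub keeps the name visible


def read_file(source_dir, dest_dir, name):
    """Read a file, recalling it from the slow tier first if it was migrated."""
    src = os.path.join(source_dir, name)
    migrated = os.path.join(dest_dir, name)
    if os.path.exists(migrated):
        os.remove(src)              # drop the stub
        shutil.move(migrated, src)  # de-migrate the real file
    with open(src, "rb") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "flash")  # stands in for the fast flash drive
    dest = os.path.join(root, "array")    # stands in for the slower disk array
    os.mkdir(source)
    os.mkdir(dest)

    path = os.path.join(source, "report.txt")
    with open(path, "wb") as handle:
        handle.write(b"rarely used data")

    migrate(source, dest, "report.txt")
    stub_size = os.path.getsize(path)      # stub takes no space on flash

    content = read_file(source, dest, "report.txt")
    restored_size = os.path.getsize(path)  # full file is back on flash
```

The stub is what lets the source directory look bigger than it is: the file name stays put while the bytes live on the cheaper disks until someone asks for them.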

Thumb Drive Boot

I was finally able to do something I have wanted for years - boot Windows directly from a USB Thumb Drive. For a long time it was possible to boot Windows PE or Windows RE from a Thumb Drive for installing and/or repairing Windows, but I could never figure out how to actually install the full Windows O/S on the Thumb Drive and boot it, until recently. Finally I found a great article on how to do this with Windows 8 and I was able to get it working.

The great thing about this is that in an emergency, if I have a disk system failure, or if I just want to do maintenance, I can boot the full version of Windows with full functionality from the Thumb Drive.