Friday, October 9, 2009

Cool Change



It's been a month now of playing around with Gemini. I finally solved the problem of the I/O Hub overheating by installing a new fan in the side of the case - it's the transparent fan in the middle of the side panel. I searched around the web reading up on fans, reviews, and product specifications. Finally I settled on the Antec Tricool 120 DBB, mainly because it has a three-speed switch and I did not know how much cooling I would need. Once I got it installed I found that the fan cools adequately on the lowest setting. Before I could install the fan I had to find a way to cut a hole in the plexiglass side panel. Fortunately Deena remembered a plastics company (Dimension 3 Plastics) that did fabrication and machining. At any rate, I now have Gemini upstairs in the study where it belongs instead of spread out across the dining room table. Deena was getting pretty tired of eating in the living room by then.

While Gemini runs well and is nice and powerful, I still cannot get the file system configured the way I want. I have five 2 TB disks in a RAID array configured to appear as one large 10 TB disk. Unfortunately I cannot access more than 2 TB right now. The reason is that the disk array is laid out with the Master Boot Record (MBR) partition table. This is the standard PC partition layout, but it only supports disks up to 2 terabytes in size. The GUID Partition Table (GPT) layout is newer and has no such limitation, but Windows does not support booting from a GPT disk with the BIOS firmware. Windows does support booting from a GPT disk if your system uses the Extensible Firmware Interface (EFI). As it turns out the S5520SC motherboard I have supports both BIOS and EFI, but for some reason you cannot use EFI with the built-in RAID feature. This is especially dumb because the S5520SC is a server/workstation board, and you would expect people to use a large RAID array.
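
For the curious, the 2 TB ceiling comes from MBR storing sector addresses and counts in 32-bit fields. Here is a rough sketch of the arithmetic, assuming 512-byte sectors (the sector size is my assumption, and the function name is just for illustration):

```python
# MBR partition entries hold the starting sector and the sector count
# as 32-bit values, so at most 2**32 sectors can be described.
MAX_SECTORS = 2 ** 32
SECTOR_BYTES = 512  # assumed sector size, typical for disks of this era


def mbr_limit_bytes(sector_bytes=SECTOR_BYTES):
    """Largest size in bytes that a 32-bit sector count can cover."""
    return MAX_SECTORS * sector_bytes


limit = mbr_limit_bytes()
print(limit)             # size in bytes
print(limit / 2 ** 40)   # size in binary terabytes (TiB)
print(limit / 10 ** 12)  # size in decimal terabytes (TB)
```

Under that assumption the limit works out to 2 TiB, which is why everything past the first 2 TB of the array is out of reach with an MBR layout.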

My other option is to configure the RAID as two virtual disks, one small one for booting Windows, and one large one for all the data. Unfortunately the new motherboard I have will not allow me to configure two virtual disks. It's frustrating that with my previous motherboard I was able to configure two virtual disks, and now I can't.

I've been in contact with Intel technical support for over two weeks now trying to find a solution to my dilemma. Finally we had to escalate the problem to their Engineering department. I had to fill out a long detailed report for the Engineering people, and I'm still waiting to hear back from them.

I think I've installed Windows about a dozen times now because I keep trying to find a way to configure the disk array the way I want it. Needless to say every time I do this I have to reinstall all my applications again. At least I am able to run Windows 7 fine, and I continue to get more experience with it. I still have more work to do on the hardware - mainly I need to clean up the rat's nest of wires and cables inside - but I can do that any time.

Tuesday, September 8, 2009

Darkest Before the Dawn

Who'll Stop the Rain...

Friday September 4 I set out again to rebuild my system. I tried to be a little more careful this time and not miss any steps. Previously I had forgotten to install the back-plate on the I/O panel of the motherboard. Still, installation had its bumps. I got one of the water blocks on the CPU and then noticed that one of the fan cables was stuck under the motherboard and I could not get it out. So I had to pull the water block off, clean off all the thermal compound, pull the motherboard and put it back in holding the fan connector out of the way.

Finally I had enough things rigged up that I could power on the system again - I pushed the power button and everything powered up fine. Pretty much the first thing I did was to boot the Intel Deployment CD and upgrade the BIOS (and friends) to the latest version. However, when this was finished my system powered down instead of restarting - this was very odd (but a sign of things to come).

After that a lot of things were behaving oddly, but mostly the system just kept powering down. For the longest time I could not get into the BIOS setup; the system just kept trying to boot something, but there was nothing to boot. Eventually I got into the system BIOS again and started trying to configure the RAID, but I was having the same problems as last time: I could not configure a second Virtual Disk.

Eventually I got on the phone with Intel Technical Support. He was asking me about what disks I had connected and I said five 2 TB disks. He says, "oh, that's your problem; your controller can only support 1 TB disks." After that he was not very willing to help any more so I said I would see what I could figure out on my own.

A while later I tried booting the Intel Deployment CD, but this would no longer boot. I was confused as I did not have this problem with the previous motherboard. Also, my system kept powering down, which was annoying. Eventually I was back on the phone with Intel Technical Support (a different person this time) and told him about the system shutting down. I asked if this was because some component was getting too hot and he said likely. He asked me what Intel server case I was using and I said it wasn't Intel. He started giving me the same song and dance that it was not compatible with the motherboard and there was little he could do. I asked him how I could identify which part was overheating and he told me how to run a utility to view the System Event Log.

Now this part gets interesting because my system is not a normal BIOS, it's an EFI system so you can boot to the EFI shell, which is a little like an old style DOS or Unix system. At any rate I had to download the selview utility to a USB drive, then run the EFI shell on my system, and run the selview utility - but it would not run. At that point the guy from Intel said he could not do any more.

After a bit of a break and a chance to think I started reading about the BMC (Baseboard Management Controller). This is basically a 32-bit computer, separate from the main CPUs, that controls the motherboard and is presumably where the EFI shell lives. I read up on the EFI shell commands and how it works. It's actually pretty powerful and all computers should have EFI instead of the old BIOS. Eventually I found that I could run selview and dump the logs to a file, so I tried that and it worked. The best I can tell is that selview could not work on my display for some reason because it was a full-screen application.

After looking through the System Event Log I could see all kinds of warnings about fans not working. Back when I upgraded the BIOS it asked me which fans were connected in my chassis and I said all were (without thinking). Anyway, this was one of the things the system was complaining about. Also, there is a status LED at the back of the computer. At first I didn't realize this was a status signal because the Intel decal at the back was labeled incorrectly, but once I realized this was the status LED things became clearer. The light was always blinking amber which means there is a serious problem, and then when my system would power down the LED would be solid amber, meaning a critical problem.

The other thing the System Event Log showed was that the IOH (I/O Hub) was overheating, but it only said 10.0 degrees C - which is not hot, so I was confused. Up until now I had been running my system with the sides off the case because I had not finished all the wiring. On a hunch I went and got a big room fan and pointed it at the side of my computer on full. This time the status LED on my system stayed solid green, meaning everything was operating correctly.

Eventually I gave up on getting the RAID to work and just tried to install Windows Vista on a single disk normally. To my amazement it worked and I was able to get Vista running. Unfortunately after Vista was running there was no network connection. I finally fixed this by using the Intel Deployment CD to install the network drivers. It is interesting to note that when installing Windows 7 it already has the network drivers. I ran Task Manager and noted with some satisfaction that there was no pausing problem like I had seen before. Next I installed Second Life and it ran beautifully - no pausing - very smooth. Unfortunately my microphone was not working. I eventually fixed that by installing the audio drivers from the Intel Deployment CD and then fiddling around with the Realtek audio utilities.

After a satisfying couple of nights of running Second Life with my friends I went back to fiddling with the RAID setup again. Eventually I learned that if I enabled the SW-RAID setting in the BIOS, the Intel Deployment CD would not boot. It would only boot if this setting was not enabled. But I needed that setting enabled to configure the RAID. I had not seen this problem with my previous motherboard. Finally I went back to the BIOS and instead of configuring two virtual disks (which seems to be buggy) I configured a single large 8 TB RAID 5 array. I was able to install Windows 7 and get it running. I solved the pausing problem in Windows 7 by running two network connections - a trick I had learned from DriverHeaven. Again, everything seems to perform well, except my RAID performance is not what I had hoped - but I have no direct experience with RAID 5. Also, my RAID got formatted with the MBR (Master Boot Record) layout and I wanted it formatted with the GUID Partition Table.
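
In case it is not obvious where the 8 TB figure comes from: RAID 5 spends the equivalent of one disk on parity, so five 2 TB disks leave four disks' worth of usable space. A small sketch of that rule (the function name is mine, purely for illustration, and it ignores any controller or formatting overhead):

```python
def raid5_usable_tb(disk_count, disk_tb):
    """Usable capacity of a RAID 5 array built from equal-sized disks.

    One disk's worth of space goes to parity, spread across all members.
    """
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_tb


print(raid5_usable_tb(5, 2))  # five 2 TB disks -> 8
```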

At any rate I am impressed that I was able to create a functioning RAID 5 system with 2 TB disks after Intel told me it was impossible.

Thursday, September 3, 2009

Labor Pains - Part 2

Bad Luck is sometimes like rain - when it rains it pours!

I've been taking Fridays off from work this summer, and by Friday August 21 I had thought I had everything figured out.
  • I realized the problem I had getting Windows 7 to install on my RAID was that I had not set a system disk - and the Windows installer is too stupid to let you set one from the UI. But this was something I could set in the S5520SC BIOS.
  • I had heard from DriverHeaven that other people had the same CPU spiking problem with Windows 7 and that I should use Vista in the meantime.
  • I confirmed that one of my disk drives was defective.
I went to NCIX and returned the disk drive - they confirmed it was defective too and ordered a replacement for me. I also bought a copy of Windows Vista Ultimate. When I got home I finally hooked up the rest of the front panel connectors on the computer case, and even managed to get the sides back on so everything looked nice and tidy.

I really took my time and wanted everything to go well for once. I got everything ready and then went into the BIOS to set up the RAID. For some reason the BIOS setup was not working properly this time; it would not let me finish configuring my second virtual disk - it kept freezing and forcing me to reboot the computer - grumble grumble grumble.

Next I thought I would try using the Intel RAID Web Console 2 from the utility CD. I booted the CD, and then upgraded to the latest version of the utility from the network, but I could not get the Web Console 2 to work - nothing would happen. Next I rebooted the CD again, but this time I did not do the network upgrade. Finally I was able to get into the RAID Web Console 2 user interface. This application was pretty crummy too, confusing to use, and buggy in some places. Eventually I managed to define the two RAID 5 virtual disks I wanted and started to initialize them. After 30 minutes I was wondering what was taking so long and then a progress bar finally popped up to show that it was only 10% done. I wish I had selected the fast initialization instead of the full initialization. I was getting tired of waiting so I went off to do some reading for a while.

I came back 15 or 20 minutes later to see how things were and found the screen blank, and the graphics card fan was on full (something that never happened before). I tried powering the system off and on, but nothing happened. In fact, when powering the system on the Power On Self Test (POST) LEDs would not even light up at all. That was a very bad sign.

Eventually I found a phone number for Intel technical support and someone talked me through some tests. Mostly it was removing stuff from the motherboard and powering the system back on. Nothing helped and nothing changed so the support person conceded that the board was dead and sent me instructions for returning the board for a replacement.

By this time I felt pretty crushed - the morning had started off so well, and by mid afternoon it looked like I was finally going to get everything working - when BAM - the worst happens. I suppose this is what someone feels like after a terrible child birth and they discover that their child is not only retarded, but blind and deaf too. Of course this was just a computer and could never be the same as a child, but I just felt really depressed and angry. Why me?

The next day I set out to return the motherboard. First of all Intel required that there be some sort of commercial invoice for customs purposes so it took me an hour or so to fabricate something that looked official. Next was the process of removing all the connectors from the motherboard. Taking the water blocks off of the CPUs was interesting - but it was good to see that the thermal compound I had used had spread out nice and evenly across the CPU heat spreader. Of course I had to clean everything off and put the CPUs away safely, then prepare the motherboard for shipping. It took me almost 45 minutes at the UPS store to get all the information right because I was shipping across the border. I selected the least expensive shipping method, and that took over a week and cost me $85.00.

Anyway, I've had two weeks waiting for a replacement and today I'm supposed to get my replacement motherboard...

Monday, August 24, 2009

Labor Pains

In some respects building an enthusiast class computer is akin to having a child: conceiving the child is a lot of fun, but the gestation and birth can be very challenging.

On Thursday, August 13 I finally got my computer case back from CoolIT Systems in Calgary. It looks like my photos and measurements paid off because everything fit perfectly. It also helped that they put one of their best builders on the case - Sean Mutlow - because he really understood what to do.



Friday, the next day, I set forth to get everything working. This was my first time ever building a system from scratch - let alone something as exotic as this. I was mostly worried about installing the CPUs (at $1,500 apiece) and attaching the water blocks. When you attach a cooling system to your CPUs you have to use some kind of thermal paste to make a good thermal connection between the CPU heat spreader and cooling device. I had read a lot of articles and decided on Arctic Silver Ceramique as it showed the best conductive properties for CPUs. I had to remove the thermal grease from the water blocks that CoolIT had installed, then I had to prep the surfaces of both the CPUs and water blocks. I had to further prep the water block surface with Ceramique per the instructions, but it did not seem to make a difference as the surface of the water blocks was like a mirror. Finally I laid down a thin strip of Ceramique on the CPU and attached the water block. This was a bit tricky because the WS-240 cooler had all the hoses attached and it was awkward settling the water block on the CPU with the hoses trying to pull things every which way.



After finishing what I considered to be the hardest part then came the tedious part of connecting all the power and data cables everywhere. This was extra challenging because the S5520SC is a big board and I really had to stretch some cables and connections.



Finally I had enough together that I could power on Gemini for the first time. This was rather distressing because after powering it on - nothing happened - there was nothing on my computer display to show that anything was working. I did this a few times and still nothing. Whenever I powered it down there was this beeping from the motherboard - which I discovered was the diagnostic beep codes. I looked them up in the manual and it said "DC Power Missing" so I fiddled with the power connectors a little and powered it on again. This time the boot sequence showed on my display and I breathed a sigh of relief. In hindsight I don't really know why nothing happened the first few times I powered on Gemini, because every time I power it down I get the same beep codes.

Now that I was finally able to get into the BIOS settings I started exploring the disk system. To my surprise one of the disks was not working. Fortunately I had bought an extra disk as a spare so when I connected that one everything was ok again, and the ICH10R southbridge could see all 5 of my disks.

Next I tried configuring the RAID. I had thought this would be easy because I have configured RAID before on the ICH10R southbridge using the Intel Matrix Storage Manager, but this turned out to be a nightmare. It turns out that the S5520SC board does not use the Intel Matrix Storage Manager like every other civilized computer, but some other piece of crap called Intel® Embedded Server RAID Technology II (ESRTII). I design and implement computer user interfaces for a living and I can say with all expertise - the user interface on this utility is total crap. Let's just say that this whole process used about 2 hours of my time and involved a great deal of swearing.

After trying to configure the RAID I could not get Windows to install. I tried using the Intel Deployment CD to set up the RAID, and this was even worse crap. Then I made a bad decision. Because Windows would not install I enabled a BIOS setting called "EFI Optimized Boot" and restarted Gemini. After that Gemini was a "brick" - that is it was nothing more than a very big paper-weight - it would start to boot - but would not even go into the BIOS setup. By the time Deena got home I was swearing like a soldier and bordering on a depressive mental break-down.

Friday night is my night for Second Life - and let's just say that my technology woes only got worse that night as I ruined the entire beach. It took excessive quantities of alcohol and hard rock music to hang on to my sanity that night.

By Sunday I had had time to cool down enough to start thinking of solutions. My very good friend Remo (from Second Life) had suggested resetting the CMOS, so after searching through the manual I found a jumper on the motherboard that did that - and finally Gemini booted again.

Eventually I gave up trying to set up RAID and switched back to a non-RAID configuration and finally got Windows 7 to install. Pretty well the first thing I did after that was run the Microsoft performance analyzer - which gave me a 7.8 out of 7.9 on my CPU and Memory performance. That was a very satisfying result.



My next big goal was to install Second Life and see how well it ran. Well it ran pretty crappy! Every two seconds there was a one second pause - everything froze. Needless to say I was getting pretty disillusioned by now. Not that I have children - but it's sort of like going through a challenging gestation and difficult birth - only to find out your child is mentally retarded. In spite of this limitation, I was actually able to run Second Life, which I cannot do any more on my Sony PCV-RX660; and Deena and I were finally able to be in Second Life together for the first time in 8 months. It was a hell of a lot of fun having her in-world at the same time as me.

For the rest of the week I played with various things, trying to learn more about the problems I had, and was still having. But all in all, it was very satisfying having my dream-system running after 3 years of planning, and 2 months of waiting for the case to be ready after initially ordering all the parts.

To be continued...

Wednesday, July 22, 2009

The Patience of Job

Well sometimes when you are the first person trying a thing, it takes time to get it right.



It took quite a while for CoolIT Systems to get my Boreas Chassis to me. I had asked them to install the Boreas chiller (for my GPUs), and their new WS-240 chiller for the CPUs. Not only is the 240 a new product, they had never installed one in this kind of chassis before, so things were a little cramped. After they got it installed they realized the fans were a little too thick - 25 mm. After some shopping around they finally found some low profile 20 mm fans.

All this took time to sort out but I finally got my chassis Friday, July 17. Since we had just moved homes the day before our new place was in quite a state of disarray - boxes and boxes everywhere. Finally Saturday morning I had some time and space to open the box and take out the chassis. I spent the morning installing the power supply, BluRay drive, and motherboard; but after installing the motherboard it was clear there were problems.
  1. The coolant hose from the WS-240 was blocking the power connector for CPU-2.
  2. The pump for the Boreas was blocking the main power connector.
  3. The coolant hose between the two CPU water-blocks was too short.
In retrospect I probably should have shipped them the motherboard and these problems would have been obvious to them too. After talking to CoolIT on the phone on Monday they agreed to have me ship it back and correct the problems. They should get the chassis on Friday, and hopefully will be able to get it back to me soon.

While I am incredibly eager to finally get this computer system running, it has been over two years that I have been planning it, so if it takes a few more weeks to get it built right I have to be patient.

In the meantime rumors are that Intel will soon release a Xeon 5590 processor, which will likely be faster than the 5570 processors I bought. I have not opened the boxes my processors are in, so if Intel releases the new ones soon enough, I am hoping I can exchange my 5570s for a couple of 5590s.

Cheers, Eric

Friday, June 12, 2009

Obsolete the Day After You Buy It

So I've been waiting many many months for word on an Intel Skulltrail II motherboard. It figures that the week after I buy the Intel S5520SC motherboard I hear about the Skulltrail II. Of course there are no real specifications out yet, but it's interesting that they are planning it with the new 8-Core Nehalem-EX - that's 32 threads folks! We'll be lucky to see the Nehalem-EX before 2010.

I'm really torn because the Skulltrail II was what I really wanted, but then do I want to wait another 6 months? Also, the Skulltrail I was a terrible motherboard - I don't know what Intel was smoking when they came up with that. They got so much criticism I can only hope they learned their lesson and won't be so stupid next time. I know deep down in my heart I could design a really cool Skulltrail II - probably better than Intel.

So where does that leave me? If the Skulltrail II incorporates an ICH10 compatible RAID controller I would be able to swap out my S5520SC with a Skulltrail II and preserve my RAID file system. Hopefully there would be a spare 4/8 lane PCI Express slot for my FusionIO ioBoard. Of course I would have to buy new processors - the EX ones. I suspect the EX processors will not be socket compatible with the Xeon 5580s, so I can't just drop them into my S5520SC motherboard.

What else does Skulltrail II get you? Well probably the ability to overclock the processors. It is well known the Nehalem processors can easily handle 4 GHz or better.

Anyway - so many if's right now. My current plan is to proceed with building Gemini and 6 months to a year from now if the Skulltrail II is appropriate, maybe we'll see Gemini on steroids :-)

Getting it all together




Last Friday I started receiving the parts for Gemini. Pictured here from top left to right are
  1. Enermax Galaxy 1250 Watt power supply
  2. Intel S5520SC motherboard
  3. ATI Radeon 4870 graphics card (temporary)
  4. LG 50 GB Blu-ray reader/writer
  5. two Corsair Dominator 6 GB 1600 MHz memory kits
  6. two Intel Xeon 5580 processors
  7. Intel RAID 5 activation key for the S5520SC
  8. 250 GB HD (temporary) for testing
  9. ArctiClean Thermal Material Remover
  10. Arctic Silver Ceramique - thermal paste
  11. ArctiClean Thermal Surface Purifier
  12. Sun Microsystems keyboard and mouse
I was able to get all these parts from NCIX locally (except for the keyboard) and they gave me a nice discount too.

I could not find any keyboards I liked anywhere, but I have an old Sun Workstation that still has one of the better keyboards I have ever used. However it's not compatible with personal computers. So I ordered one of Sun's newer keyboard kits that is compatible with personal computers. Now if only I could find some way to get all the keys to work with Windows.

The last piece I'm waiting for - the biggie - is my Silverstone TJ-07 case with a CoolIT Systems Boreas and Domino WS 240 water cooling for the graphics cards and CPUs. Then I can finally assemble the first phase of Gemini.

On back-order I have 6 Western Digital Enterprise 2 terabyte disk drives coming. They are so new they are still a month or two away, and in the mean time I have a smaller 250 GB drive I can use for testing.

I've already got a copy of 64-bit Windows 7 Ultimate Release Candidate. When Microsoft finally releases Windows 7 in October, I'll go get a legitimate copy then.

I'm also planning to get a Dell UltraSharp 3008WFP 30'' Widescreen LCD Monitor - but I can get that just about any time. I may as well spread out my spending over a few pay cheques.

In the next phases I plan to add a FusionIO ioDrive, but I'm still waiting for them to announce when you can boot from the device. That's going to be my boot drive and it's going to be blazing fast.

In the final phase I plan to get two ATI Radeon 5870 X2 graphics cards, and two Koolance water blocks to connect to the Boreas water chiller. Hopefully these will be ready sometime this fall. I will be ready for some intense GPU overclocking. Also, these cards will be the first to support DirectX-11, the new graphics standard in Windows 7.

We plan to move in about a month to a slightly larger place. One of the things I intend to do is have a dedicated 15 Amp, 240 Volt circuit added to the second bedroom just to power Gemini. It will be a great way to warm the room in the winter. At any rate, it will be interesting to see if I finally get Gemini powered up before we have to move.
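
As a back-of-the-envelope check on that circuit, power is just volts times amps. This sketch compares the circuit's nominal capacity with the 1250 Watt rating of the Enermax supply. Keep in mind the rating is the supply's output, so the draw at the wall would be somewhat higher, and I am ignoring any electrical-code derating; the names are only for illustration.

```python
def circuit_watts(volts, amps):
    """Nominal power a circuit can deliver: P = V * I."""
    return volts * amps


CIRCUIT_W = circuit_watts(240, 15)  # the dedicated circuit planned for Gemini
PSU_RATED_W = 1250                  # rated output of the power supply

print(CIRCUIT_W)                # nominal circuit capacity in watts
print(CIRCUIT_W - PSU_RATED_W)  # rough headroom over the supply's rating
```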

When I get things working relatively well I will probably have a coming out party for Gemini.