I’ve been playing around with SSDs in RAID configurations for almost 2 years now and would like to share my experience, which will hopefully save somebody some time and money.
All RAID configurations described here were based on the Intel RAID controller built into P55-based motherboards.
I built my first SSD RAID-0 rig when Windows 7 came out. I wanted 1TB of storage, so I put 4 Corsair P256 SSDs in a RAID-0 configuration. Before you run off and do just that without reading this article to the end, let me give you one word of warning: “Using RAID-0 without proper regular backups is a loud statement that data on the array is not important to you at all”. I had automated nightly backups done by Windows Home Server, so I was covered.
Back then Corsair SSDs didn’t have TRIM support but I didn’t care much. Even after the drives filled up, RAID-0 provided around 100 MB/s sustained linear writes, which was good enough for me (this is what a single mechanical HDD would put out). The strength of SSDs lies not in their linear access speed (which is normally on par with or better than mechanical drives) but in their ability to process thousands of IO operations per second, compared to around 100 for HDDs. This dramatically improves overall PC performance.
I was getting 250MB/s sustained linear reads, 100MB/s sustained linear writes and around 2K IOPS on 4KB random access. Total computing bliss!
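To put the IOPS numbers in perspective, here is a minimal Python sketch that converts a random-access IOPS figure into the bandwidth it implies. The function name is made up for illustration, and the 100 IOPS value for a hard drive is just the rough figure quoted above, not a measurement.

```python
def iops_to_mb_per_s(iops: float, block_size_bytes: int) -> float:
    """Bandwidth implied by a given IOPS rate, in MB/s (1 MB = 1,000,000 bytes)."""
    return iops * block_size_bytes / 1_000_000

# Around 2K IOPS at 4KB random access, as measured on the SSD array.
ssd_random = iops_to_mb_per_s(2000, 4096)

# Around 100 IOPS at 4KB, the rough HDD figure used in the text.
hdd_random = iops_to_mb_per_s(100, 4096)

print(f"SSD array random 4KB: {ssd_random:.2f} MB/s")
print(f"HDD random 4KB:       {hdd_random:.2f} MB/s")
print(f"Ratio:                {ssd_random / hdd_random:.0f}x")
```

Both numbers look tiny next to the linear speeds, which is the point: on small random accesses the gap between SSD and HDD is a factor of about 20 here, even though their linear write speeds were roughly the same.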
My RAID of SSDs was performing happily until it started stuttering. The system would freeze for several seconds because no IO could happen at all. Later the Intel Matrix Storage Manager started reporting failures of one of the drives. By that time Corsair had come up with a firmware update that supported TRIM (VBM19C1Q) and Intel had announced TRIM support in RAID-0 configuration via a driver update. I reflashed all 4 drives with the new firmware, restored the system from backup (since updating the drives causes complete data loss) and updated the Intel driver.
For some time everything was tip-top (I guess because the drives were restored to “like-new condition” via secure erase). But after a while everything slowed down. It very much seemed like TRIM did not work. I never found out which part of the system didn’t do its job, because soon the stuttering came back and then the array’s linear write performance dropped to 3MB/s. The situation was unbearable, so I pulled the array apart and replaced it with a RAID-0 array of two 512GB OCZ Colossus LT drives.
Back then it seemed like a good idea, since OCZ claimed the drives didn’t need TRIM because they relied on automatic internal “garbage collection” to keep themselves in shape. I then tested and re-tested each individual Corsair P256, and each drive performed flawlessly by itself. TRIM worked and there was no stuttering. My new OCZ Colossus array (on the same controller) was also performing great, so I assumed it was a compatibility issue between the Intel RAID chipset and Corsair SSDs. After making sure that the drives were flawless, I sold 3 on eBay to recoup some of the money I spent on the Colossuses. The 4th one still works fine in my wife’s notebook.
The new Colossus array seemed to work fine and performed great. Linear access was sustained at 500+MB/s even after the drives filled up. I was getting 4K IOPS, so everything seemed tip-top. There was one slight oddity: CrystalDiskMark reported a linear write speed of 500MB/s but linear reads at 240MB/s. Copying large files to the nul device showed speeds in excess of 500MB/s, so clearly the array read data as it was supposed to, but for some reason CrystalDiskMark was doing it slowly. This didn’t bother me much, so life went on.
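The copy-to-nul check can be approximated in a few lines of Python. This is only a rough sketch of the idea (read a large file front to back, discard the data, time it), not what CrystalDiskMark does internally. The function name and the 1 MiB block size are my own choices for illustration.

```python
import time

def sequential_read_mb_per_s(path, block_size=1024 * 1024):
    """Read a file front to back, discarding the data.

    Returns (bytes_read, throughput in MB/s), with 1 MB = 1,000,000 bytes.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 skips Python's own buffer; the OS file cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    # Guard against a zero interval on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total, total / elapsed / 1_000_000

# Example call (the path is hypothetical):
#   size, speed = sequential_read_mb_per_s(r"D:\some_large_file.iso")
#   print(f"{size} bytes at {speed:.0f} MB/s")
```

Use a file much larger than the machine's RAM, or one that has not been read since boot. Otherwise the operating system may serve it from its cache and the result will say more about memory than about the array.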
After several months trouble started again. This time the machine would BSOD because of a fatal IO error. After rebooting, the Intel RAID manager would report one of the SSDs as failed. Sometimes the BIOS wouldn’t detect the drive either. Powering the machine down, rebooting and resetting the drive to normal temporarily solved the problem until it happened again. Needless to say, a Blue Screen of Death is bad for the overall health of the PC. Files get corrupted, data is lost, etc. Eventually one of the system files was damaged badly enough by a BSOD that the system would no longer boot.
By that time I’d had enough, so I replaced the two 512GB Colossus drives with a single 1TB drive. Although I’ve lost half of the bandwidth, hopefully it will gain me some peace of mind as the system becomes more reliable.
Bottom line: Intel Matrix Storage and SSDs don’t mix. The system may work for a while but eventually it will fail and you may lose all of your data. While it works it completely rocks though.
After a few months of happy use of the OCZ Colossus 1TB drive (now discontinued – no surprise here) it is SSDD – BSOD, stuttering, drive not being detected by BIOS, file system errors – you name it. Good thing my Windows Home Server does nightly backups… After yet another “crash, then drive not detected” episode I decided to call it quits on SSDs. Thing is – I have several 32..256 GB ones working flawlessly in laptops, but it seems that once the size gets over 256GB reliability goes out the window. I used HDClone to move my system partition to a mechanical spinning hard drive and contacted OCZ for an RMA.
Bottom line 2: OCZ SSDs suck – end of story.
After using a mechanical drive for some time I once again started to miss SSD performance. Unfortunately for me, I do need 1TB of storage. The largest solid state drive from a reputable manufacturer that gets good reviews is the Intel 320 series 600GB – not large enough. Since my past experience demonstrated that the problem was with the OCZ drives rather than with the RAID setup, I decided to give RAID another try. With RAID I cannot use TRIM, so write performance is bound to decrease as the drives get used. This is somewhat offset by the fact that RAID0 provides roughly a 2x performance boost.
I also plan to over-provision the array. A 600 billion byte drive translates to 558.9 gigabytes when a gigabyte is counted as 1024 * 1024 * 1024 bytes; twice that is 1,117.8 GB. This is 186.3 GB larger than the 931.5 GB spinning drive that I use now. By keeping the partition size the same when doing the transfer, I will ensure that each Intel SSD has 93.2 GB of logical blocks that are never used. So in theory these flash blocks should always stay clean and ready to be written at maximum performance. Once an LBA is overwritten, the flash holding its old contents can immediately be reclaimed by the background garbage collector.
This is the theory. How it works out in real life we will see once I install the system next week and in the months to follow.
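The over-provisioning arithmetic is easy to re-run for other drive sizes. The Python sketch below uses hypothetical helper names and takes the figures quoted above as inputs. Note that exactly 600 billion bytes works out to about 558.8 in 1024-based gigabytes; I am assuming the 558.9 figure reflects the drive's real capacity being slightly above the nominal 600 billion.

```python
GIB = 1024 ** 3  # bytes in a 1024-based gigabyte

def bytes_to_gib(size_bytes: float) -> float:
    """Convert a byte count to 1024-based gigabytes."""
    return size_bytes / GIB

def spare_per_drive_gib(drive_gib: float, drive_count: int, partition_gib: float) -> float:
    """Unused space left on each RAID-0 member when the partition is smaller than the array."""
    array_gib = drive_gib * drive_count
    if partition_gib > array_gib:
        raise ValueError("partition does not fit on the array")
    # RAID-0 stripes evenly, so the unused tail is split equally across members.
    return (array_gib - partition_gib) / drive_count

print(f"Nominal 600 billion bytes: {bytes_to_gib(600e9):.1f} GB")

# Figures from the text: two 558.9 GB drives, 931.5 GB partition.
spare = spare_per_drive_gib(558.9, 2, 931.5)
print(f"Never-written space per drive: {spare:.2f} GB")
print(f"Share of each drive: {spare / 558.9:.1%}")
```

With these inputs roughly one sixth of each drive stays untouched, which is the spare area the garbage collector is expected to work with.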
The array is up and running. Here’s a CrystalDiskMark screenshot of its performance measurements. It knocks the socks off OCZ Colossus LT drives (single or RAID) when it comes to random 4K reads. It is only slightly slower than 2 OCZ drives in RAID0. Will this setup work? I hope so. Time will tell…
You can find more detailed performance comparisons here.