After two to three years, GroupWise performance degraded, and cluster nodes lost communication with each other through the Quorum disk, the network or both. The Fibre Channel infrastructure checked out fine. I checked the Clariion SAN server, and everything looked good there too.
I assumed the e-mail database was just too big for GroupWise to handle. I first forced all e-mail more than one year old to be archived into the users' home directories, but this had little to no effect. Before I started down the uncharted road of migrating users to another post office, I checked the storage stats on the server and noticed that peak disk writes to the SAN were extremely slow, in the range of 1.5MB/s to 7MB/s. This gave me my first clue as to what was happening.
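For readers who want to spot-check write speed on their own storage, here is a minimal sketch of a sequential write test. It is not the tool I used to get the numbers above; the function name `measure_write_throughput` and its parameters are made up for illustration, and a single small test file only gives a rough figure compared with a proper benchmark.

```python
import os
import tempfile
import time


def measure_write_throughput(directory, total_mb=64, block_kb=1024):
    """Write roughly total_mb of zeros to a temp file in `directory`
    and return the observed sequential write rate in MB/s.

    Hypothetical helper for illustration only.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = max(1, (total_mb * 1024) // block_kb)
    written_bytes = blocks * len(block)

    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            # Ask the OS to push the data to the device so the page
            # cache does not flatter the result.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)

    return (written_bytes / (1024 * 1024)) / elapsed


if __name__ == "__main__":
    # Point this at a directory on the volume under test; the system
    # temp directory is used here only so the sketch runs anywhere.
    rate = measure_write_throughput(tempfile.gettempdir())
    print(f"sequential write: {rate:.1f} MB/s")
```

A result in the low single digits of MB/s on a SAN-backed volume would be in line with the kind of numbers described above.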
I tested the back-end SAN. Performance turned out to be terrible, but we were stuck. We had no servers to move the e-mail server to and no SAN to back up this SAN. I was already looking for a faster backup solution at the same time. I came across Pillar Data Systems for disk-to-disk backup and sent them a request.
I knew we were living on borrowed time, but I was also very wary of getting another terrible SAN product. What I found was that Pillar built in all the redundancy that we originally wanted and added speed and ease of use to the package, for about the same price per gigabyte as the Clariion package.
The Pillar solution has three components: the Pilot, the Slammer and the Bricks. The Bricks are the actual storage units; each one houses two RAID 5 disk arrays. Each array has a hot-swappable spare built in, and each Brick has dual redundant controllers and dual redundant power supplies. This means you could lose up to four disks (two per array, provided the spare finishes rebuilding before the second disk in that array fails), one controller and a power supply in a Brick and still be running. The Slammer controls access to the Bricks. It has redundant controllers, as well as redundant power supplies. The Pilot is the management portion of the system, and it is fully redundant in power and electronics, just like the other two.
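The four-disk figure is easy to check with a little arithmetic. The sketch below is only a simplified model of one Brick as described here (two RAID 5 arrays, one spare each), not anything taken from Pillar's documentation, and the function name and parameters are hypothetical. It assumes RAID 5 survives one failed disk per array at a time, and that a spare restores that protection once the rebuild completes.

```python
def tolerated_disk_failures(arrays=2, spares_per_array=1, parity_disks=1):
    """Return (simultaneous, sequential) disk failures one Brick can absorb.

    Simplified model with assumed defaults:
    - simultaneous: failures that can happen at the same instant,
      limited to the parity protection of each array.
    - sequential: failures over time, assuming every spare finishes
      rebuilding before the next disk in the same array fails.
    """
    simultaneous = arrays * parity_disks
    sequential = arrays * (parity_disks + spares_per_array)
    return simultaneous, sequential


if __name__ == "__main__":
    simultaneous, sequential = tolerated_disk_failures()
    print(f"simultaneous: {simultaneous}, sequential: {sequential}")
```

With the defaults, the model gives two simultaneous failures (one per array) and four over time, which matches the "up to four disks" figure only when the rebuilds have time to finish.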