[the minnecamp entries are basically going to be a pile of notes with some conclusions]
Going over how SAN works -- rebuilding RAID filesystems is slow; mirroring between multiple silos works, but is expensive. A large cache is expensive, but can help. Async file IO with multiple users stinks.
Possible solution?
Use multiple disks attached to multiple nodes; distribute disks and processing. Managed cluster and full peer cluster are the two variants. Many systems have problems with small files. Most cluster filesystems are too complex -- lots of setup time. 2 types of metadata, integrated or separate (i.e. alongside the data on disk, or on metadata servers or separate disks). $20k for a small scale clustered storage setup. Cache on each node instead of having cache all on one processing blade. Storage is then closer to a grid model.
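To make the full peer idea a bit more concrete: if every node can work out where a file lives by itself, there is no separate metadata server to ask. Below is a minimal sketch using rendezvous hashing. That technique is my choice for illustration, not something named in the session, and the node names, the `place` function and the `replicas` parameter are all made up.

```python
import hashlib

def place(filename, nodes, replicas=2):
    """Pick `replicas` distinct nodes for a file using rendezvous hashing.

    Every peer can run this locally and get the same answer, so no
    central metadata server is needed to locate a file.
    """
    def score(node):
        # Hash the (node, filename) pair; the highest scores win.
        digest = hashlib.sha256(f"{node}:{filename}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:replicas]

# Hypothetical node names, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
print(place("home/alice/report.txt", nodes))
```

With this scheme, dropping a node only moves the files that were placed on it; everything else stays where it was.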
Soft rubber grommets cause more damage to drives since they vibrate more -- use harder plastics/rubber instead, which still cuts noise but reduces vibration.
The object-based systems are essentially large databases.
beat up your vendors to get hardware to test with, they should be able to get you some because of the current market situation.
Some systems claim to be able to rebuild disks very quickly by using the entire cluster to rebuild the volume.
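A quick back-of-the-envelope on why spreading the rebuild helps. This is only a sketch: the 50 MB/s per-disk rate and the 20-disk cluster are numbers I picked for the example, and it assumes the work splits perfectly evenly, which real systems will not manage.

```python
def rebuild_hours(capacity_gb, mb_per_sec, writers=1):
    """Estimate hours to rebuild `capacity_gb` of data.

    Assumes the rebuild work splits evenly across `writers` disks,
    each sustaining `mb_per_sec`. Treat the result as a best case.
    """
    total_mb = capacity_gb * 1000  # decimal GB, as disk vendors count
    seconds = total_mb / (mb_per_sec * writers)
    return seconds / 3600

# Assumed figures: a 750 GB disk at 50 MB/s, alone vs. spread over 20 disks.
single = rebuild_hours(750, 50)
spread = rebuild_hours(750, 50, writers=20)
print(f"one disk: {single:.1f} h, 20 disks: {spread * 60:.1f} min")
```

Under those assumptions the rebuild drops from a bit over four hours to under a quarter of an hour.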
750GB disks only had ECC on one side of memory. Data was going to disk corrupted.
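The notes don't say how that corruption was caught. One general way to notice it is an end-to-end checksum computed before the data goes anywhere near the drive's buffers and checked again on read. A minimal sketch, with `seal` and `unseal` as hypothetical helper names:

```python
import hashlib

def seal(data: bytes) -> bytes:
    """Prefix the payload with its SHA-256 digest, computed before the
    data passes through any cache or controller memory."""
    return hashlib.sha256(data).digest() + data

def unseal(blob: bytes) -> bytes:
    """Check the stored digest against the payload and return the payload.
    Raises ValueError if the two do not match."""
    digest, data = blob[:32], blob[32:]
    if hashlib.sha256(data).digest() != digest:
        raise ValueError("checksum mismatch: data corrupted in transit or at rest")
    return data

blob = seal(b"important block")
# Simulate a single flipped bit somewhere in the write path.
damaged = bytearray(blob)
damaged[40] ^= 0x01
try:
    unseal(bytes(damaged))
except ValueError as err:
    print(err)
```

This only detects the damage; recovering still needs a good copy somewhere else.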
