It is somewhat surprising just how many skilled IT specialists still shy away from replacing traditional internal boot disks with a Boot-from-SAN process. I realize old habits die hard and there’s something reassuring about having the O/S find the default boot-block without needing human intervention. However, the price organizations pay for this convenience is not justifiable. It simply adds waste, complexity, and unnecessary expense to their computing environment.
Traditionally, servers have relied on internal disks to initiate their boot-up process. At start-up, the system BIOS executes a self-test, starts primitive services such as video output and basic I/O operations, then goes to a pre-defined disk block where the MBR (Master Boot Record) is located. For most systems, the Stage 1 Boot Loader resides on the first block of the default disk drive. The BIOS loads this code into system memory and executes it. The Stage 1 loader then loads the Stage 2 Boot instructions, which ultimately start the Operating System.
Due to the importance of the boot process and the common practice of loading the operating system on the same disk, two disk drives in a RAID1 (disk mirroring) configuration are commonly used to ensure high availability.
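To make that first block concrete, here is a minimal sketch of what the BIOS expects to find there. It assumes the classic MBR layout with 512-byte sectors (boot code first, a four-entry partition table at byte 446, and the two-byte 0x55 0xAA signature at the end). The function names `parse_mbr` and `build_sample_mbr` are illustrative only, not part of any tool.

```python
import struct

# Assumptions: classic BIOS-era MBR layout, 512-byte sectors.
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"      # last two bytes of a valid MBR
PARTITION_TABLE_OFFSET = 446      # boot code occupies bytes 0..445
PARTITION_ENTRY_SIZE = 16
PARTITION_ENTRIES = 4


def parse_mbr(sector: bytes) -> dict:
    """Inspect a 512-byte first block the way a BIOS-style loader would."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one 512-byte sector")

    partitions = []
    for index in range(PARTITION_ENTRIES):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        entry = sector[start:start + PARTITION_ENTRY_SIZE]
        status = entry[0]          # 0x80 marks the active (bootable) entry
        partition_type = entry[4]  # 0x00 means the slot is unused
        first_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if partition_type != 0:
            partitions.append({
                "index": index,
                "bootable": status == 0x80,
                "type": partition_type,
                "first_lba": first_lba,
                "sectors": sector_count,
            })

    return {
        "valid": sector[510:512] == BOOT_SIGNATURE,
        "partitions": partitions,
    }


def build_sample_mbr() -> bytes:
    """Build a synthetic MBR with one bootable partition (demo data only)."""
    sector = bytearray(SECTOR_SIZE)
    # status, CHS start (unused here), type, CHS end (unused), first LBA, count
    entry = struct.pack("<B3sB3sII", 0x80, b"\x00" * 3, 0x07, b"\x00" * 3,
                        2048, 204800)
    sector[PARTITION_TABLE_OFFSET:PARTITION_TABLE_OFFSET + 16] = entry
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    print(parse_mbr(build_sample_mbr()))
```

The same function could be pointed at the first 512 bytes of a disk image instead of the synthetic sample. Nothing in the check cares whether those bytes came from an internal disk or a SAN LUN, which is the property Boot-from-SAN relies on.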
OK, so far so good. So what’s the problem?
The problem is the disks themselves. Unlike virtually every other subsystem in the server, these are electromechanical devices with the following drawbacks:
- Power & Cooling – Unlike solid-state components, these devices draw a disproportionately large amount of power to start and operate. A mirrored pair of 300GB, 15K RPM disks will draw around 0.25 amps of current and need 95.6 BTUs of cooling. Each system with internal disks has its own miniature “space heater” that aggravates efforts to keep sensitive solid-state components cool.
- Physical Space – Each 3.5 inch drive is 1” x 4.0” x 5.76” (or 23.04 cubic inches) in size, so a mirrored pair of disks in a server represents an obstacle of 46.08 cubic inches that requires physical space, provisions for mounting, power connections, air flow routing, and vibration dampening to reduce fatigue on the drives themselves and on other internal components.
- Under-utilized Capacity – As disk drive technology continues to advance, it becomes more economical to manufacture higher capacity disk drives than to maintain an inventory of lower capacity disks. Therefore servers today are commonly shipped with 300GB or 450GB boot drives. The problem is that Windows Server 2008 (or similar) needs less than 100GB of space, so at least two-thirds of the disk’s capacity is wasted.
- Backup & Recovery – Initially everyone plans to keep only the O/S, patches and updates, log files, and related utilities on the boot disk. However, the local disk is far too convenient and eventually has other files “temporarily” put on it as well. Unfortunately some companies don’t include boot disks in their backup schedule, and risk losing valuable content if both disks are corrupted. (Note: RAID1 protects data from individual disk failures but not corruption.)
Boot-from-SAN does not involve a PXE or TFTP boot over the network. It is an HBA BIOS setting that allows a SAN disk to be recognized very early in the boot process as a valid boot device, then points the server to that location for the Stage 1 Boot Loader code. It eliminates any need for internal disk devices and moves the boot process to shared storage on the SAN. It also facilitates the rapid replacement of failed servers (all data and applications remain on the SAN), and is particularly useful for blade systems (where server “real-estate” is at a premium and optimal airflow is crucial).
The most common argument used against Boot-from-SAN is “what if the SAN is not available?” On the surface it sounds like a valid point, but what is the chance of that occurring with well-designed SAN storage? Why would that be any different than if the internal boot disk array failed to start? Even if the system started internally and the O/S loaded, how much work could a server do if it could not connect to the SAN? The consequences of any system failing to come up to an operational state are the same, regardless of whether it uses a Boot-from-SAN process or boots from internal disks.
For a handful of servers, this may not be a very big deal. However, when you consider the impact on a datacenter running thousands of servers, the problem becomes obvious. Using the per-server figures above, for every thousand servers Boot-from-SAN eliminates the expense of two thousand internal disks, roughly 250 amps of current, and roughly 95,600 BTUs of cooling load. It also greatly simplifies equipment rack airflow, eliminates 200TB of inaccessible space, and measurably improves storage manageability and data backup protection.
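For anyone who wants to rerun the arithmetic for their own environment, the sketch below scales the per-server figures quoted in the list above (two disks, around 0.25 amps, 95.6 BTUs, 300GB drives with under 100GB used, 46.08 cubic inches) to a fleet of any size. The names `BootPair` and `fleet_totals` are hypothetical, and the defaults are only the numbers from this article, not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BootPair:
    """Per-server cost of a mirrored internal boot pair (article's figures)."""
    disks: int = 2
    amps: float = 0.25              # current drawn by the mirrored pair
    cooling_btu: float = 95.6       # cooling needed by the mirrored pair
    usable_gb: int = 300            # RAID1 exposes one disk's capacity
    os_footprint_gb: int = 100      # upper bound quoted for the O/S
    volume_cubic_in: float = 46.08  # 2 x (1.0 x 4.0 x 5.76)


def fleet_totals(servers: int, pair: BootPair = BootPair()) -> dict:
    """Scale the per-server figures linearly to a fleet of servers."""
    if servers < 0:
        raise ValueError("servers must be non-negative")
    unused_gb = pair.usable_gb - pair.os_footprint_gb
    return {
        "disks": servers * pair.disks,
        "amps": round(servers * pair.amps, 1),
        "cooling_btu": round(servers * pair.cooling_btu, 1),
        "unused_tb": round(servers * unused_gb / 1000, 1),  # decimal TB
        "volume_cubic_in": round(servers * pair.volume_cubic_in, 1),
    }


if __name__ == "__main__":
    for name, value in fleet_totals(1000).items():
        print(f"{name}: {value}")
```

The scaling is deliberately linear. It ignores second-order effects such as power supply efficiency or the extra load placed on the SAN, so treat the output as a rough estimate and substitute your own per-server inputs where you have them.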
Boot-from-SAN capability is built into most modern HBA BIOSes and is supported by almost every operating system and storage array on the market. Implementing this valuable tool should measurably improve the efficiency of your data center operation.