How Hard Disk Drives Work

Hard disk drives (HDDs) have been around for quite some time. If I had to guess, I'd say that the main demographic of my blog already knows roughly how an HDD works, but for those who don't, that's OK. Computer parts are a commodity, and it's easier to replace one than it is to repair it, so why does anyone need to understand how spinning disks work? If you're working with storage (e.g., administering a SAN), it pays to understand how they work. Let's dive in.

I'll go over the basics in case anyone needs a refresher. HDDs are made up of dual-sided magnetic platters mounted on a spindle that is spun by a motor, with read/write heads hovering over the surfaces of said platters on an actuator arm. All of this mechanical goodness is managed by a controller (the usually green PCB on the bottom of an HDD).

The Platters

As mentioned above, a typical modern HDD has 2, 3 or 4 platters stacked on a spindle (which I'll talk more about below, in case you were wondering). The platters are covered with a magnetic coating, and that coating is where all of your data is stored. Tiny regions of the surface are magnetized in one of two directions, and those two polarities represent the 1s and 0s of the base 2 numbering system that makes up your data. Modern HDDs use Zoned Data Recording (ZDR, also known as zoned bit recording) to organize data on the platters into tracks and sectors. Without stealing a picture off of the internet to depict this, picture a record with grooves to represent the tracks. Now picture a pizza cut up into slices to represent sectors. Now, lay one over the other and you have either a representation for tracks and sectors on HDD platters, a dirty record or a hard-to-eat pizza. I hope you have the former. The pizza is a simplification, though: with zoned recording, the longer outer tracks are divided into more sectors than the shorter inner ones. How many gigabytes or terabytes an HDD holds depends on how many tracks and sectors can be crammed onto a platter and how many platters can be crammed into a single HDD.
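To make the "tracks times sectors times platters" idea concrete, here's a rough sketch of how capacity adds up. The zone layout below is completely made up for illustration (real drives have far more zones and different numbers), and the function names are mine, not anything from a drive vendor.

```python
# Hypothetical zone layout for ONE platter surface: (tracks in zone, sectors per track).
# These numbers are invented for illustration and do not describe a real drive.
ZONES = [
    (10_000, 2_000),  # outer zone: longer tracks hold more sectors
    (10_000, 1_500),  # middle zone
    (10_000, 1_000),  # inner zone: shorter tracks hold fewer sectors
]
BYTES_PER_SECTOR = 512  # classic sector size; many newer drives use 4096


def surface_capacity_bytes(zones, bytes_per_sector=BYTES_PER_SECTOR):
    """Add up the sectors in every zone of one surface and convert to bytes."""
    sectors = sum(tracks * per_track for tracks, per_track in zones)
    return sectors * bytes_per_sector


def drive_capacity_bytes(platters, zones, bytes_per_sector=BYTES_PER_SECTOR):
    """Each platter is dual-sided, so it contributes two surfaces."""
    return platters * 2 * surface_capacity_bytes(zones, bytes_per_sector)


if __name__ == "__main__":
    for platters in (2, 3, 4):
        gigabytes = drive_capacity_bytes(platters, ZONES) / 1e9
        print(f"{platters} platters: {gigabytes:.2f} GB")
```

Adding a platter or squeezing more sectors into a zone both raise the total, which is exactly the lever drive makers pull.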

The Spindle

The spindle and its motor are what the HDD platters are connected to. The motor spins the platters around so that the R/W heads (mentioned above and more below) can read data stored in tracks/sectors on the platters. One of the reasons HDDs are being out-paced by Solid State Drives (SSDs, which I'll talk more about in another post) is the spindle motor speed. HDDs suffer in performance because before data can be read from or written to a sector on a platter, the spindle must rotate the platter until the correct sector passes under the head. This wait is referred to as rotational latency, which is measured in milliseconds. While rotational latency might sound like a huge slowdown, we're talking "slow" as in it takes a few thousandths of a second. Spindle speeds vary depending on the application of the HDD. Spindle speeds on consumer-grade HDDs range from 5400 RPM (think laptop HDDs) to 10k RPM (some performance drives). Most are 7200 RPM. In servers, 7200 RPM drives are usually used for backups, while 15,000 RPM drives handle the rest of the workloads. The faster HDDs usually feature a smaller capacity. Depending on the platter diameter, the outer edge of a platter in a 15,000 RPM SAS (Serial Attached SCSI) drive can travel at well over 100 miles per hour.
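If you want to see where those milliseconds come from, the arithmetic is simple: on average the sector you want is half a revolution away. Here's a small sketch of that, plus the edge speed of a platter. The platter diameters are my own assumptions for illustration, not specs from a particular drive.

```python
import math


def rotation_time_ms(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def average_rotational_latency_ms(rpm):
    """On average the target sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2


def edge_speed_mph(rpm, platter_diameter_mm):
    """Linear speed of the platter's outer edge, in miles per hour."""
    circumference_m = math.pi * platter_diameter_mm / 1000
    metres_per_second = circumference_m * rpm / 60
    return metres_per_second * 2.23694  # 1 m/s is about 2.23694 mph


if __name__ == "__main__":
    for rpm in (5400, 7200, 10_000, 15_000):
        print(f"{rpm:>6} RPM: {average_rotational_latency_ms(rpm):.2f} ms average latency")

    # Assumed platter diameters (illustrative): roughly 65 mm and 95 mm.
    for diameter_mm in (65, 95):
        print(f"15000 RPM, {diameter_mm} mm platter: {edge_speed_mph(15_000, diameter_mm):.0f} mph")
```

Doubling the spindle speed halves the average wait, which is why the 15,000 RPM drives exist at all.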

The R/W Heads

The Read/Write heads fly over each side of the platters where data is stored and (you guessed it) read and write data from and to the magnetic surface of the platters. All of the R/W heads are held over the surface of their platters by a single actuator arm, so all of the heads move across the platters in unison. The distance between the heads and the surface of the platters (referred to as the flying height) is so small that it's measured in nanometers, far smaller than a smoke particle or the width of a human hair. On top of the rotational latency from the spindle, HDDs are also slowed down by "seek time", the time it takes the actuator arm to position the R/W heads over the right track of a given platter. Seek time, just like rotational latency, is an inherent part of the design and is measured in milliseconds.
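Put seek time and rotational latency together and you get a rough idea of how many random reads or writes a single spinning disk can do per second. This is a back-of-the-napkin sketch: the seek times are assumed values I picked for illustration, and it ignores transfer time, caching and command queuing.

```python
def average_access_time_ms(avg_seek_ms, rpm):
    """Average seek time plus average rotational latency (half a revolution)."""
    avg_rotational_latency_ms = (60_000.0 / rpm) / 2
    return avg_seek_ms + avg_rotational_latency_ms


def estimated_random_iops(avg_seek_ms, rpm):
    """Rough ceiling on random I/O operations per second for one drive."""
    return 1000.0 / average_access_time_ms(avg_seek_ms, rpm)


if __name__ == "__main__":
    # (label, assumed average seek in ms, spindle speed); seek values are illustrative.
    drives = [
        ("7200 RPM drive", 8.5, 7200),
        ("15000 RPM drive", 3.5, 15_000),
    ]
    for label, seek_ms, rpm in drives:
        access = average_access_time_ms(seek_ms, rpm)
        iops = estimated_random_iops(seek_ms, rpm)
        print(f"{label}: {access:.2f} ms per access, roughly {iops:.0f} IOPS")
```

A couple hundred operations per second at best is the mechanical wall that flash storage doesn't have to deal with.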

The Controller

You can't expect all these parts to manage themselves, right? This is where the controller comes in. The controller is responsible for Logical Block Addressing (LBA): the computer simply asks for numbered blocks, and the controller keeps the map of which track and sector each block physically lives in.
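To give a feel for what "mapping a block number to a physical spot" means, here's the classic textbook translation between an LBA and a cylinder/head/sector (CHS) address, where a cylinder is the same track position across all platter surfaces. Treat it as a sketch under assumptions: the geometry values are the traditional 16 heads and 63 sectors per track, and a real controller's internal map is more involved (zoned recording, remapped bad sectors) and not something I can show here.

```python
HEADS_PER_CYLINDER = 16  # assumed legacy geometry, for illustration only
SECTORS_PER_TRACK = 63   # assumed legacy geometry, for illustration only


def chs_to_lba(cylinder, head, sector, hpc=HEADS_PER_CYLINDER, spt=SECTORS_PER_TRACK):
    """Convert a cylinder/head/sector address to a block number.

    Cylinders and heads count from 0; sectors count from 1.
    """
    if cylinder < 0 or not 0 <= head < hpc or not 1 <= sector <= spt:
        raise ValueError("CHS address is outside the assumed geometry")
    return (cylinder * hpc + head) * spt + (sector - 1)


def lba_to_chs(lba, hpc=HEADS_PER_CYLINDER, spt=SECTORS_PER_TRACK):
    """Convert a block number back to (cylinder, head, sector)."""
    if lba < 0:
        raise ValueError("LBA must be non-negative")
    cylinder, remainder = divmod(lba, hpc * spt)
    head, sector_index = divmod(remainder, spt)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    for lba in (0, 62, 63, 1008):
        print(lba, "->", lba_to_chs(lba))
```

The point is that the operating system only ever sees the flat block number; where that block physically sits is the controller's problem.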

Interfaces & Protocols

There are a number of common interfaces and protocols for HDDs. The two most common ones you'll hear about are PATA and SATA.

PATA is now a legacy interface. When I first started working with computers, it had long been the standard interface for HDDs in desktop and laptop computers. PATA is short for Parallel ATA, and it used a 40-pin connector (a 20x2 arrangement) attached via a ribbon cable. There were also 80-conductor ribbon cables for the faster PATA standards (still with 40-pin connectors), as well as a 44-pin interface for smaller drives (the extra 4 pins were for power), but you won't run into those often today. At this point in computing history, if you see a ribbon cable connected to an HDD, that machine is probably due for a replacement.

The Parallel part of PATA means that multiple data bits (16 at a time on PATA) are transferred simultaneously, each across its own wire (think of cars speeding along in their own lanes). The maximum speed of the PATA bus is 133MB/s.

SATA stands for Serial ATA. The Serial part means it transmits one bit at a time, but at a much higher rate. SATA cables are noticeably smaller, and the data connector has only 7 pins. When the original SATA standard was released, its maximum bus speed was 150MB/s. The second standard added features such as NCQ (Native Command Queuing, SATA's take on the older Tagged Command Queuing, or TCQ) and raised the maximum SATA II bus speed to 300MB/s.
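If you've ever wondered why a "1.5 gigabit" SATA link tops out at 150MB/s rather than 187.5MB/s, it's the line encoding: SATA uses 8b/10b encoding, so every 8 bits of data go over the wire as 10 bits. Here's a quick sketch of that arithmetic (the function name is just mine):

```python
def usable_megabytes_per_second(line_rate_gbit_s, data_bits=8, coded_bits=10):
    """Convert a raw serial line rate into payload throughput.

    With 8b/10b encoding, only 8 of every 10 bits on the wire carry data.
    """
    payload_bits_per_second = line_rate_gbit_s * 1e9 * data_bits / coded_bits
    return payload_bits_per_second / 8 / 1e6  # bits -> bytes -> megabytes


if __name__ == "__main__":
    for name, gbit_s in (("SATA I", 1.5), ("SATA II", 3.0)):
        print(f"{name}: {gbit_s} Gbit/s on the wire, "
              f"{usable_megabytes_per_second(gbit_s):.0f} MB/s of data")
```

Keep in mind this is the ceiling of the bus, not what a spinning disk behind it can actually sustain.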

Outside of the consumer-level storage stuff, you'll see more interfaces such as SCSI and SAS. Let's look at SCSI first. SCSI (often pronounced "skuzzy") stands for Small Computer System Interface (or Interconnect), and it's been around for a while. It used to be common in desktop computers and workstations before the PATA standard gained popularity. SCSI has outgrown the "small". You'll still see it in the enterprise world connecting older HDDs and even things like tape drives. It went through several generations with different bus speeds, bus widths and connectors (with names like Fast, Wide and Ultra). It's a parallel data connection, so it sends 1 bit across each wire simultaneously, reaching speeds of up to 320MB/s in its final Ultra-320 generation.

SAS is typically what you'll see for spinning disks in a datacenter today. SAS stands for Serial Attached SCSI. Just as PATA vs SATA came down to parallel vs serial communication, so does SCSI vs SAS. SAS works very similarly to SCSI, but it's a serial connection, so it sends 1 bit at a time and does it really fast. The SAS-4 specification is expected to have a maximum speed of 22.5Gb/s (that's gigabits, not gigabytes, per second).
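Since bits and bytes are easy to mix up, here's a small sketch that puts the bus speeds from this post on the same scale and estimates how long each would take to move 100GB. It uses decimal units (1GB = 1000MB), the SAS-4 figure is the raw line rate divided by 8 (before any encoding overhead, so the real number is lower), and none of this accounts for what the disk itself can sustain.

```python
# Maximum bus speeds quoted in this post, in megabytes per second.
INTERFACE_MB_PER_S = {
    "PATA": 133,
    "SATA I": 150,
    "SATA II": 300,
    "Ultra-320 SCSI": 320,
}
SAS4_LINE_RATE_GBIT_S = 22.5  # gigaBITS per second on the wire


def gbit_to_mbyte_per_s(gbit_s):
    """Raw conversion from gigabits/s to megabytes/s (8 bits per byte)."""
    return gbit_s * 1000 / 8


def transfer_seconds(size_gb, mb_per_s):
    """Best-case time to move size_gb gigabytes at the given bus speed."""
    return size_gb * 1000 / mb_per_s


if __name__ == "__main__":
    speeds = dict(INTERFACE_MB_PER_S)
    # Upper bound only: encoding overhead is ignored here.
    speeds["SAS-4 (raw line rate)"] = gbit_to_mbyte_per_s(SAS4_LINE_RATE_GBIT_S)
    for name, mb_per_s in speeds.items():
        seconds = transfer_seconds(100, mb_per_s)
        print(f"{name}: {mb_per_s:.1f} MB/s, 100 GB in about {seconds:.0f} s")
```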

Honestly, the interfaces and protocols can be their own in-depth topics, so I'm just trying to skim the surface of those. I hope that this post helped you understand more about computer storage in general and how far such an "old" technology has come. I'm also writing about Solid State Storage, and while it's easy to get caught up in how much faster and more energy efficient flash is, magnetic storage still has a place in the computing world. For example, for a backup solution, you're generally better off with spinning disks for large amounts of backup data. HDDs offer more capacity for the money, which allows them to be "cheap and deep" for the storage situations that don't require high performance.
