The Vast Expansions of Hardware

At the Small Data conference recently, one of the talks looked at hardware advances. It was interesting to see a data perspective on hardware changes, as many of us only worry about the results of hardware: can I get my data quickly? In or out, most of us are more often worried about performance than specs. However, today I thought it might be fun to look at a few changes and numbers to get an idea of how our hardware has changed in the march towards dealing with more and more data. Big data, anyone?

In thinking about disks, I saw a chart that looked at the changes from HDD (hard disk drives) to SSD (solid state drives) to NVMe (Non-Volatile Memory Express). It showed read speeds going through the list from 80MB/s to 200MB/s to 5000+MB/s. That's a dramatic change, and not one found only in high-end arrays. There are off-the-shelf drives you can put in a desktop that read this fast. Compare that to some of the early IBM drives, which read at 8800b/s. Inside the timeline of our careers, disk read speed has grown by a few orders of magnitude.

Write speed hasn't grown as much, but capacity has. My early career work used HDDs with a 100MB capacity. These days we can get TB-range storage on all of these media, with many laptops having 0.5TB or more. Desktops often have plenty more. My current workstation at home has 3.5TB of storage. Contrast that with the early IBM drive mentioned above, which had 5MB. These days people regularly demo queries against hundreds of TB, or even 1PB, in a database.

Many of us just expect the network to work well. In fact, I assume many of us won't complain to network people since they are never at fault for performance issues. I started my career with ARCnet connections between machines. Those ran at 2.5Mbps. We were moving those and 4Mbps Token Ring networks to Ethernet at 10Mbps with Thicknet, Thinnet, and eventually RJ-45 connections.
When we got 100Mbps bridges, I thought we were cutting edge for our SQL Server Central servers. If we look back 20 years, 1Gbps was more the standard, but today we see growth up into the 800Gbps range with InfiniBand. While I don't know many data centers doing that, there are plenty running in the 50Gbps range.

If we think about CPUs, I started my career on a 386 machine running at 25MHz. I helped upgrade some 286 machines, but most of our servers were 486-class machines at 25MHz. I still remember being excited about the early Pentium processors for a large system. There were many Pentium variants and later families of processors, but back in the 2000s, almost all machines were single-core. Then the first multi-core chips were released and slowly became more common over time. These days, many new laptops have multiple cores, including the new one I got, which has 12 cores. If you want, you can purchase an AMD Epyc 9004 processor with 96 cores. That's on one chip. Since most servers can take more than one CPU, you can have hundreds of cores running if you want. If you want to get really crazy, the Nvidia Blackwell has thousands of cores for GPU-based AI calculations.

Memory has likewise grown, though it seems most servers have much less than a TB of RAM, which is much lower growth over time than storage and networking have seen. Maybe because of those two changes, memory has had less of a reason to grow into common multi-TB capacities in our systems. In fact, for you reading this, what are the common memory sizes you have in servers? I see many VMs and other machines set up with somewhere between 128GB and 1TB of memory, even as their data sizes have grown much, much larger. However, there are plenty that don't have anything near 128GB.

That was one of the interesting things I realized about the Small Data conference, and one reason the event was created. Most of our data sets, especially usable sets, and most of our queries can run on a laptop, if not a mobile device.
The focus on big data seems overblown, especially as most of our companies don't have anything approaching 100TB, much less 1PB. If you need it, there is hardware out there for you, but some of the amazing advances made over time are lost on me, as the common, average capabilities on the majority of systems could handle the majority of my needs, given some well-written queries.

Steve Jones - SSC Editor