Greenplum and Teradata: Simliar Architecture, Different Strategies
Hardware systems, servers and network fabric, provide the foundation upon which all shared-nothing database management systems rest. Hardware systems are a major contributor to the overall price/performance and total cost of ownership for a data warehouse platform. This blog considers the hardware strategies of EMC/Greenplum: applying the idea of using common, off-the-shelf (COTS) components to build a competitive foundation; and Teradata: developing a proprietary hardware system by tightly integrating components.
The Teradata hardware strategy is simple to describe. They expend R&D dollars to couple low-level technologies into a tightly integrated system. Their servers are custom-designed within a set of guidelines that allows both the LINUX and Microsoft Windows operating systems to execute there. Their network fabric is highly proprietary, using cycles within the fabric to offload sort/merge data processing from the server CPU.
In other words, Teradata believes that the time and effort required to engineer an integrated proprietary offering will improve the performance of their offering enough to offset the cost.
EMC and Greenplum have taken a different approach. They have elected a strategy that leverages off-the-shelf servers offered by hardware vendors like Dell or HP, and network switches from vendors like Brocade, Arista, and Cisco. They have elected to expend few dollars on hardware design and development and to leverage the R&D investments made by these other vendors. In other words, Greenplum believes that the advantages in price and performance provided by using off-the-shelf hardware provides a sustainable advantage.
The lower costs associated with Greenplum’s strategy clearly provide an advantage. Greenplum does not have to expend to design and manufacture custom hardware. The manufacturing costs may not be significant, but the staff costs required by the Teradata strategy must affect the price. Clearly the Greenplum strategy provides an advantage on the price side.
The Teradata strategy has to be about performance… so lets speculate:
- How much of a performance increase might their integration provide on the server-side?
- How much of a performance increase might their integration provide on the network side?
In the days before there was a microprocessor based enterprise server market, Teradata could gain substantially here. Microprocessors were built for personal computing and not designed for the high-availability and high-performance requirements in an enterprise. Teradata had much to gain from building rather than buying server.
But today, there is little to gain from a highly customized design. The requirement to run standard LINUX and Windows operating systems limit their ability to innovate and the resulting servers have to be very similar to those built for off-the-shelf enterprise servers. There is little or no performance advantage here.
On the network side, there once was a distinct advantage to Teradata’s ByNet. It was both faster than available off-the shelf switches and it offloaded cycles from the under-powered CPU. Today, however, there are plenty of cheap, fast switches… so the speed advantage has disappeared. Worse still, the introduction of multi-core CPUs have eliminated the advantage of the in-the-switch sort/merge that makes ByNet unique. CPU is inexpensive these days.
The bottom line: it is unclear if the Teradata hardware strategy affords them a performance advantage.
Cost of Ownership
An argument could be made that supporting COTS hardware is inherently less expensive than supporting a Teradata cluster. But there is a more substantial savings that is clear.
Every 2-3 years, as newer Teradata technology obsoletes your currently installed cluster the value of the current hardware goes to zero and the cost of ownership goes up significantly. The costs of this are especially high when you are required to add several nodes to accommodate growth as Teradata refreshes their technology. You may have to buy servers that are already obsolete.
With Greenplum, your current cluster is built from general-purpose servers that are re-purposed with ease. In fact, since the nodes in a Greenplum cluster are usually high-end servers, customers often cycle new technology into their data warehouse and cycle the old servers out into their server farm. The result is a higher performance warehouse and full use of all of the server technology.
A Final Word
The words “proprietary hardware” are sometimes thrown around as an insult. But Teradata’s proprietary approach is based on the belief that a tightly integrated configuration adds benefit to offset the costs. Greenplum believes that today the enterprise server and the network switch vendors have matured their products to the point where off-the-shelf technology can match or exceed the performance of custom hardware… at a significantly reduced cost. You may have an opinion or you may wait to see how the benchmarks, the proof-of-concepts, and the market decide… but its interesting to understand the differing approaches.