This post has been thrown at me a couple of times now… so I’ll now take the time to go through it… and try to address the junk.
It starts by suggesting that “the Germans” have started a war… but the next sentence points out that the author tossed grenades at HANA two months before the start he suggests. It also ignores the fact that the HANA post in question was a response to incorrect public statements by a Microsoft product manager about HANA (here).
The author suggests some issue with understanding clustered indexes… Note that “There are 2 implementations of xVelocity columnstore technology: 1. Non clustered index which is read only – this is the version available in SMP (single node) SQL Server 2012. 2. Columnstore as a clustered index that is updateable – This is the version available in MPP or PDW version of SQL 2012.”. The Microsoft documentation I read did not distinguish between the two and so I mistakenly attributed features of one to the other. Hopefully this clears up the confusion.
He suggests that the concept of keeping redundant versions of the data… one for OLTP and one for BI is “untrue”… I believe that the conventional way to deal with OLTP and BI is to build separate OLTP and BI databases… data warehouses and data marts. So I stand by the original comment.
The author rightfully suggests that I did not provide a reference for my claim that there are odd limitations to the SQL that require hand-coding… here they are (see the do’s and dont’s).
He criticizes my statement that shared-nothing gave us the basis for solving “big data”. I do not understand the criticism? Nearly very large database in the world is based on a shared-nothing architecture… and the SQL Server PDW is based on the same architecture in order to allow SQL Server to scale.
He is critical of the fact that HANA is optimized for the hardware and suggests that HANA does not support Intel’s Ivy Bridge. HANA is optimized for Ivy Bridge… and HANA is designed to fully utilize the hardware… If we keep it simple and suggest that using hardware-specific instruction sets and hardware-specific techniques to keep data in cache together provide a 50X performance boost [This ignores the advantages of in-memory and focusses only on hw-specific optimizations... where data in cache is either 15X (L3) or 20X (L2) or 200X (L1) faster than data fetched from DRAM... plus 10X or more using super-computer SIMD instructions], I would ask… would you spend 50X more for under-utilized hardware if you had a choice? SAP is pursuing a distinct strategy that deserves a more thoughtful response than the author provided.
He accuses me of lying… lying… about SQL being architected for single-core x286 processors. Sigh. I am unaware of a rewrite of the SQL Server product since the 286… and tacking on support for modern processors is not re-architecting. If SQL Server was re-architected from scratch since then I would be happy to know that I was mistaken… but until I hear about a re-write I will assume the SQL Server architecture, the architecture, is unchanged from when Sybase originally developed it and licensed it to Microsoft.
He says that HANA is cobbled together from older piece parts… and points to a Wikipedia page. But he does not use the words in the article… that HANA was synthesized from other products and , as stated in the next sentence, built on: “a new application architecture“. So he leaves the reader to believe that there is nothing new… he is mistaken. HANA is more than a synthesis of in-memory, column-store, and shared-nothing. It includes a new execution engine built on algorithms from the search space… columns in the column store are processed as vectors rather than the rote tuple-by-tuple approach from the 1980′s. It includes powerful in-database support for procedural languages with facilities that convert loops to fully parallel set-based processes. It provides, as noted above, a unique approach to supporting OLTP and BI queries in the same instance (see here)… and more. I’m not trying to hype HANA here… time and the market will determine if these new features are important… but there is no doubt that they are new.
I did not find the Business Intelligist post to be very informative or helpful. With the exception of the Wikipedia article mentioned above there is only unsubstantiated opinion in the piece… … and a degree of rudeness that is wholly uncalled for.
I posted a blog on the SAP site here that discussed the implications of mobile clients. I want to re-emphasize the issue as it is crucial.
While at Greenplum we routinely replaced older EDW platforms and provided stunning performance. I recall one customer in particular where we were given a query that ran in 7 hours and Greenplum executed the query in seven seconds. This was exceptional… more typical were cases where we reduced run-times from several hours to under 30 minutes… to 10 minutes… to 5 minutes. I’m sure that every major competitor: Teradata, Greenplum, Netezza, and Exadata has similar stories to tell.
But 5 minutes will not cut it if you are servicing a mobile client where sub-second response to the device is a requirement… and 10 minutes is out of the question. It does not matter if it ran in 10 hours before… 10 minute response is not acceptable to a mobile device.
Today we see sub-second response delivered to our phones by custom applications built on special high-performance platforms designed specifically to service a mobile client: iPhones, iPads, and Android devices.
But what will we do about the BI applications built on commercial platforms which have just used every trick in the book to become one of the 5 minute stories mentioned above?
I think that there are only a couple of architectural choices.
- We can rewrite the high-value queries as custom applications using specialized infrastructure… at great expense… and leaving the vast majority of queries un-serviced.
- We can apply the 80/20 rule to get the easiest queries serviced with only 20% of the effort. But according to Murphy the 20% left will be the highest value queries.
- We can tack on expensive, specialized, accelerators to some queries… to those that can be accelerated… but again we leave too much behind.
- Or we can move to a general purpose high performance computing platform that can service the existing BI workload with sub-second response.
In-memory computing will play a role… Exalytics provides option #3… HANA option #4.
SSD devices may play a role… but the performance improvements being quoted by vendors who use SSD as a block I/O device is 10X or less. A 10X improvement applied to a query that was just improved to 10 minutes yields a 1 minute query… still not the expected level of service.
IT departments will have to evaluate the price/performance, not just the price, as they consider their next platform purchases. The definition of adequate response is changing… and the old adequate, at the least cost, may not cut it. Mobile clients are here to stay. The productivity gains expected from these devices is significant. High performance BI computing is going to be a requirement.
When I was at Greenplum… and now again at SAP… I ran into a strange logic from Teradata about query concurrency. They claimed that query concurrency was a good thing and an indicator of excellent workload management. Let’s look at a simple picture of how that works.
In Figure 1 we depict a single query on a Teradata cluster. Since each node is working in parallel the picture is representative no matter how many nodes are attached. In the picture each line represents the time it takes to read a block from disk. To make the picture simple we will show I/O taking only 1/10th of the clock time… in the real world it is slower.
Given this simplification we can see that a single query can only consume 10% of the CPU… and the rest of the time the CPU is idle… waiting for work. We also represented some I/O to spool files… as Teradata writes all intermediate results to disk and then reads them in the next step. But this picture is a little unfair to Greenplum and HANA as I do not represent spool I/O completely. For each qualifying row the data is read from the table on disk, written to spool, and then read from spool in the subsequent step. But this note is about concurrency… so I simplified the picture.
Figure 2 shows the same query running on Greenplum. Note that Greenplum uses a data flow architecture that pushes tuples from step to step in the execution plan without writing them to disk. As a result the query completes very quickly after the last tuple is scanned from the table.
Let me say again… this story is about CPU utilization, concurrency, and workload management… I’m not trying to say that there are not optimizations that might make Teradata outperform Greenplum… or optimizations that might make Greenplum even faster still… I just want you to see the impact on concurrency of the spool architecture versus the data flow architecture.
Note that on Greenplum the processors are 20% busy in the interval that the query runs. For complex queries with lots of steps the data flow architecture provides an even more significant advantage to Greenplum. If there are 20 steps in the execution plan then Teradata will do spool I/O, first writing then reading the intermediate results while Greenplum manages all of the results in-memory after the initial reads.
In Figure 3 we see the impact of having the data in-memory as with HANA or TimeTen. Again, I am ignoring the implications of HANA’s columnar orientation and so forth… but you can clearly see the implications by removing block I/O.
Now let’s look at the same pictures with 2 concurrent queries. Let’s assume no workload management… just first in, first out.
In Figure 4 we see Teradata with two concurrent queries. Teradata has both queries executing at the same time. The second query is using up the wasted space made available while the CPUs wait for Query 1’s I/O to complete. Teradata spools the intermediate results to disk; which reduces the impact on memory while they wait. This is very wasteful as described here and here (in short, the Five Minute Rule suggests that data that will be reused right away is more economically stored in memory)… but Teradata carries a legacy from the days when memory was dear.
But to be sure… Teradata has two queries running concurrently. And the CPU is now 20% busy.
Figure 5 shows the two-query picture for Greenplum. Like Teradata, they use the gaps to do work and get both queries running concurrently. Greenplum uses the CPU much more efficiently and does not write and read to spool in between every step.
In Figure 6 we see HANA with two queries. Since one query consumed all of the CPU the second query waits… then blasts through. There is no concurrency… but the work is completed in a fraction of the time required by Teradata.
If we continue to add queries using these simple models we would get to the point where there is no CPU available on any architecture. At this point workload management comes into play. If there is no CPU then all that can be done is to either manage queries in a queue… letting them wait for resources to start… or start them and let them wastefully thrash in and out… there is really no other architectural option.
So using this very simple depiction eventually all three systems find themselves in the same spot… no CPU to spare. But there is much more to the topic and I’ve hinted about these in previous posts.
Starting more queries than you can service is wasteful. Queries have to swap in and out of memory and/or in and out of spool (more I/O!) and/or in and out of the processor caches. It is best to control concurrency… not embrace it.
Running virtual instances of the database instead of lightweight threads adds significant communications overhead. Instances often become unbalanced as the data returned makes the shards uneven. Since queries end when the slowest instance finishes it’s work this can reduce query performance. Each time you preempt a running query you have to restore state and repopulate the processor’s cache… which slows the query by 12X-20X. … Columnar storage helps… but if the data is decompressed too soon then the help is sub-optimal… and so on… all of the tricks used by databases and described in these blogs count.
But what does not count is query concurrency. When Teradata plays this card against Greenplum or HANA they are not talking architecture… it is silliness. Query throughput is what matters. Anyone would take a system that processes 100,000 queries per hour over a system that processes 50,000 queries per hour but lets them all run concurrently.
I’ve been picking on Teradata lately as they have been marketing hard… a little too hard. Teradata is a fine system and they should be proud of their architecture and their place in the market. I am proud to have worked for them. I’ll lay off for a while.
I recently pointed out some silliness published by Teradata to several SAP prospects. There is more nonsense that was sent and I’d like to take a moment to clear up these additional claims.
In their note to HANA prospects they used the following numbers from the paper SAP published here:
|# of Query Streams||
|# of Queries per Hour (Throughput)||
Teradata makes several claims from these numbers. First they claim that the numbers demonstrate a bottleneck that is tied to either the NUMA effect or to the SMP Knee Curve. This nonsense is the subject of a previous blog here.
For any database system as you increase the number of queries to the point where there is contention the throughput decreases. This is just common sense. If you have 10 cores and 10 threads and there is no contention then all threads run at the same speed as fast as possible. If you add an 11th thread then throughput falls off, as one thread has to wait for a core. As you add more threads the throughput falls further until the system is saturated and throughput flattens. Figure 1 is an example of the saturation curve you would expect from any system as the throughput flattens.
There are some funny twists to this, though. If you are an IMDB then each query can use 100% of a core. If you are multi-threaded IMDB then each query can use 100% of all cores. If you are a disk-based system then you give up the CPU to another query while you wait for I/O… so throughput falls. I’ll address these twists in a separate blog… but you will see a hint at the issue here.
Teradata claims that these numbers reflect a scaling issue. This is a very strange claim. Teradata tests scaling by adding hardware, data, and queries in equal amounts to see if the query performance holds constant… or they add hardware and data to look for a correlation between the number of nodes and query performance… hoping that as the nodes increase the response time decreases. In fact Teradata scales well… as does HANA… But the hardware is constant in the HANA benchmark so there is no view into scaling at all. Let me emphasis this… you cannot say anything about scaling from the numbers above.
Teradata claims that they can extrapolate the saturation point for the system… this represents very bad mathematics. They take the four data points in the table and create an S curve like the one in Figure 1… except they invert it to show how throughput decreases as you move towards the saturation point… Figure 2 shows the problem.
If you draw a straight line through the curve using any sort of math you miss the long tail at the end. This is an approximation of the picture Teradata drew… but even in their picture you can see a tail forming… which they ignore. It is also questionable math to extrapolate from only four observations. The bottom line is that you cannot extrapolate the saturation point from these four numbers… you just don’t know how far out the tail will run unless you measure it.
To prove this is nonsense you just have to look here. It turns out that SAP publicly published these benchmark results in two separate papers and this second one has numbers out to 60 streams. Unsurprisingly at 60 streams HANA processed 112,602 queries per hour while Teradata told their customers that it would saturate well short of that… at 49,601 queries (they predicted that HANA would thrash and the number of queries/hour would fall back… more FUD).
Teradata is sending propaganda to their prospects with scary extrapolations and pronouncements of architectural bottlenecks in HANA. The mathematics behind their numbers is weak and their incorrect use of deep architectural terms demonstrates ignorance of the concepts. They are trying to create Fear, Uncertainty, and Doubt. Bad marketing… not architecture, methinks.
One more surprise…
In the past SAP applications have, in general, avoided using database features. Even a SELECT with a projection was out-of-bounds. They did not want to depend on any database, so they tended to pull all data from the data layer to the application layer and loop through the data using procedural languages like ABAP. You might say that they were religiously database agnostic. My mistake… you might say that we were religiously database agnostic. I have to get used to these new surroundings.
Besides the obvious attributes of HANA: in-memory, shared-nothing, MPP, and column-oriented… the aim is to move the application logic next to the data and into HANA.
Any of you who have labored to convert procedural code into set-based SQL will understand the issue here. There are hundreds of thousands or millions of lines of procedural code… often very simple loops… that have to be converted to SQL to make the HANA architecture support the SAP application portfolio.
The surprise is not that there is this outstanding issue.. nor is it the ambitious architecture designed to push the application deep into the database (we are not talking about SQL-based stored procedures… we are talking about the application). The surprise is that the HANA development team has built a state-of-the-art facility that programmatically converts procedural logic into its set-based equivalent (not necessarily into SQL but sometimes into a language that can execute in-parallel). This is not a tool requiring manual intervention… it is an automatic, mathematically provable, transformation.
Right now the technique is used to covert logic in stored-procedures and in ABAP. But I hope to see it applied in the optimizer to convert those ugly Oracle cursor loops on-the-fly.
You can read more here.
By the way… SAP will continue to support ABAP using the database as a file server… moving all of the data from the database server to the application server for processing. But you can imagine that… when running applications that use this powerful capability… over time HANA will emerge with a huge performance advantage over other databases…
Oracle should be worried.
I found myself wondering where did the rule-of-thumb for Exalytics that suggests that TimesTen can use 800GB of a 1TB memory space… and requires 400GB of that space for work tables leaving room for 400GB of user data… come from (it is quoted everywhere… here is an example… see question #13).
Sure enough, this rule has been around for a while in the TimesTen literature… in fact it predates Exalytics (see here).
Why is this important? The workspace per query for a TPC-A transaction is very small and the amount of time the memory is held by a TPC-A transaction is very short. But the workspace required by a TPC-H query is at least 10X the space required by a TPC-A query and the duration of a TPC-H query is at least 10X the duration of a TPC-A query. The result is at least 100X more pressure on memory utilization.
So… I suspect that the 600GB of user data I calculated here may be off by more than a little. Maybe Exalytics can support 300GB of user data or 100GB of user data or maybe 60GB?
As a side note… it is always important to remember that the pressure on memory is the amount of memory utilized times the duration of the utilization. This is why the data flow architecture used in modern databases like Greenplum are effective. Greenplum uses more memory per transaction but it holds the memory for less time by never (almost) writing it to disk. This is different from older database architectures like Teradata and Oracle which use disk to store intermediate results… lowering the overall amount of memory required but increasing the duration of the query. More on this here…
I’ve been trying to sort through the noise around Exalytics and see if there are any conclusions to be drawn from the architecture. But this post is more about the noise. The vast majority of the articles I’ve read posted by industry analysts suggest that Exalytics is Oracle‘s answer to SAP‘s HANA. See:
But I do not see it?
Exalytics is a smart cache that holds a redundant copy of aggregated data in memory to offload aggregate queries from your data warehouse or mart. The system is a shared-memory implementation that does not scale out as the size of the aggregates increase. It does scale up by daisy-chaining Exalytics boxes to store more aggregates. It is a read-only system that requires another DBMS as the source of the aggregated data. Exalytics provides a performance boost for Oracle including for Exadata (remember, Exadata performs aggregation in the RAC layer… when RAC is swamped Exalytics can offload some processing).
HANA is a fully functional in-memory shared-nothing columnar DBMS. It does not store a copy of the data.. it stores the data. It can be updated. HANA replaces Oracle… it does not speed it up.
I’ll post more on Exalytics… and on HANA… but there is no Exalytics vs. HANA competition ahead. There will be no Exalytics vs. HANA POCs. They are completely different technologies solving different problems with the only similarity being that they both leverage the decreasing costs of RAM to eliminate the expense of I/O to disk or SSD devices. Don’t let the common phrase “in-memory” confuse you.