I renamed this so that Teradata folks would not get here so often… it's not really about Intelligent Memory… just prompted by it. The post on Intelligent Memory is here. – Rob
Two quick comments on Teradata’s recent announcement of Intelligent Memory.
First… very very cool. More on this to come.
Next… life is going to become very hard for my readers and for bloggers in this space. The notion of an in-memory database is becoming rightfully blurred… as is the notion of column store.
Oracle blurs the concepts with phrases like "database in-memory" and "hybrid columnar compression"… which describe neither an in-memory database nor a column store.
Teradata blurs the concept with a strong offering that uses DRAM as a block-IO device (like the old RAM-disks we used to configure on our PCs).
Teradata and Greenplum blur the idea of a column store by adding columnar tables over their row store database engines.
I’m not a fan of the double-speak… but the ability of companies to apply the 80/20 rule to stretch their architectures and glue on new advanced technologies is a good thing for consumers.
But it becomes very hard to distinguish the products now.
In future blogs I’ll try to point out differences… but we’ll have to go a little deeper into the Database Fog.
In many of my posts I refer to the issues associated with building “extra” data structures to meet performance goals (see one of my first posts ever here). These extra structures are always a trade-off… slowing the performance of one function in order to speed up another. I thought that it might be helpful to be very clear about where I stand on this.
Indexes improve the performance of queries that address a small set of data. They can also improve join performance if your favorite optimizer can apply an index intersection to the execution plan for your queries. But indexes dramatically slow the performance of inserts, updates, and bulk data loads, as they have to be maintained when data changes. You can mitigate the cost by updating indexes in the background… but the trade-off does not go away. Indexes are probably required for OLTP applications that pick out single rows.
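To make the trade-off concrete, here is a minimal sketch using SQLite from Python… the table and row counts are invented, but the shape of the result holds for any RDBMS: the index pays off on the selective read and taxes every load.

```python
# A minimal sketch of the index trade-off. The index speeds up the
# selective query but must be maintained on every subsequent insert.
import sqlite3, time, random

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (id INTEGER, amount REAL)")

rows = [(random.randint(1, 1_000_000), random.random()) for _ in range(200_000)]

t0 = time.perf_counter()
con.executemany("INSERT INTO sales VALUES (?, ?)", rows)
print("load, no index:   ", time.perf_counter() - t0)

con.execute("CREATE INDEX ix_sales_id ON sales (id)")

t0 = time.perf_counter()
con.executemany("INSERT INTO sales VALUES (?, ?)", rows)  # index maintained per insert
print("load, with index: ", time.perf_counter() - t0)

t0 = time.perf_counter()
con.execute("SELECT SUM(amount) FROM sales WHERE id = 42").fetchone()  # small set: index wins
print("selective query:  ", time.perf_counter() - t0)
```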
Wouldn’t it be great if your favorite DBMS could resolve every query very fast without the overhead and operational effort associated with maintaining indexes? Certainly we should aspire to a read-optimized database, a data warehouse DBMS, that does not require indexes.
Vertica projections provide an optimized, materialized view that improves the performance of a set of queries. The Vertica optimizer automatically selects the optimal projection, and Vertica provides a very slick tool that builds projections based on the query set provided. I worded my post on Vertica a little vaguely… so let me be clear here: every Vertica query runs against a projection… so it is possible to have only one. In that case there is no additional overhead. Adding projections slows the data load process and increases the storage requirements. This is the trade-off.
Other databases offer materialized views. They make the same trade-off as above.
An OLAP cube is a physical structure that pre-aggregates data so that your query workload can avoid the aggregation. The best implementations express the cube as a materialized view so that queries can use the pre-aggregated data without explicitly pointing at the cube structure… the optimizer picks it for you. In addition, the best implementations let you drill out of the cube to the detail records. These products have the update/delete/load issues of an index, plus an extra data-latency issue, as the data has to be aggregated on some interval… usually hours or days. Many products do not allow joins from a cube. You can see the trade-off. The Oracle Exalytics product materializes the aggregated cube in-memory on a separate server. This provides even more performance, but adds the system and operational overhead of moving data across system boundaries.
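A toy version of the cube trade-off, with invented tables, to make the latency point plain: queries against the pre-aggregate are cheap, but the pre-aggregate goes stale the moment new detail arrives.

```python
# Pre-aggregation in miniature: the "cube" avoids aggregation at query
# time but lags the detail data until the next rebuild.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE detail (region TEXT, product TEXT, amount REAL)")
con.executemany("INSERT INTO detail VALUES (?, ?, ?)",
                [("east", "widget", 10.0), ("east", "gadget", 5.0),
                 ("west", "widget", 7.5)])

# the "cube": pre-aggregated on the region x product dimensions
con.execute("""CREATE TABLE cube AS
               SELECT region, product, SUM(amount) AS amount
               FROM detail GROUP BY region, product""")

# fast: read the answer straight from the pre-aggregate
print(con.execute(
    "SELECT amount FROM cube WHERE region='east' AND product='widget'").fetchone())

# new detail arrives... the cube is now stale until it is rebuilt
con.execute("INSERT INTO detail VALUES ('east', 'widget', 2.0)")
print(con.execute(
    "SELECT SUM(amount) FROM detail WHERE region='east' AND product='widget'").fetchone())
```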
Wouldn't it be nice if you could query raw data and perform aggregation so fast that, even against terabytes of data, you could run any query with a response of 3 seconds or less… without the overhead of building cubes?
You may build specialized table structures and pre-join, pre-aggregate, or pre-compute data to make a set of queries run fast. The cost of building and maintaining this sort of implementation, versus just querying the base tables, is the trade-off. Further, this approach is something of a trap. You cannot build these structures for every query… and if you did, the business would conceive another critical query the next day that required the same work.
You can add indexes to the structures built using the technique above and provide very fast application-specific performance to a small set of queries. This is currently the favored approach when companies build iOS or Android apps as it provides the best possible performance… at a significant price.
Wouldn't it be great if this were unnecessary… if you could just scan so fast that mobile response service levels could be met from the base data regardless of the query?
You can deploy redundant data in operational data stores, data marts, cube servers, analytic data stores, and so on… with each specialized store providing performance for some limited set of queries at the cost of ongoing development and support. Each of these copies could deploy a specialized database product that speeds up its set of queries a little more. Again, this surround-the-EDW approach is a trap that leads to the proliferation of data marts and of database technologies.
Please do not take that last paragraph the wrong way… I believe that the worst possible approach is to blindly standardize on one or two database products. This trade-off makes life convenient for the IT department at the expense of performance and agility in the business. It is OK to have one or two favored products, but IT must always serve the business to the best of its ability as a first priority… and sometimes the new start-up has just the thing (remember that Teradata was once a start-up, when DB2 on the mainframe was the IT standard…).
What I wish is that one or two products could solve all of the performance and functionality problems without the cost of building "extra" stuff… one product would be better than two. I like products that make the extra stuff "free". Netezza does a nice job of making zone maps "free", for example. Teradata and Greenplum provide the option of row store or column store for "free". Vertica automatically builds extra projections for "cheap"… and while there is a cost to the projection, it at least does not require staff to tune it up. Oracle materialized views are "cheap".
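To illustrate why zone maps come close to "free", here is a rough sketch of the idea as I understand it… not Netezza's actual implementation: the min/max per block falls out of the load for nearly nothing, and scans then skip blocks that cannot qualify, with no DBA tuning involved.

```python
# Zone-map pruning in miniature: keep min/max per block as a side effect
# of loading, then skip blocks that cannot contain the predicate value.
from dataclasses import dataclass

@dataclass
class Block:
    lo: int      # min key in the block, captured at load time
    hi: int      # max key in the block
    rows: list   # the block's rows: (key, payload)

def scan(blocks, key):
    hits = []
    for b in blocks:
        if key < b.lo or key > b.hi:
            continue                            # prune: block cannot hold the key
        hits.extend(r for r in b.rows if r[0] == key)
    return hits

blocks = [
    Block(1, 99,    [(5, "a"), (42, "b"), (99, "c")]),
    Block(100, 199, [(150, "d")]),              # pruned for key=42
    Block(200, 299, [(242, "e")]),              # pruned for key=42
]
print(scan(blocks, 42))                         # touches only the first block
```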
What I dislike are products that require DBAs to work harder and harder, applying all of the techniques above to meet performance SLAs. Each of these techniques buys performance at the price of development and operational expense.
As I have noted before… the performance SLAs for BI are about to become severe as companies try to support BI on mobile devices. The development and operational costs of tuning up… that is, the TCO… will be significant unless better, faster software infrastructure becomes available.
Consider a DBMS that could eliminate these extra constructs, eliminate the cost of developing and maintaining them, and eliminate the architectural fragility these approaches imply… a DBMS that holds only base data and can still satisfy every query in seconds, delivering the business agility this implies. The TCO would be compelling.
I actually believe that the answer is available in the market today… this is no longer a pipe dream… more later…
The shared-nothing architecture has, from the beginning, offered the promise of using hardware to solve performance problems rather than applying staff and tuning. By this I mean… if you can add nodes and scale out to improve query response, then why not throw hardware at performance problems rather than build a fragile infrastructure of aggregate tables, cubes, pre-joined/de-normalized marts, materialized views, indexes, etc.? Each of these performance workarounds is both expensive to build and expensive to operate.
There are several reasons, I think, why tuning has been more popular than scaling. In no particular order:
First, hardware vendors made it too hard to order and provision new nodes. You could not just press a button and buy capacity. Vendors wanted to charge you for terabytes when all you wanted might be CPU and memory to fix the problem (see here, sigh). You had to negotiate a deal with a rep, work through your procurement group, and wait weeks for delivery. Then, the hardware you had might not match the hardware for sale. New models could not be mixed with old nodes… so you had to consider a whole new cluster. The process was so not-agile. There have been attempts to fix this… and some of them are credible… but none are popular.
Next, the process to install the new nodes was moderately difficult… not rocket science but not seamless to be sure. Data had to move. Backups had to be reconfigured and sometimes old backups could not be easily restored to the new configuration. There was no easy way to burn in the new hardware and if it failed early there were issues reversing the process. It just was not considered an everyday operational process… it was the exception and that made it tough. This process too has improved over time but it never became a no-brainer.
Finally, buying hardware is a capital expense (CAPEX). Even if you had to pay more in people costs to do the hard work of tuning, those were operational expenses… and funding was easier to get.
Redshift changes the game here. Even if the Paraccel database is just OK (see here)… and if the overhead of running in the virtualized AWS environment makes it worse… it is still OK. You can provision new hardware in a couple of minutes. If Teradata is 25% faster than Paraccel for your query set… so what? You can add 25% more Redshift for a fraction of the extra cost of Teradata. Need more performance? Dial it in. Need permission? No problem because it is all OPEX dollars.
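To show what "dial it in" looks like, here is a sketch using Amazon's boto3 SDK… the cluster name and node count are invented, and a resize still runs in the background for a while, but the point stands: capacity becomes an API call, not a procurement cycle.

```python
# "Dial it in": adding Redshift capacity is an API call. Assumes AWS
# credentials are configured; cluster identifier and size are made up.
import boto3

redshift = boto3.client("redshift")

# add capacity to an existing (hypothetical) cluster
redshift.modify_cluster(
    ClusterIdentifier="my-warehouse",
    NumberOfNodes=8,                  # up from, say, 6
)

# poll until the new configuration is available
waiter = redshift.get_waiter("cluster_available")
waiter.wait(ClusterIdentifier="my-warehouse")
```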
Redshift will deliver the flexibility to make scaling it out less expensive than tuning it out. The TCO reductions from running a simple system, where hardware solves performance problems instead of ETL and staff, will be significant. This is how it always should have been.
The issue for Redshift will be… given the trend to reduce the data latency from operations to BI… can you move significant amounts of data from on-premise into the cloud fast enough to meet service level agreements?
Do not overlook Redshift… Amazon could be a player in the EDW space… But look for other databases to make inroads here as well. In-memory databases could work well in the cloud as they avoid some of the hardware abstraction required to access disks.
I posted a blog on the SAP site here that discussed the implications of mobile clients. I want to re-emphasize the issue as it is crucial.
While at Greenplum we routinely replaced older EDW platforms and provided stunning performance. I recall one customer in particular where we were handed a query that ran in seven hours… Greenplum executed it in seven seconds. That was exceptional… more typical were cases where we reduced run-times from several hours to under 30 minutes… to 10 minutes… to 5 minutes. I'm sure that every major competitor… Teradata, Greenplum, Netezza, and Exadata… has similar stories to tell.
But 5 minutes will not cut it if you are servicing a mobile client where sub-second response to the device is a requirement… and 10 minutes is out of the question. It does not matter if it ran in 10 hours before… a 10-minute response is not acceptable on a mobile device.
Today we see sub-second response delivered to our phones by custom applications built on special high-performance platforms designed specifically to service a mobile client: iPhones, iPads, and Android devices.
But what will we do about the BI applications built on commercial platforms that have just used every trick in the book to become one of the 5-minute stories mentioned above?
I think that there are only a couple of architectural choices.
- We can rewrite the high-value queries as custom applications using specialized infrastructure… at great expense… and leaving the vast majority of queries un-serviced.
- We can apply the 80/20 rule to get the easiest queries serviced with only 20% of the effort. But according to Murphy the 20% left will be the highest value queries.
- We can tack on expensive, specialized, accelerators to some queries… to those that can be accelerated… but again we leave too much behind.
- Or we can move to a general purpose high performance computing platform that can service the existing BI workload with sub-second response.
In-memory computing will play a role… Exalytics provides the third option… HANA the fourth.
SSD devices may play a role… but the performance improvements being quoted by vendors who use SSD as a block I/O device are 10X or less. A 10X improvement applied to a query that was just improved to 10 minutes yields a 1-minute query… still not the expected level of service.
IT departments will have to evaluate price/performance, not just price, as they consider their next platform purchases. The definition of adequate response is changing… and the old adequate-at-the-least-cost may not cut it. Mobile clients are here to stay. The productivity gains expected from these devices are significant. High performance BI computing is going to be a requirement.
As you may have noticed I’m looking at in-memory databases (IMDB) these days… Here are some quick architectural observations on VMWare‘s SQLFire, Oracle’s Exalytics and TimesTen offerings, and SAP HANA.
It is worth noting up front that I am looking to see how these products might be used to build a generalized data mart or a data warehouse… In other words I am not looking to compare them for special case applications. This is important because each of these products has some extremely cool features that allow them to be applied to application-specific purposes with a narrow scope of data and queries… maybe in a later blog I can try to look at some narrow use-cases.
Further, to make this quick blog tractable I am going to assume that the mart/dw problem to be solved requires more data than can fit on one server node… and I am going to ignore features that let queries access data that resides on disk… in-memory or bust.
Finally I will assume that the SQL dialect supported is sufficient and not drill into details there. I will look at architecture not SQL features…
Simply put, I am going to look at three characteristics:
- Will the architecture support ad hoc queries?
- Does the architecture support scale-out?
- Can we say anything with regards to price/performance expectations?
Exalytics is a smart-aggregate store that sits over an Oracle database to offload aggregate query workload (see my previous post here, or the Rittman Mead post here, which declares: "Oracle Exalytics uses a specially enhanced version of Oracle TimesTen, Oracle's in-memory database, to cache commonly used aggregates used in dashboards, analyses and other BI objects."). Exalytics does not support a scale-out, shared-nothing architecture, but it can scale by adding nodes that hold new aggregate data. Queries access data within the aggregate structure, and it is not possible to join to data off the Exalytics node… so ad hoc is out. Within these limits, which preclude Exalytics from being considered as a general platform for a mart or warehouse, Exalytics provides dictionary-based compression… which should deliver around 5X compression, reducing the amount of memory, and therefore hardware, required.
TimesTen can do more. It is a general RDBMS. But it was designed for OLTP, and I assume that the reason Oracle has not rolled it out as a general-purpose data mart or data warehouse platform has to do with constraints that grow from those OLTP architectural roots. For example, BI queries run longer and require more data than an OLTP query… even with data in-memory, temporary storage is required for each query… and memory utilization is a product of the amount of data required and the amount of time that data has to inhabit memory… so BI queries put far more pressure on an in-memory DBMS. There are techniques to mitigate this… but you have to build them in from the ground up.
I imagine that this is why TimesTen works for Exalytics, though. An OLAP query against a pre-aggregated cube does not graze an entire mart or warehouse. It is contained and "small data" (for my wacky take re: Exalytics and Exadata see here).
TimesTen is not sharded… so scalability is an issue. Oracle gets around this nicely by allowing you to partition data across instances and have the application route queries to the appropriate server. But this approach will not support joins across partitions, so it severely limits scalability in a general-purpose mart or warehouse.
SQLFire is a very interesting new product built on top of GemFire… and therefore mature from the start. SQLFire is more scalable than TimesTen/Exalytics: it supports sharded data in a cluster of servers. But SQLFire has the limitation that it cannot join data across shards (they call them partitions… see here), so it will be hard to support ad hoc queries. It does provide the ability to replicate tables to support some sorts of joins: if, for example, you replicate small dimension tables to coexist with sharded fact tables, all fact-to-dimension joins are supported. This solution is problematic if you have multiple fact tables that must be joined… and replication of data uses more memory… but SQLFire has the foundation in place to become BI-capable over time.
Performance in an in-memory database comes first and foremost from eliminating disk I/O, and all three IMDB products provide this capability. Then performance comes from the efficient use of compression. TimesTen incorporates Oracle's dictionary-based "columnar" compression (I so hate this term… it is designed to make people think that Oracle products are sort-of columnar… but so far they are not). Then performance comes from columnar projection… the ability to avoid touching all of the data in a row to process a query. Neither TimesTen nor SQLFire is a columnar database. Then performance comes from parallel execution. Neither TimesTen nor SQLFire can involve all cores on a single query, to my knowledge.
Price comes from compression as well. The more highly compressed the data is, the less memory is required to store it. Further, if data can be used without decompressing it, then less working memory is required. As noted, TimesTen has a compression capability; SQLFire does not appear to compress data. Neither can use compressed data. Note that 2X compression cuts the amount of memory, and therefore hardware, required in half… 4X cuts it to a quarter… and so on. So this is significant.
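A small sketch of dictionary encoding, with an invented column, to show both effects at once: the codes take far less memory than the values, and a predicate can be applied to the codes without ever decompressing anything.

```python
# Dictionary compression in miniature: store a column as small integer
# codes, then evaluate WHERE state = 'CA' against the codes directly.
states = ["CA", "NY", "CA", "TX", "CA", "NY"]        # raw column values

# build the dictionary once; store only the codes
dictionary = {v: i for i, v in enumerate(sorted(set(states)))}  # {'CA':0,'NY':1,'TX':2}
codes = [dictionary[v] for v in states]              # e.g. an array of 1-byte ints

# translate the literal once, then compare codes... no decompression
target = dictionary["CA"]
matching_rows = [i for i, c in enumerate(codes) if c == target]
print(matching_rows)                                 # [0, 2, 4]
```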
Now for some transparency… I started the research for this blog, and composed a first draft, last Spring while I was at EMC Greenplum. I am now at SAP working with HANA. So… I will not go into HANA at great length… but I will point out that: HANA fully supports a shared-nothing architecture… so it is fully scalable; HANA is fully parallel and able to use all cores for each query; HANA fully supports columnar tables, so it provides deep compression and the ability to use the compressed data in execution. This is not remarkable, as HANA was designed from the bottom up to support both BI and OLTP workloads while TimesTen and SQLFire started from a purely OLTP architectural foundation.
Wikipedia defines computer memory as:
In computing, memory refers to the physical devices used to store programs (sequences of instructions) or data (e.g. program state information) on a temporary or permanent basis for use in a computer or other digital electronic device. The term primary memory is used for the information in physical systems which are fast (i.e. RAM), as a distinction from secondary memory, which are physical devices for program and data storage which are slow to access but offer higher memory capacity. Primary memory stored on secondary memory is called "virtual memory".
The term "storage" is often (but not always) used to refer to traditional secondary memory such as tape, magnetic disks and optical discs (CD-ROM and DVD-ROM). The term "memory" is often (but not always) associated with addressable semiconductor memory, i.e. integrated circuits consisting of silicon-based transistors, used for example as primary memory but also for other purposes in computers and other digital electronic devices.
To a computer program like a DBMS, memory is a resource allocated using commands like malloc() and calloc(). Note that these commands allocate primary memory using the definition above. From this you should conclude that an in-memory DBMS (IMDB) is a system that puts all of its data into memory allocated by the database program.
In their announcements this week Oracle states (here) that Exadata 3 is an in-memory database machine, and Larry Ellison said: "Everything is in memory. All of your databases are in-memory. You virtually never use your disk drives. Disk drives are becoming passe. They're good at storing images and a lot of data we don't access very often."
But their definition of in-memory includes SSD devices that are not directly addressable by the DBMS. In fact they use 22TB of SSDs and 4TB of DRAM. The SSDs are a cache sitting between the DBMS and disk storage. They are storage according to Wikipedia.
Exadata 3 is not an in-memory database machine. It takes more than lots of hardware to make a DBMS an in-memory DBMS.
Oracle is spewing marketing, not architecture.
As you look at the enterprise RDBMS marketplace today you will find something shocking… almost every product in the market is built on designs and concepts that are over thirty years old. IBM's System R grew into DB2 and influenced Oracle before 1980. Ingres, developed before 1980, became Postgres, which in turn spawned Netezza and Greenplum and more. Teradata was a fresh start… around 1980.
This is not a bad thing in its own right… but imagine the hardware architectures these systems were designed and optimized for. Maybe DB2 was built for a multi-core mainframe… maybe Oracle too… maybe. Memory was tiny… so memory management was important and memory was used sparingly. Data sizes were tiny. Consider the fact that Teradata named the company on the belief that someday, way beyond the planning horizon, some customers might get to a terabyte of data.
The reality is that these old designs are inefficient. They have hacked the old code to continuously extend their products. I mean this as a compliment… it is not trivial engineering to find tweaks and tack-ons that make old code work on new hardware architectures. Teradata and Netezza and Greenplum designed ways to use multiple address spaces to take advantage of multiple cores. Oracle tacked a shared-nothing I/O subsystem onto a shared-everything architecture to stretch it.
But these hacks are not efficient.
Yale is working on some new-new stuff (see here). HANA is based on a completely different design (see here). The NoSQL vendors have bent the ACID-tested rules, if not always the fundamental approaches.
I can’t help but believe that in one of these new approaches is a path forward.
If you would like to read some of this history, here is a cool link.
OLAP searches a set of pre-aggregated data… a cube. If the cube is large enough that you don't bump into the edges, you might think that your search is ad hoc… but that is an illusion. The set is prescribed, not ad hoc.
In the 1980s we sent paper reports out… they were moved on a pallet with a fork-lift. The reports aggregated key metrics to many levels in a hierarchy sliced and diced across many dimensions. Today we take the lines off the reports and store them digitally in a cube and provide tools to let users navigate the cube to build their reports. What they build looks, to a large extent, like the reports from the 80s.
Data warehousing provides more data and better data… so there are more cubes, more dimensions, more reports… and hopefully more business intelligence. But these reports provide 1980s-quality business intelligence on a screen instead of on paper… bounded by the OLAP cube.
When you hear folks talk about data science and data mining and advanced analytics and optimization… they are talking about advanced mathematical treatment of the data. Know that this is going to require technology beyond the capabilities of an OLAP engine.
Exalytics is an OLAP engine. Here are some Exalytics use cases from a proponent. They are about OLAP dashboards… good stuff… but hardly advanced analytics. Oracle says that Exalytics is engineered for Extreme Analytics. If we agree that "extreme" analytics is not in any way advanced… then I agree.
One more surprise…
In the past SAP applications have, in general, avoided using database features. Even a SELECT with a projection was out-of-bounds. They did not want to depend on any database, so they tended to pull all data from the data layer to the application layer and loop through the data using procedural languages like ABAP. You might say that they were religiously database agnostic. My mistake… you might say that we were religiously database agnostic. I have to get used to these new surroundings.
Besides the obvious attributes of HANA: in-memory, shared-nothing, MPP, and column-oriented… the aim is to move the application logic next to the data and into HANA.
Any of you who have labored to convert procedural code into set-based SQL will understand the issue here. There are hundreds of thousands or millions of lines of procedural code… often very simple loops… that have to be converted to SQL to make the HANA architecture support the SAP application portfolio.
The surprise is not that there is this outstanding issue… nor is it the ambitious architecture designed to push the application deep into the database (we are not talking about SQL-based stored procedures… we are talking about the application). The surprise is that the HANA development team has built a state-of-the-art facility that programmatically converts procedural logic into its set-based equivalent (not necessarily into SQL, but sometimes into a language that can execute in parallel). This is not a tool requiring manual intervention… it is an automatic, mathematically provable transformation.
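To show what such a conversion amounts to in miniature, here is an invented example… not SAP's facility: a procedural loop and its set-based equivalent that compute the same answer, where the set-based form hands the whole operation to the engine.

```python
# Procedural vs. set-based, in miniature. Tables and values are made up.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("acme", 10.0), ("acme", 5.0), ("globex", 7.5)])

# procedural style: pull every row to the application and loop (ABAP-ish)
totals = {}
for customer, amount in con.execute("SELECT customer, amount FROM orders"):
    totals[customer] = totals.get(customer, 0.0) + amount

# set-based equivalent: one statement, executed next to the data,
# in parallel if the engine supports it
set_based = dict(con.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"))

assert totals == set_based
```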
Right now the technique is used to convert logic in stored procedures and in ABAP. But I hope to see it applied in the optimizer to convert those ugly Oracle cursor loops on-the-fly.
You can read more here.
By the way… SAP will continue to support ABAP using the database as a file server… moving all of the data from the database server to the application server for processing. But you can imagine that… when running applications that use this powerful capability… over time HANA will emerge with a huge performance advantage over other databases…
Oracle should be worried.
I have promised not to promote HANA heavily on this site… and I will keep that promise. But I want to share something with you about the HANA architecture that is not part of the normal marketing in-memory database (IMDB) message: HANA is parallel from its foundation.
What I mean by that is that when a query is executed in-memory HANA dynamically shards the data in-memory and lets each core start a thread to work on its shard.
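Here is a crude sketch of the idea in Python… an illustration of parallel-from-the-foundation execution, not HANA internals: shard the column, aggregate each shard on its own core, then combine the partials.

```python
# Parallel partial aggregation: one worker per shard, then combine.
from multiprocessing import Pool
import os

def partial_sum(shard):
    return sum(shard)                  # each core aggregates its own shard

if __name__ == "__main__":
    column = list(range(1_000_000))    # an in-memory "column"
    n = os.cpu_count() or 4
    step = (len(column) + n - 1) // n
    shards = [column[i:i + step] for i in range(0, len(column), step)]

    with Pool(n) as pool:
        partials = pool.map(partial_sum, shards)   # one worker per shard

    print(sum(partials))               # combine: same answer as a serial sum
```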
Other shared-nothing implementations like Teradata and Greenplum, which are not built on a native parallel architecture, start multiple instances of the database to take advantage of multiple cores. If they can start an instance per core then they approximate the parallelism of a native implementation… at the cost of inter-instance communication. Oracle, to my knowledge, does not parallelize steps within a single instance… but I could be wrong there, so I'll ask my readers to help.
As you would expect, for analytics and complex queries this architecture provides a distinct advantage. HANA customers are optimizing price models sub-second, in real time, with each quote… instead of executing a once-a-week, 12-hour modeling job.
As you would expect, HANA cannot yet stretch into the petabyte range. The current HANA sweet spot for warehouses or marts is in the sub-TB to 20TB range.