In many of my posts I refer to the issues associated with building “extra” data structures to meet performance goals (see one of my first posts ever here). These extra structures are always a trade-off… slowing the performance of one function in order to speed up another. I thought that it might be helpful to be very clear about where I stand on this.
Indexes improve the performance of queries that address a small set of data. They also can improve join performance if your favorite optimizer can apply an index intersection to the execution plan for your queries. Indexes dramatically slow the performance of inserts, updates, and bulk data loads as they have to be maintained when data changes. You can mitigate the cost and update indexes in the background… the trade-off does not go away. Indexes are probably required for OLTP applications that pick out single rows.
Wouldn’t it be great if your favorite DBMS could resolve every query very fast without the overhead and operational effort associated with maintaining indexes? Certainly we should aspire to a read-optimized database, a data warehouse DBMS, that does not require indexes.
Vertica projections provide an optimized, materialized, view that improves the performance for a set of queries. The Vertica optimizer automatically selects the optimal projection. Vertica provides a very slick tool that builds projections based on the query set provided. I worded my post on Vertica a little vague… so let me be sure here to point out that every Vertica query runs against a projection… so it is possible to have only one. In this case there is no additional overhead. Adding projections slows the data load process and increases the storage requirements. This is the trade-off.
Other databases offer materialized views. They make the same trade-off as above.
An OLAP cube is a physical structure that pre-aggregates data so that your query workload can avoid the aggregation. The best implementations of this express the cube as a materialized view so that queries can use the pre-aggregated data without explicitly pointing at a cube structure… the optimizer picks it for you. In addition the best implementations let you drill out of the cube to the detail records. These products have the update/delete/load issues of an index plus add an extra data latency issue as the data has to be aggregated on some interval… usually hours or days. Many products do not allow joins from a cube. You can see the trade-off. The Oracle Exalytics product materializes the aggregated cube on a separate server in-memory. This provides even more performance but adds the system and operational overhead of moving data across system boundaries.
Wouldn’t it be nice if you could query raw data and perform aggregation so fast that even against terabytes of data you could run any query with 3 second or less response without the overhead of building cubes?
You may build specialized table structures and pre-join, pre-aggregate, or pre-compute data to make a set of queries run fast. The cost of building and maintaining this sort of implementation versus just querying the base tables is the trade-off. Further, this approach is sort of a trap. You cannot build these structures for every query… if you did the business would conceive another critical query the next day that required work.
You can add indexes to the structures built using the technique above and provide very fast application-specific performance to a small set of queries. This is currently the favored approach when companies build iOS or Android apps as it provides the best possible performance… at a significant price.
Wouldn’t it be great if this was unnecessary… you could just scan so fast that mobile response service levels could be met from the base data regardless of the query.
You can deploy redundant data in operational data stores, data marts, cube servers, analytic data stores, and so on… with each specialized store providing performance for some limited set of queries at the cost of development and support ongoing. Each of these copies could deploy specialized database products that speed up that set of queries a little more. Again, this surround-the-EDW approach is a trap that leads to the proliferation of data marts and of database technologies.
Please do not take that last paragraph the wrong way… I believe that the worst possible approach is to blindly standardize on one or two database products. This trade-off makes life convenient for the IT department at the expense of performance and agility in the business. It is OK to have one or two favored products but IT must always serve the business to the best of their ability as a first priority… and sometime the new start-up has just the thing (remember that once Teradata was a start-up and DB2 on the mainframe was the IT standard…).
What I wish was that one or two products could solve all of the performance and functionality problems without the cost of building “extra” stuff… one product would be better that two. I like products that make the extra stuff “free”. Netezza does a nice job of making zone maps “free”, for example. Teradata and Greenplum provide the option of row store or column store for “free”. Vertica automatically build extra projections for “cheap”… and while there is a cost to the projection it at least does not require staff to tune it up. Oracle materialized views are “cheap”.
What I dislike are products that require DBAs to work harder and harder to apply all of the techniques above to meet performance SLAs. Each of these techniques trades off performance for development and operational expense.
As I have noted before… the performance SLAs for BI are about to become severe as companies try to support BI on mobile devices. The development and operational costs of tuning up; that is the TCO; will be significant unless better, faster, software infrastructure becomes available.
The TCO for a database that could eliminate these extra constructs and could eliminate the cost of developing and maintaining them; and could eliminate the architectural fragility these approaches imply… and replace this with a DBMS that holds base data which could satisfy all queries in seconds; delivering the business agility this implies… the TCO would be compelling.
I actually believe that the answer is available in the market today… this is no longer a pipe dream… more later…
This is a rehash of my post for SAP here… I thought you might find it interesting as it describes the architecture HANA uses to support OLTP and BI against a single table.
A couple of points to think about:
- If you have only one database structure you can optimize for only one query; e.g. the OLTP query is fast against a OLTP structure but slow against a BI structure… or visa versa.
- If you have two structures you have to ETL the data between the two at some cost. There is cost in keeping a replica of the data, cost in developing, administering, and executing the ETL process. In addition there is a lost opportunity cost hidden in the latency of the data. You cannot see the current state of the business by querying the BI data as some data has not yet been ETL’d across.
- OLTP performance is normally paramount; so the perfect system would not compromise that performance or compromise it only a little.
Let’s look at the HANA approach to this at a high level.
HANA provides a single view of a table to an application or a user, but under-the-covers each table includes a OLTP optimized part, a BI optimized part, and a mechanism for moving data from one part to the other
When a transaction hits the system; inserts, updates, and deletes are processed in the OLTP part with no performance penalty. The read portion of the OLTP query accesses the read-optimized internal structure with no performance penalty. Note that reading a single column in a column store, which is the key for the transaction, is roughly equivalent to reading an index structure on top of a standard disk-based DBMS. Except the column is always in-memory which means I/O is never required. This provides the HANA system with an advantage over a disk-based system. Disk I/O is 120+ times slower than memory access so even an index is unlikely to beat in-memory. See here for some numbers you should know.
After the transaction is committed into the internal, OLTP-optimized part, a process starts that moves the data to the BI optimized part. This is called a delta merge as the OLTP portion holds all of the changes, the delta, in the data set.
When a BI query starts it can limit the scan to only partitions in the BI optimized part, or if real-time data is required it can scan both parts. The small portion of the scan that accesses the OLTP/delta portion is sub-optimal when compared to the scan of the BI part, but not slow at all as the data is all in-memory.
We can tease the performance apart as follows:
- There is a OLTP insert/update/delete “write” portion… and HANA executes this like any OLTP database, as fast as an OLTP RDBMS, with a commit after a write-to-log;
- There is a OLTP select “read” portion… and HANA performs this in the in-memory column store faster than many OLTP databases… and scans the delta structure as fast as any OLTP database;
- There is a delta merge from the OLTP write-optimized part to the BI read-optimized column store that is hundreds to tens of thousands of times faster than any ETL tool; and
- There is a BI select portion that scans the in-memory column store hundreds to thousands of times faster than a disk-based BI database.
- If the BI query requires access to real-time data then an in-memory scan of the delta file is required… there is no analogy to this in a system with separate OLTP and BI tables.
First, you should look at Google’s Spanner paper here… this is the next-gen from Google and once it is embraced by the open source community it will put even more pressure on the big data DBMSs. Also have a look at YARN the next Map/Reduce… more pressure still…
Next… you can imagine that the conventional database folks will quibble a little with my analysis. Lets try to anticipate the push-back:
- Hadoop will never be as fast as a commercial DBMS
Maybe not… but if it is close then a little more hardware will make up the difference… and “free” is hard to beat in price/performance.
I do not think so… disk controllers, the overhead of non-memory I/O, and an inability to fully optimize processing for in-memory will make a big difference. I said 50X to be conservative… but it could be 200X… and a 200X performance improvement reduces the memory required to process a query by 200X… so it adds up.
- The Price of IMDB will always be prohibitive
Nope. The same memory that is in SSD’s will become available as primary memory soon and the price points for SSD-based and IMDB will converge.
- IMDB won’t scale to 100TB
HANA is already there… others will follow.
- Commercial customers will never give up their databases for open source
Economics means that you pay me now or you pay me later… companies will do what makes economic sense.
The original post on this is here…
Wikipedia defines computer memory as:
In computing, memory refers to the physical devices used to store programs (sequences of instructions) or data (e.g. program state information) on a temporary or permanent basis for use in a computer or other digital electronic device. The term primary memory is used for the information in physical systems which are fast (i.e. RAM), as a distinction from secondary memory, which are physical devices for program and data storage which are slow to access but offer higher memory capacity. Primary memory stored on secondary memory is called “virtual memory“.
The term “storage” is often (but not always) used in separate computers of traditional secondary memory such as tape, magnetic disks and optical discs (CD-ROM and DVD-ROM). The term “memory” is often (but not always) associated with addressable semiconductor memory, i.e. integrated circuits consisting of silicon-based transistors, used for example as primary memory but also other purposes in computers and other digital electronic devices.
To a computer program like a DBMS, memory is a resource allocated using commands like malloc() and calloc(). Note that these commands allocate primary memory using the definition above. From this you should conclude that an in-memory DBMS (IMDB) is a system that puts all of its data into memory allocated by the database program.
In their announcements this week Oracle states (here) that Exadata 3 is an in-memory database machine and Larry Ellison said. “Everything is in memory. All of your databases are in-memory. You virtually never use your disk drives. Disk drives are becoming passe. They’re good at storing images and a lot of data we don’t access very often.”
But their definition of in-memory includes SSD devices that are not directly addressable by the DBMS. In fact they use 22TB of SSDs and 4TB of DRAM. The SSDs are a cache sitting between the DBMS and disk storage. They are storage according to Wikipedia.
Exadata 3 is not an in-memory database machine. It takes more than lots of hardware to make a DBMS an in-memory DBMS.
Oracle is spewing marketing, not architecture.
David Linthicum suggests here that Shadow IT is not all a bad thing. He references a PricewaterhouseCoopers study that suggests that 30% of all IT spending comes from the business directly… from outside of the IT budget.
In the data warehouse space we can confirm these numbers easily. Just google on “data mart consolidation” to see the impact of the business building their own BI infrastructure in order to get around the time-consuming strictures and bureaucratic processes that IT imposes on a classic EDW platform. Readers… think of the term “data governance”… governance implies bureaucracy. And a “single version of the truth” implies a monopoly (governed by IT). We need a market for ideas to support our business intelligence… and a market is a little chaotic.
What we need is a place where IT says to the business… we cannot get you integrated into our formal EDW infrastructure as fast as you would like… but don’t go and build your own warehouse/mart on your own shadow platform. Let us provide you with a mart in the cloud. Take the data you need from our EDW. Enhance it as you see fit. We can spin up a server to house the mart in the cloud in a couple of hours. Let us help you. Use the tools you want… we think that it is cool that you are going to try out some new stuff… but if you want to use the tools we provide then you’ll get the benefit of our licensing deal and the benefit of our support… but you decide. We need IT to allow a little chaos…
This, I believe is what cloud offers to the data warehouse space…. the platform to respond.
But there is a rub… data warehouse appliances from Teradata, Exadata, and Netezza require bundled hardware that is not going to fit in your cloud. A shared-nothing architecture is a tough fit into the shared disk paradigm of the cloud (see here). The I/O reliance of a disk-based DBMS make performance tough on a shared disk platform. I think that for data marts and analytic sandboxes the cloud is the right choice… if you want to minimize the size of the shadow IT cast by lines of business. An in-memory database (IMDB): HANA, TimesTen, or SQLFire may be the best alternative for a small cloud-based mart.
David Linthicum has it right in spades for the data warehouse space… we need some user pull-through… and we need cloud computing as the platform to make these user-driven initiatives manageable.