The Greenplum ORCA Optimizer

In January Greenplum rolled out a new query optimizer. This is very cool and very advanced stuff. Query optimization is a search problem… in a perfect world you would search through the space of all possible plans for any query and choose the least expensive plan. But the time required to iterate through all possible…

HANA, BLU, Hekaton, and Oracle 12c vs. Teradata and Greenplum – November 2013

I would like to point out a very important section in the paper on Hekaton on the Microsoft Research site here. I will quote the section in full:

2. DESIGN CONSIDERATIONS

An analysis done early on in the project drove home the fact that a 10-100X throughput improvement cannot be achieved by optimizing existing SQL…

Who is How Columnar? Exadata, Teradata, and HANA – Part 2: Column Processing

In my last post here I suggested that there were three levels of maturity around column orientation and described the first level, PAX, which provides columnar compression. This apparently is the level Exadata operates at with its Hybrid Columnar Compression. In this post we will consider the next two levels of maturity: early materialized column…

Who is How Columnar? Exadata, Teradata, and HANA – Part 1: Column Compression

There are three forms of column orientation deployed by database systems today. Each builds upon the previous one. The simplest form uses column orientation to provide better data compression. The next level of maturity stores columnar data in separate structures to support columnar projection. The most mature implementations support a columnar database engine that performs relational algebra…

The Fog is Getting Thicker…

I renamed this so that Teradata folks would not get here so often… it's not really about Intelligent Memory… just prompted by it. The post on Intelligent Memory is here. – Rob

Two quick comments on Teradata’s recent announcement of Intelligent Memory. First… very, very cool. More on this to come. Next… life is going…

Some Unaudited HANA Performance Numbers

The following performance numbers are being reported publicly for HANA:

- HANA scans data at 3MB/msec/core. On a high-end 80-core server this translates to 240GB/sec per node.
- HANA inserts rows at 1.5M records/sec/core, or 120M records/sec per node…
- HANA aggregates 12M records/sec/core, or 960M records/sec per node…

These numbers seem reasonable: A 100X improvement over disk-based scan…
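
The per-node figures above are the per-core rates multiplied by the core count. Here is a minimal sketch of that arithmetic, assuming the 80-core node mentioned in the text, decimal units (1 GB = 1000 MB), and perfectly linear scaling across cores. The function and variable names are illustrative only and do not come from the original post.

```python
# Core count of the "high-end" server quoted in the post.
CORES_PER_NODE = 80


def per_node(rate_per_core, cores=CORES_PER_NODE):
    """Scale a per-core rate to a whole node (assumes linear scaling)."""
    return rate_per_core * cores


# 3 MB/msec/core -> 3000 MB/sec/core -> 3 GB/sec/core (decimal units assumed).
scan_gb_per_sec_per_core = 3 * 1000 / 1000
scan_gb_per_sec = per_node(scan_gb_per_sec_per_core)

# Insert and aggregate rates are already per second, in millions of records.
insert_m_records_per_sec = per_node(1.5)
aggregate_m_records_per_sec = per_node(12)

print(f"scan:      {scan_gb_per_sec:g} GB/sec per node")
print(f"insert:    {insert_m_records_per_sec:g}M records/sec per node")
print(f"aggregate: {aggregate_m_records_per_sec:g}M records/sec per node")
```

Real systems rarely scale perfectly with core count, so these products are best read as upper bounds implied by the per-core figures.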