Who is Massively Parallel? HANA vs. Teradata and (maybe) Oracle
I have promised not to promote HANA heavily on this site… and I will keep that promise. But I want to share something with you about the HANA architecture that is not part of the normal marketing in-memory database (IMDB) message: HANA is parallel from its foundation.
What I mean by that is that when a query is executed in-memory HANA dynamically shards the data in-memory and lets each core start a thread to work on its shard.
Other shared-nothing implementations like Teradata and Greenplum, which are not built on a native parallel architecture, start multiple instances of the database to take advantage of multiple cores. If they can start an instance-per-core then they approximate the parallelism of a native implementation… at the cost of inter-instance communication. Oracle, to my knowledge, does not parallelize steps within a single instance… I could be wrong there so I’ll ask my readers to help?
As you would expect, for analytics and complex queries this architecture provides a distinct advantage. HANA customers are optimizing price models sub-second in-real-time with each quote instead of executing a once-a-week 12-hour modeling job.
As you would expect HANA cannot yet stretch into the petabyte range. The current HANA sweet spot is for warehouses or marts is in the sub-TB to 20TB range.