Aan de buitenkant zien OLAP-on-Hadoop-engines eruit als traditionele OLAP-tools die MDX-interfaces ondersteunen en, zoals gezegd, de gegevens in kubussen organiseren. Aan de binnenkant zijn ze echter zeer verschillend.
Apache Kylin™ is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets, original contributed from eBay Inc.
What is OLAP on Hadoop? Many companies have embraced a Data Lake strategy for modern data analytics. In this model data from operational systems is moved to Hadoop for long term storage, processing, and analytics. Hadoop’s rich ecosystem of tools and integrations addresses a wide range of workloads through a shared resource model.
Hadoop is a great platform for storing a lot of data, but running OLAP is usually done on smaller datasets in legacy and traditional proprietary platforms.
OLAP-on-Hadoop can offer a very low data latency even on big data. Data Storage Scalability: Because Hadoop is the storage system, the amount of data that can be stored for analytical purposes is
It doesn’t take a data scientist to do a basic extrapolation of recent events around Hadoop, and see that it’s going somewhere. “More and more types of workloads will be supported on top of Hadoop,” Cutting said.
Using OLAP on Hadoop to accelerate BI on Big Data In this special guest feature Ajay Anand, Vice President of Marketing at Kyvos Insights, gives his views on speed and interactivity of OLAP and the scalability and flexibility of Hadoop.
Tim Spann discusses OLAP options on the Hadoop stack:. Apache Kylin. For an introduction to this interesting Hadoop project, check out this article.. Apache Kylin originally from eBay, is a Distributed Analytics Engine that provides SQL and OLAP access to Hadoop datasets utilizing Hive and HBase. It can use called through SparkSQL as well making for a very useful project.