incubator-general mailing list archives

From "Mattmann, Chris A (3980)" <>
Subject Re: [PROPOSAL] Grill as new Incubator project
Date Fri, 19 Sep 2014 04:04:31 GMT
This sounds super cool!

How does this relate to SciDB? Is it trying to do a similar thing?


Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA

-----Original Message-----
From: Sharad Agarwal <>
Reply-To: "" <>,
"" <>
Date: Thursday, September 18, 2014 8:54 PM
To: "" <>
Subject: [PROPOSAL] Grill as new Incubator project

>Grill Proposal
># Abstract
>Grill is a platform that enables multi-dimensional queries in a unified way
>over datasets stored in multiple warehouses. Grill integrates Apache Hive
>with other data warehouses by tiering them together to form logical data
>cubes.
># Proposal
>Grill provides a unified Cube abstraction for data stored in different
>stores. Grill tiers multiple data warehouses for unified representation and
>efficient access. It provides a SQL-like Cube query language to query and
>describe data sets organized in data cubes. It enables users to run queries
>against Facts and Dimensions that can span multiple physical tables stored
>in different stores.
>The primary use cases that Grill aims to solve:
>- Facilitate analytical queries by providing an OLAP-like Cube abstraction
>- Data discovery by providing a single metadata layer for data stored in
>different stores
>- Unified access to data by integrating Hive with other traditional data
>warehouses
># Background
>Apache Hive is a data warehouse that facilitates querying and managing
>large datasets stored in distributed storage systems like HDFS. It provides
>a SQL-like language called HiveQL, aka HQL. Apache Hive is a widely used
>platform in various organizations for ad hoc analytical queries.
>In a typical data warehouse scenario, the data is multi-dimensional and
>organized into Facts and Dimensions to form Data Cubes. Grill provides a
>logical layer to enable querying and managing data as Cubes.
>The Grill project is actively being developed at InMobi to provide a
>higher level of analytical abstraction for seamlessly querying data stored
>in different storages, including Hive and beyond.
># Rationale
>The Grill project aims to ease analytical querying and cut
>data silos by providing a single view of data across multiple data
>warehouses.
>Conceiving data as a cube with hierarchical dimensions leads to
>conceptually straightforward operations to facilitate analysis. Integrating
>Apache Hive with other traditional warehouses provides the opportunity to
>optimize query execution cost by tiering the data across multiple
>warehouses. Grill provides:
>- Access to data Cubes via a Cube query language similar to HiveQL.
>- Driver-based architecture to allow for plugging in systems like Hive and
>other warehouses such as columnar data RDBMS.
>- Cost-based engine selection that provides optimal use of resources by
>selecting the best execution engine for a given query.
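
As a rough illustration of the driver-based, cost-based engine selection listed above, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Driver` class, its `can_run` and `estimate_cost` fields, the `select_driver` function, the table names, and the cost figures are hypothetical and are not taken from Grill's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Driver:
    """Hypothetical execution driver; not Grill's actual interface."""
    name: str
    can_run: Callable[[str], bool]         # does this engine support the query?
    estimate_cost: Callable[[str], float]  # lower is cheaper; units are arbitrary


def select_driver(query: str, drivers: List[Driver]) -> Optional[Driver]:
    """Return the cheapest driver able to run the query, or None if none can."""
    candidates = [d for d in drivers if d.can_run(query)]
    if not candidates:
        return None
    return min(candidates, key=lambda d: d.estimate_cost(query))


# Illustrative drivers with made-up cost models.
hive = Driver("hive", can_run=lambda q: True, estimate_cost=lambda q: 100.0)
columnar = Driver(
    "columnar",
    # Assumption for the example: raw event data is only available in Hive.
    can_run=lambda q: "raw_events" not in q,
    estimate_cost=lambda q: 10.0,
)

chosen = select_driver("SELECT city, SUM(clicks) FROM summary", [hive, columnar])
print(chosen.name)
```

In this sketch a query over summarized data goes to the cheaper columnar store, while a query that touches `raw_events` falls back to Hive because the columnar driver reports that it cannot run it.
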
>In a typical Data warehouse, data is organized in Cubes with multiple
>dimensions and measures. This facilitates the analysis by conceiving the
>data in terms of Facts and Dimensions instead of physical tables. Grill
>aims to provide this logical Cube abstraction on Data warehouses like Hive
>and other traditional warehouses.
># Initial Goals
>- Donate the Grill source code and documentation to the Apache Software
>Foundation
>- Build a user and developer community
>- Support Hive and other Columnar data warehouses
>- Support full query life cycle management
>- Add authentication for querying cubes
>- Provide detailed query statistics
># Long Term Goals
>Here are some longer-term capabilities that would be added to Grill:
>- Add authorization for managing and querying Cubes
>- Provide REST and CLI for full Admin controls
>- Capability to schedule queries
>- Query caching
>- Integrate with Apache Spark, creating Spark RDDs from Grill queries
>- Integrate with Apache Optiq
># Current Status
>The project is actively developed at InMobi. The first version was deployed
>at InMobi four months ago. This version allows querying dimension and fact
>data stored in Hive through a CLI. The source code and documentation are
>hosted at GitHub.
>## Meritocracy
>We intend to build a diverse developer and user community for the project
>following the Apache meritocracy model. We want to encourage contributors
>from multiple organizations, provide plenty of support to new developers
>and welcome them to be committers.
>## Community
>Currently the project is being developed at InMobi. We hope to extend our
>contributor and user base significantly in the future and build a solid
>open source community around Grill.
>## Core Developers
>Grill is currently being developed by Amareshwari Sriramadasu, Sharad
>Agarwal and Jaideep Dhok from InMobi, and Sreekanth Ramakrishnan who is
>currently employed by SoftwareAG. Raghavendra Singh from InMobi has built
>the QA automation for Grill.
>## Alignment
>The ASF is a natural home for Grill, as it is for Apache Hadoop, Apache
>Hive, Apache Spark and other emerging projects in the Big Data space.
>We believe that in any enterprise, multiple data warehouses will co-exist,
>as not all workloads are cost-effective to run on a single one. Apache Hive
>is one of the crucial data warehouses, along with upcoming projects like
>Apache Spark, in the Hadoop ecosystem. Grill will benefit from working
>closely with these projects.
>The traditional Columnar data warehouses complement Apache Hive as certain
>workloads continue to be cost effective to run in traditional columnar
>warehouses. Having multiple data warehouses leads to data silos that Grill
>aims to cut within the enterprise and provide holistic, unified access to
>data.
># Known Risks
>## Orphaned products & Reliance on Salaried Developers
>There is little risk of Grill getting orphaned, as Grill is a key part of
>the Data Platform stack at InMobi. The core Grill developers plan to work
>on it full-time. We think Grill will bring value in the Big Data space and
>we plan to grow the community of users and contributors.
>## Inexperience with Open Source
>All the core developers have long and significant experience in Apache
>projects and the Hadoop ecosystem. Amareshwari Sriramadasu has long-standing
>contributions to Apache Hadoop MapReduce and Apache Hive; she is a PMC
>member of Hadoop and a committer of Hive. Sharad Agarwal is a PMC member of
>Hadoop and has contributed to Hadoop YARN and Hadoop MapReduce. Srikanth
>Sundarrajan is a PMC member of Apache Falcon. Sreekanth Ramakrishnan is a
>committer of Apache Hadoop. Jaideep Dhok has contributed patches to
>Hive. Gunther Hagleitner is a PMC member of Apache Hive. Vikram Dixit is a
>committer of Apache Hive.
>## Homogeneous Developers
>The initial developers are employed by Hortonworks, InMobi and SoftwareAG.
>We are committed to recruiting additional committers from other companies
>based on their contribution to the project.
>## Reliance on Salaried Developers
>The majority of initial committers are paid by their employer to contribute
>to the project and a few are contributing in their spare time. Once the
>project has a community built, we are committed to recruiting committers and
>developers from outside the current core developers.
>## Relationships with Other Apache Products
>Grill is deeply integrated with other Apache projects. Grill uses and
>extends Apache Hive HCatalog to store and manage the Data cubes. It uses
>HDFS and Hive session management libraries. Grill has a driver-based
>architecture that allows for adding multiple execution drivers. Apart from
>integrating Apache Hive, it can be integrated with Apache Spark over Spark
>SQL or Shark, Apache Drill, Apache Tajo and Apache Phoenix.
>In the future we want to use Apache Optiq in Grill for query optimization
>and cost-based driver selection.
>## An Excessive Fascination with the Apache Brand
>The project was conceived from the beginning to be in line with the Apache
>philosophy. As the core developers have good experience with Apache, the
>source code organization, build, review and commit process are highly
>influenced by Apache. We believe that Apache will be a solid home for Grill
>to grow and build the open source community. We have also described the
>reasons in the Rationale and Alignment sections.
># Documentation
># Initial Source
>The source is currently in github repository at:
># Source and Intellectual Property Submission Plan
>The complete Grill code is already under Apache Software License 2.
># External Dependencies
>The dependencies all have Apache compatible licenses. These include Apache
>2.0, BSD, MIT, EPL and CDDL licensed dependencies.
># Cryptography
># Required Resources
>## Mailing lists
>grill-dev AT incubator DOT apache DOT org
>grill-commits AT incubator DOT apache DOT org
>grill-private AT incubator DOT apache DOT org
>## Subversion Directory
>Git is the preferred source control system: git://
>## Issue Tracking
># Initial Committers
>Amareshwari Sriramadasu (amareshwari AT apache DOT org)
>Gunther Hagleitner (gunther AT apache DOT org)
>Jaideep Dhok (jaideep.dhok AT Inmobi DOT com)
>Raghavendra Singh (raghavendra.singh AT Inmobi DOT com)
>Sharad Agarwal (sharad AT apache DOT org)
>Sreekanth Ramakrishnan (sreekanth AT apache DOT org)
>Srikanth Sundarrajan (sriksun AT apache DOT org)
>Suma Shivaprasad (suma.shivaprasad AT Inmobi DOT com)
>Vikram Dixit (vikram AT apache DOT org)
># Affiliations
>Amareshwari SR (InMobi)
>Gunther Hagleitner (Hortonworks)
>Jaideep Dhok (InMobi)
>Raghavendra Singh (InMobi)
>Sharad Agarwal (InMobi)
>Sreekanth Ramakrishnan (SoftwareAG)
>Srikanth Sundarrajan (InMobi)
>Suma Shivaprasad (InMobi)
>Vikram Dixit (Hortonworks)
># Sponsors
>## Champion
>Vinod K <vinodkv AT apache DOT org> (Apache Member)
>## Nominated Mentors
>Chris Douglas (Microsoft)
>Jacob Homan (Microsoft)
>Jean Baptiste Onofre (Talend)
>Vinod K (Hortonworks)
>## Sponsoring Entity
>Incubator PMC
