incubator-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mohammad Nour El-Din <>
Subject Re: [PROPOSAL] Propose Howl as an Apache Incubator project
Date Sun, 13 Feb 2011 17:57:54 GMT
Good catch, but allow me to disagree with you. Howl here is a name,
while OW2 HOWL is an acronym for "High-speed ObjectWeb Logger" and on
[1] it is all written in capital letters.

[1] -

On Sun, Feb 13, 2011 at 5:52 PM, Brian McCallister <> wrote:
> The proposal looks fine, but the name collides with
> -Brian
> On Thu, Feb 10, 2011 at 1:37 PM, Alan Gates <> wrote:
>> I would like to propose Howl as an Apache Incubator project.  Howl is a
>> table and storage management service for data created using Apache Hadoop.
>>  The proposal is on the Incubator wiki at
>> and is pasted below.  Thanks.
>> Alan.
>> == Abstract ==
>> Howl is a table and storage management service for data created using Apache
>> Hadoop.
>> == Proposal ==
>> The vision of Howl is to provide table management and storage management
>> layers for Apache Hadoop.  This includes:
>>  * Providing a shared schema and data type mechanism.
>>  * Providing a table abstraction so that users need not be concerned with
>> where or how their data is stored.
>>  * Providing interoperability across data processing tools such as Pig, Map
>> Reduce, Streaming, and Hive.
>> == Background ==
>> Data processors using Apache Hadoop have a common need for table management
>> services.  The goal of a table management service is to track data that
>> exists in a Hadoop grid and present that data to users in a tabular format.
>>  Such a table management service needs to provide a single input and output
>> format to users so that individual users need not be concerned with the
>> storage formats that are chosen for particular data sets.  As part of having
>> a single format, the data will need to be described by one type of schema
>> and have a single datatype system.
>> Additionally, users should be free to choose the best tools for their use
>> cases.  The Hadoop project includes Map Reduce, Streaming, Pig, and Hive,
>> and additional tools exist such as Cascading.  Each of these tools has users
>> who prefer it, and there are use cases best addressed by each of these
>> tools.  Two users on the same grid who need to share data should not be
>> constrained to use the same tool but rather should be free to choose the
>> best tool for their use case.  A table management service that presents data
>> in the same way to all of the tools can alleviate this problem by providing
>> interfaces to each of the data processing tools.
>> There are also a few other features a table management service should
>> provide, such as notification of when data arrives.
>> A couple of developers at Yahoo! started the project. It is based on the
>> Hive !MetaStore component. There is good amount of interest in such a
>> service expressed from Yahoo!, Facebook, !LinkedIn, and, others. We are
>> therefore proposing to place Howl in the Apache incubator and to build an
>> open source community around it.
>> == Rationale ==
>> There is a strong need for a table management service, especially for large
>> grids with petabytes of data, and where the data volume is increasing by the
>> day. Hadoop users need to find data to read and have a place to store their
>> data.  Currently users must understand the location of data to read, the
>> storage format, compression techniques used, etc.  To write data they need
>> to understand where on HDFS their data belongs, the best compression format
>> to use, how their data should be serialized, etc.
>> Most users do not want to be concerned with these issues.  They want these
>> managed for them.
>> Having it as an Apache Open Source project will highly benefit Howl from the
>> point of view of getting a large community that currently uses Hadoop and
>> the other products built around Hadoop (like Pig, Hive, etc.). Users of the
>> Hadoop ecosystem can influence Howl’s roadmap, and contribute to it. Looking
>> at it in another way, we believe having Howl as part of the Hadoop ecosystem
>> will be a great benefit to the current Hadoop/Pig/Hive community too.
>> == Current Status ==
>> === Meritocracy ===
>> Our intent with this incubator proposal is to start building a diverse
>> developer community around Howl following the Apache meritocracy model. We
>> have wanted to make the project open source and encourage contributors from
>> multiple organizations from the start. We plan to provide plenty of support
>> to new developers and to quickly recruit those who make solid contributions
>> to committer status.
>> === Community ===
>> Howl is currently being used by developers at Yahoo! and there has been an
>> expressed interest from !LinkedIn and Facebook. Yahoo! also plans to deploy
>> the current version of Howl in production soon. We hope to extend the user
>> and developer base further in the future. The current developers and users
>> are all interested in building a solid open source community around Howl.
>> To work towards an open source community, we have started using the !GitHub
>> issue tracker and mailing lists at Yahoo! for development discussions within
>> our group.
>> === Core Developers ===
>> Howl is currently being developed by four engineers from Yahoo! - Devaraj
>> Das, Ashutosh Chauhan, Sushanth Sowmyan, and Mac Yang. All the engineers
>> have deep expertise in Hadoop and the Hadoop Ecosystem in general.
>> === Alignment ===
>> The ASF is a natural host for Howl given that it is already the home of
>> Hadoop, Pig, HBase, Cassandra, and other emerging cloud software projects.
>> Howl was designed to support Hadoop from the beginning in order to solve
>> data management challenges in Hadoop clusters. Howl complements the existing
>> Apache cloud computing projects by providing a unified way to manage data.
>> == Known Risks ==
>> === Orphaned Products ===
>> The core developers plan to work full time on the project. There is very
>> little risk of Howl getting orphaned since large companies like Yahoo! are
>> planning to deploy this in their production Hadoop clusters. We believe we
>> can build an active developer community around Howl (companies like Facebook
>> and !LinkedIn have also expressed interest).
>> === Inexperience with Open Source ===
>> All of the core developers are active users and followers of open source.
>> Devaraj Das is an Apache Hadoop committer and Apache Hadoop PMC member, and
>> has experience with the Apache infrastructure and development process.
>> Ashutosh Chauhan is an Apache Pig committer and Apache Pig PMC member.
>>  Sushanth Sowmyan and Mac Yang made contributions to the Apache Hive and the
>> Apache Chukwa projects.
>> === Homogeneous Developers ===
>> The current core developers are all from Yahoo! However, we hope to
>> establish a developer community that includes contributors from several
>> corporations, and we are starting to work towards this with Facebook and
>> !LinkedIn.
>> === Reliance on Salaried Developers ===
>> Currently, the developers are paid to do work on Howl. However, once the
>> project has a community built around it, we expect to get committers and
>> developers from outside the current core developers. Companies like Yahoo!
>> are invested in Howl being a solution to the data management problem in
>> Hadoop clusters, and that is not likely to change.
>> === Relationships with Other Apache Products ===
>> Howl is going to be used by users of Hadoop, Pig, and Hive. See section
>> Initial Source below for more information about Howl's relationship to Hive.
>> === An Excessive Fascination with the Apache Brand ===
>> While we respect the reputation of the Apache brand and have no doubts that
>> it will attract contributors and users, our interest is primarily to give
>> Howl a solid home as an open source project following an established
>> development model. We have also given reasons in the Rationale and Alignment
>> sections.
>> == Documentation ==
>> Information about Howl can be found at The
>> following sources may be useful to start with:
>>  * The !GitHub site:
>>  * The roadmap:
>> == Initial Source ==
>> Howl has been under development since Summer 2010 by a team of engineers in
>> Yahoo!.  It is currently hosted on !GitHub under an Apache license at
>> The initial development of Howl has consisted of:
>>  * maintaining a branch of the entire Hive codebase
>>  * getting Howl-related patches committed to Hive
>>  * developing Howl-specific plugins and wrappers to customize Hive behavior
>> At runtime, Howl executes Hive code for metastore and CLI+DDL, disabling
>> anything related to Hadoop map/reduce execution.  It also makes use of the
>> RCFile storage format contained in Hive.
>> This approach was taken as a first step in order to validate the required
>> functionality and get a production version working.  However, in the
>> long-term, maintaining a clone of Hive is undesirable.  One possible
>> resolution is to factor the metastore+CLI+DDL components out of Hive and
>> move them into Howl (making Hive dependent on Howl).  Another possible
>> resolution is to remove the copy of Hive from Howl and do the build/release
>> engineering necessary to make Howl depend on Hive.  As part of the
>> incubation process, we plan to work towards resolution of these issues.
>> == External Dependencies ==
>> The dependencies all have Apache compatible licenses.
>> == Cryptography ==
>> Not applicable.
>> == Required Resources ==
>> === Mailing Lists ===
>>  * howl-private for private PMC discussions (with moderated subscriptions)
>>  * howl-dev
>>  * howl-commits
>>  * howl-user
>> === Subversion Directory ===
>> === Issue Tracking ===
>> JIRA Howl (HOWL)
>> === Other Resources ===
>> The existing code already has unit tests, so we would like a Hudson instance
>> to run them whenever a new patch is submitted. This can be added after
>> project creation.
>> == Initial Committers ==
>>  * Devaraj Das
>>  * Ashutosh Chauhan
>>  * Sushanth Sowmyan
>>  * Mac Yang
>>  * Paul Yang
>>  * Alan Gates
>> A CLA is already on file for Sushanth.
>> == Affiliations ==
>>  * Devaraj Das (Yahoo!)
>>  * Ashutosh Chauhan (Yahoo!)
>>  * Sushanth Sowmyan (Yahoo!)
>>  * Mac Yang (Yahoo!)
>>  * Paul Yang (Facebook)
>>  * Alan Gates (Yahoo!)
>> == Sponsors ==
>> === Champion ===
>> Owen O’Malley
>> === Nominated Mentors ===
>>  * Olga Natkovich (Pig PMC member and Apache VP for Pig)
>>  * Alan Gates (Pig PMC member)
>>  * John Sichi (Hive PMC member)
>> === Sponsoring Entity ===
>> We are requesting the Incubator to sponsor this project.
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

- Mohammad Nour
  Author of (WebSphere Application Server Community Edition 2.0 User Guide)
- LinkedIn:
- Blog:
"Life is like riding a bicycle. To keep your balance you must keep moving"
- Albert Einstein

"Writing clean code is what you must do in order to call yourself a
professional. There is no reasonable excuse for doing anything less
than your best."
- Clean Code: A Handbook of Agile Software Craftsmanship

"Stay hungry, stay foolish."
- Steve Jobs

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message