incubator-general mailing list archives

From Srihari Srinivasan <>
Subject Request for advise, collaboration
Date Fri, 15 Jul 2016 13:49:33 GMT
Hi Folks,
I am Hari, a developer with a company called ThoughtWorks. We've been developing data pipelines
on Hadoop, Spark, etc. for a while now. From our experience with different customers, we've
noticed a recurring need to carry out tasks such as data preparation, data anonymization, etc.
on large datasets using Java MR and Spark. Based on this experience, we have been working on
building a couple of libraries targeted at data preparation and data protection to begin with.
They are hosted under an umbrella project called Data Commons at the moment (inspired by the
Apache Commons project, which is organized around a similar theme).
At the moment this is a fledgling project, and its contributions are driven by our data team.
However, we are very keen on making this part of the larger Apache collective and making it
a community-driven effort.
Hence, I am reaching out to you folks for advice on the best way forward for
this effort. We are also open to exploring collaborations with other existing projects that
are already part of Apache. Please share your thoughts and advice.
-- Hari
