incubator-general mailing list archives

From Jacky Li
Subject Re: [DISCUSS] CarbonData incubation proposal
Date Fri, 20 May 2016 06:06:56 GMT
Hi Julien Le Dem,

I am one of the developers of the CarbonData project. Thanks for pointing out
this issue. We are developing this new file format rapidly and still lack
proper documentation.

CarbonData's goal is a columnar file format that can serve a variety of query
scenarios, so by design it has some unique features such as a built-in
multi-level index, operable encoded data, column groups, etc. (Liang
pointed out some of them in his last post). But since it is a columnar
file format, it shares some common terminology with Apache Parquet and
Apache ORC, which I think is inevitable. To minimize confusion in the
future, we will improve our documentation.
Do you have any other suggestions?

For the file format specification, I have updated the wiki and the Thrift
definition to reflect the design of CarbonData. Please check whether there are
still any issues.

Jacky Li
