incubator-general mailing list archives

From Kasper Sørensen
Subject RE: Potential proposal + request volunteers for champion & mentor roles - MetaModel
Date Mon, 28 Jan 2013 13:12:50 GMT
Hi Sergio,

I am not sure how to characterize the comparison with RDF / Jena / Marmotta, since I actually
didn't know about these topics before now. Linked data might be a relevant topic for MetaModel
as well, but in its current form MetaModel is very much focused on a tabular view of data, resembling
that of RDBMSs, CSV/spreadsheet files etc. It also adapts other datastores to a similar
tabular view so that widely used SQL techniques and wisdom can be applied to them. Not
because we love SQL (MetaModel also supports several NoSQL databases), but because we needed
a "common metaphor" for all data, and data tables were found to be the easiest metaphor for
users to comprehend.

We've looked at DataNucleus a few times as well. They do a good job of implementing the various
object mapping protocols (JDO/JPA etc.) but less so (to my understanding) of exposing data
in a dynamic model. As such I see them as an ORM library, whereas we are metadata-driven.
The metadata that drives MetaModel comes from the datastore itself, not from the application
code. So if you add a column to your database, or a sheet to your spreadsheet, it is automatically
available in MetaModel as well. In an ORM you would have to define a new entity type etc., which is
what we needed to get rid of, because the kind of application that typically runs on MetaModel
is focused on the data itself, not any specific domain. Hope this makes some sort of sense.
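
To make the metadata-driven point a bit more concrete, here is a minimal sketch in plain Java. It is not MetaModel's actual API: the class TabularSketch, the Table type and its fromCsv/select methods are hypothetical names made up for illustration. It only shows the idea that the column list is read from the data (here a CSV header) instead of being declared in application code, and that a simple SQL-like selection can then be run against that tabular view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TabularSketch {

    /** Hypothetical table type: column names are discovered from the data, not declared in code. */
    static final class Table {
        private final List<String> columns;
        private final List<List<String>> rows;

        private Table(List<String> columns, List<List<String>> rows) {
            this.columns = columns;
            this.rows = rows;
        }

        /** Treats the first CSV line as metadata (column names); every following line is a row. */
        static Table fromCsv(String csv) {
            String[] lines = csv.trim().split("\\R");
            List<String> columns = Arrays.asList(lines[0].split(","));
            List<List<String>> rows = new ArrayList<>();
            for (int i = 1; i < lines.length; i++) {
                rows.add(Arrays.asList(lines[i].split(",", -1)));
            }
            return new Table(columns, rows);
        }

        List<String> columns() {
            return columns;
        }

        /** Roughly: SELECT column FROM this table WHERE whereColumn = value. */
        List<String> select(String column, String whereColumn, String value) {
            int selectIndex = indexOf(column);
            int whereIndex = indexOf(whereColumn);
            List<String> result = new ArrayList<>();
            for (List<String> row : rows) {
                if (row.get(whereIndex).equals(value)) {
                    result.add(row.get(selectIndex));
                }
            }
            return result;
        }

        private int indexOf(String column) {
            int index = columns.indexOf(column);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown column: " + column);
            }
            return index;
        }
    }

    public static void main(String[] args) {
        // The same code handles both files; the extra "city" column needs no new entity type.
        Table before = Table.fromCsv("id,name\n1,Ann\n2,Bob\n");
        Table after = Table.fromCsv("id,name,city\n1,Ann,Oslo\n2,Bob,Rome\n");
        System.out.println(before.columns());
        System.out.println(after.columns());
        System.out.println(after.select("name", "city", "Rome"));
    }
}
```

Adding the "city" column to the data is enough for it to show up in columns() and to be usable in select(); nothing in the application code had to change, which is the contrast with an ORM entity class described above.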

Best regards,

-----Original Message-----
From: Sergio Fernández
Sent: 28. januar 2013 12:21
Cc: Kasper Sørensen
Subject: Re: Potential proposal + request volunteers for champion & mentor roles - MetaModel

Hi Kasper,

the proposal looks quite interesting, but I have two questions:

1) How is this related to the emerging work at the ASF around RDF (mainly Jena and Marmotta)?
I see some clear lines of cooperation there.

2) How is this proposal related to other well-established open source tools, such as DataNucleus
AccessPlatform?


On 28/01/13 11:07, Kasper Sørensen wrote:
> Dear all,
> After some time spent (yeah, sorry quite a lot more time than we had hoped), we finally
now have the green light to officially post a proposal to the Incubator to take on MetaModel
as a project. Since it's been a while, I wish to call out again for sponsors. We found a few
some months ago, but we were out of contact for so long that I feel it's only fair to ask again.
> Please find below the draft proposal which we would love to further work on together
with sponsors before finally submitting it. (I tried posting it on the wiki but was not authorized,
probably because my account is a new one).
> Best regards,
> Kasper Sørensen
> -------------------------------------------------
> MetaModel - uniform data access across datastores
> Proposal for Apache Incubator
> -----------------------------
> Abstract
> MetaModel is a data access framework, providing a common interface for exploration and
querying of different types of datastores.
> --------
> Proposal
> MetaModel provides a uniform meta-model for exploring and querying the structure of datastores,
covering but not limited to relational databases. The scope of the project is to stay domain-agnostic,
so the meta-model will be concerned with schemas, tables, columns, rows, relationships etc.
> On top of this meta-model a rich querying API is provided which resembles SQL, but is built
using compiler-checked Java language constructs. For datastores that do not have a native
SQL-compatible query engine, the MetaModel project also includes an abstract Java-based query
engine implementation which individual datastore-modules can adapt to fit the concrete datastore.
> Background
> The MetaModel project was initially developed by to service the DataCleaner
application. The main requirement was to perform data querying and modification operations
on a wide range of quite different datastores. Furthermore a programmatic query model was
needed in order to allow different components to influence the query plan.
> In 2009, Human Inference acquired the eobjects projects including MetaModel. Since then
MetaModel has been put to extensive use in the Human Inference products. The open source nature
of the project was reinforced, leading to a significant growth in the community.
> MetaModel has successfully been used in a number of other open source projects as well
as mission critical commercial software from Human Inference.
> Rationale
> Different types of datastores have different characteristics, which always lead to the
interfaces for these being different from one another. Standards like JDBC and the SQL language
attempt to standardize data access, but for some datastore types like flat files, spreadsheets,
NoSQL databases and more, such standards are not even implementable.
> Specialization in interfaces obviously has merit for optimized usage, but for integration
tools, batch applications and/or generic data modification tools, this myriad of specialized
interfaces is a big pain. Furthermore, being able to query every datastore with a basic set
of SQL-like features can be a great productivity boost for a wide range of applications.
> Initial goals
> MetaModel is already a stable project, so initial goals are more oriented towards an
adaptation to the Apache ecosystem than towards functional changes. We are constantly adding more
datastore types to the portfolio (currently a web service consumer datastore
is being tested), but the core modules have not had drastic changes for some time. Our focus
will be on making ties with other Apache projects (such as POI, Gora and CouchDB) and potentially
renaming the 'MetaModel' project to something more memorable.
> --------------
> Current status
> Meritocracy
> We intend to do everything we can to encourage a meritocracy in the development of MetaModel.
Currently most important development and design decisions have been made at Human Inference,
but with an open window for anyone to participate on mailing lists and discussion forums.
We believe that the approach going forward should be more encouraging by sharing all the design
ideas and discussions in the open, not just the topics that have been "dragged" into
the open by third parties. We believe that meritocracy will be further stimulated by granting
the control of the project to an independent committee.
> Community
> The community around MetaModel already exists, but we believe it will grow substantially
by becoming an Apache project. With MetaModel used in a wide range of both open and closed
source applications, at Human Inference (HIquality MDM), in its open source projects DataCleaner,
SassyReader and AnalyzerBeans, and by other parties (such as the Quipo data warehouse automation
project), we believe that the critical mass to sustain a community is there.
> Core developers
> MetaModel was founded by Kasper Sørensen in 2009. Later it was incorporated as a core
library by Human Inference, meaning that more than 20 developers have been involved in its
making in this commercial setting. Furthermore a smaller number of contributors have submitted
patches for the library. Others have started building other interesting data-oriented libraries
on top of MetaModel, for instance the 'vasc' open source project by Willem Cazander, which
is an implementation of the Java Persistence API (JPA) for all the datastores supported by
> Alignment
> MetaModel already makes good use of existing Apache projects such as POI, CouchDB and
OpenOffice. Furthermore developers from the Apache Gora project have indicated a need for
a project like MetaModel to solve specific uniform datastore access needs.
> -----------
> Known risks
> Orphaned products
> The contributors and the contributing organization (Human Inference) have a very strong
dependence on MetaModel already and will continue to have that for a long time. The continued
need for this vendor to support new types of datastores and maintain existing functionality
will ensure that MetaModel is not at risk of being orphaned.
> Inexperience with Open Source
> MetaModel is already open source, and has been so for many years. Main contributors of
the project have also contributed to other open source projects such as DataCleaner and Apache
Mahout. The openness of Apache is arguably more extensive, but we are only encouraged and
delighted to be handling the project in a more open manner.
> Homogenous Developers
> Frequent committers are currently located in Denmark, The Netherlands and India. They
are used to working in a distributed environment.
> Reliance on Salaried Developers
> Most of the developers are paid by their employer to contribute to this project, but
given the dependence on MetaModel from both commercial and open source projects, the project
would continue without issue if no salaried developers contributed to the project.
> Relationship with Other Apache Products
> MetaModel depends on several Apache products including commons-lang, commons-io, commons-codec,
http-components, POI, CouchDB, OpenOffice and XMLBeans.
> Furthermore MetaModel is built by Apache Maven.
> An Excessive Fascination with the Apache Brand
> The ASF has a strong brand, and that brand is in itself attractive. However, the developers
of MetaModel have been quite successful on their own and could continue on that path with
no problems at all. We are interested in joining the ASF in order to increase our contacts
and visibility in the open source world. Furthermore, we have been enthusiastic users of Apache,
and would feel honored by getting the opportunity to join the club.
> -------------
> Documentation
> Information on MetaModel can be found at: 
> Initial source
> MetaModel has been developed since 2009 and has undergone a couple of major changes
(indicated by the 2.x and 3.x versions). The code is used in mission critical systems and
is considered very stable and high performing.
> The source includes a fork of the xBaseJ project's code, which will be removed upon incubation.
This code was originally GPL licensed, but MetaModel was granted a special license allowing it to
be forked and relicensed using the current LGPL license of MetaModel. Removal of the xBaseJ
code will effectively mean that the Apache variant of MetaModel will not have support for
dBase database files. We imagine that the dBase module could live on as a separate pluggable
module under the LGPL license, outside of Apache.
> External dependencies
> The dependencies all have Apache compatible licenses. These include BSD and MIT licensed
> ------------------
> Required resources
> Mailing lists
> The default set of mailing lists for an Apache project is considered sufficient for MetaModel.
> Subversion directory
> A subversion or git repository is needed. Currently MetaModel's code is hosted at
but will be moved to an Apache repository.
> Issue tracking
> Other resources
> If possible, a set of database servers (specifically MongoDB, CouchDB, MySQL, PostgreSQL,
MS SQL Server (Express), Firebird) should be made available for integration testing. Currently
this is done internally at Human Inference.
> Initial committers
> Kasper Sørensen ( TODO
> Affiliations
> Kasper Sørensen - works at Human Inference TODO
> -------
> Sponsors
> Champion
> Nominated mentors
> Sponsoring entity

Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
