incubator-general mailing list archives

From Benjamin Young <>
Subject [PROPOSAL] Apache Annotator
Date Tue, 31 May 2016 13:46:25 GMT
Hi all,

I have been working with the Annotator community to move the project to the ASF. We
have in the past been a BDFL-led group, but that has proved unsustainable and resulted in
many forks and lost opportunities.

Recently, many community members gathered at the conference and subsequent
hackathon and discussed the future of the project. We again concluded that the ASF held the
most promise for a governance style that could support our growing community and assure that
collaboration continue into the future.

Our Incubator Proposal is currently available here (also below in markdown):

I would be happy to move the proposal to the incubator wiki; my user name there is `bigbluehat`.

Our current mailing list has a running vote/discussion around this proposal and our move to
the ASF:

Lastly, we have a Champion (Daniel Gruno), but are still in need of Mentors.

Thank you for considering this proposal!
Benjamin -

Apache Annotator Proposal:
#### Abstract
> A short descriptive summary of the project. A short paragraph, ideally one sentence in length.

Annotation enabling code for browsers, servers, and humans.

#### Proposal
> A lengthier description of the proposal.

The Annotator community seeks to build a foundational set of libraries under a liberal license
providing the pieces necessary for developers to add annotation to their projects.

#### Background
> Provides context for those unfamiliar with the problem space and history.

Annotator.js was originally created by Open Knowledge (formerly The Open Knowledge Foundation)
to provide annotation over works by Shakespeare. Since that time, Annotator has found its
way into a wide range of browser-based annotation systems
and various academic, publishing, and scientific research projects.

Sadly, this increased usage has primarily happened in forks of the main code or through
copyleft-licensed plugins that many community members cannot use.

However, the community remains interested in collaborating and in building a shared foundation
for annotation--both in browsers as well as servers and desktop/mobile applications.

#### Rationale

> Explains why this project needs to exist and why should it be adopted by Apache.

Annotation is often implemented in projects in ad hoc ways with developers often re-solving
problems well known to the Annotator community. The Annotator community works to provide knowledge
and code to help developers more quickly implement or improve annotation within their projects.

We believe bringing the Annotator community into the Apache Software Foundation will allow
for wider recognition of the annotation problem space, help more developers find their way
to solving this shared problem, provide increased cohesion for our own somewhat fractured
community, and increase the use of commonly shared code within a wide range of projects.

#### Initial Goals

* create a collaborative space for the existing Annotator contributors and community
* further ignite interest and activity around annotation
* build foundational libraries for annotation
* implement code to support the Web Annotation Data Model, Protocol, and other
annotation-related specifications
* potentially re-license Annotator under the Apache License 2.0
  * Annotator is currently licensed under a combination of the MIT and GPL licenses
* consolidate (where possible) community activity around building add-ons, annotation storage
providers, and use-case specific feature sets
* grow interest and activity in annotation

#### Current Status

##### Meritocracy
> Apache is a meritocracy.

The project is in transition from a primarily BDFL-based model to one with a more diverse
set of committers. There are 36 known committers to Annotator. Three committers have done
the bulk of the coding and decision making, and two of those act as project leadership.

However, the community is much larger and more diverse when the various forks and plugin authors
are considered.

We intend to invite and include participants from a wide array of annotation problem spaces
to collaborate in this new shared space.

##### Community
>  Apache is interested only in communities.

Community calls have been held every 3-6 months, with reports of each call's outcome
posted to the mailing list and the website.

Most activity within the project happens on the mailing list. There is also a relatively inactive
#annotator channel. The website is primarily for promotion, including promotion of community
plugins and a showcase of projects using Annotator. Documentation is linked to from the website.

There are many Annotator and W3C Annotation Data Model related projects found on GitHub. Our
objective would be to invite these communities to join this collaborative community with the
hope of greater stability and community longevity.

##### Core Developers
> Apache is composed of individuals.

The 3 primary committers to the project are Nick Stenning of The Hypothesis Project, Randall
Leeds of Medal, and Aron Carroll of Dropbox, Inc. Nick Stenning is the original creator of
Annotator. Randall Leeds is an Apache CouchDB committer. Aron has been a frequent contributor.
All three have been members of The Hypothesis Project in past years.

Other currently active community members include:

* Andrew Magliozzi of
   * Andrew drives the scheduling of community calls, is active on the mailing list, and encourages
progress within the project and community
* Benjamin Young of Wiley (also formerly of The Hypothesis Project)
   * an Apache CouchDB committer
   * co-editor of the Web Annotation Data Model
* Oliver Sauter of WordBrain
   * active advocate for Annotator and the growth of the annotation community

Other committers have contributed significant amounts of code, content, or issues and discussions,
but are currently (in the last 3-6 months) less active on the project. However, at recent
annotation-related conferences the scale of the plugin, fork, and ancillary project activity
was shown to be much higher than what was apparent from activity on the main Annotator mailing
list--in part due to community fracturing, something we hope to fix by joining the ASF.

A full list of Annotator contributors can be seen here:

##### Alignment
> Describe why Apache is a good match for the proposal.

The Annotator community believes that the Apache Software Foundation promotes and enforces
the sort of community that will best serve the future of the project. It is also believed
that Annotator can serve the ASF by providing its tools to bring annotation into various Apache
projects and eventually to the site, project documentation, and other tools within
the ASF.

The priority is on increasing community involvement, defining--via the Apache Way--how we
will code and collaborate going forward, and upon creating the best possible annotation solution
born out of that collaboration.

#### Known Risks
> An exercise in self-knowledge. Risks don't mean that a project is unacceptable. If they
are recognized and noted then they can be addressed during incubation.

##### Orphaned products
> A public commitment to future development.

The majority of the core committers were formerly from The Hypothesis Project, which used
an earlier version of Annotator within its annotation web service and BSD-licensed `h` annotation
software. However, Hypothesis and most other organizations and projects using Annotator have
forked the main code base or created unique plugins which only exist within their projects
and have not been contributed upstream.

The fracturing of the community and previous single-entity contribution have greatly inhibited
collaboration and growth of the community. Concurrently, interest in and growth of annotation
projects from a wide range of constituents has increased--though around a much wider array of code and
projects. The hope is that the creation of a collaborative space built for discussion and
sharing of these tools would provide the opportunity to reach a common core to be shared among
the many diverse players.

As such, the Annotator project has begun the process of becoming an Apache project to establish
a development and community process that encourages diversity and cross-organization collaboration.

##### Inexperience with Open Source

Annotator was established as an Open Source project in 2011, with its first release (v0.0.1)
being made on January 1st of that year:

The project has continued since that time as an open source project developed on GitHub. The
community has grown in diversity since that time and was moved into a separate "openannotation"
GitHub organization (from the original "okfn" GitHub organization) in 2014 in an effort to
increase community involvement and diversity.

Each of the core committers has worked on and created open source software for themselves
or various organizations for more than 5 years. Two of the contributors mentioned above
also have more than 5 years of contributor experience at the ASF and are both now core committers
to a top-level project (Apache CouchDB).

##### Homogeneous Developers
> Healthy projects need a mix of developers. Open development requires a commitment to
encouraging a diverse mixture. This includes the art of working as part of a geographically
scattered group in a distributed environment.

Active community members as well as plugin and compatible annotation storage system builders
are from a diverse, though scattered, range of organizations and individually driven projects.

The Annotator community is seeking to combine its efforts into a core group of committers
to more accurately encourage a shared foundation as well as continue the growth in diversity
of the community.

Geographically, the Annotator community is widely distributed across Germany, Hungary, the East
and West coasts of the US, and Australia.

Additionally, the annotation-related projects that may be considered as input
for this project's code explorations vary widely in size, contributor diversity, and growth.

##### Reliance on Salaried Developers
> A project dominated by salaried developers who are interested in the code only whilst
they are employed to do so risks its long term health.

In the past, contributors to the Annotator project were solely from The Hypothesis Project and
their activity was driven primarily by the needs of that project. However, the diversity of
interested participants has greatly increased. There is an additional hope of creating an
aggregated community from various projects (including Annotator, Hypothesis' `h` code, and
various related libraries and plugins) as well as exploring the creation of new tools--not
only for the browser--to further widen the interest and activity around annotation.

##### Relationships with Other Apache Projects
> Apache projects should be open to collaboration with other open source projects both
within Apache and without. Candidates should be willing to reach outside their own little bubbles.

The Annotator community also provides an annotation storage system ("annotator-store") built
upon ElasticSearch. There are compatible implementations of that API built on various storage
systems (including Apache CouchDB), and the community would encourage the creation of other
compatible storage systems built upon other Apache storage projects.

Additionally, Annotator is a JavaScript library which could serve any of the various CMS projects
within Apache.

The roadmap for Annotator also includes compatibility with the Web Annotation Data Model which
is a JSON-LD serialization of an RDF-based data model for annotation. The growing number of
RDF-focused Apache projects could take advantage of and contribute to the creation of these

The W3C Annotation Working Group is also creating multiple related deliverables around Web
Annotation including a Linked Data Platform-based Protocol specification, a note about selector
systems, and future notes for various serialization and integration opportunities for the
Web Annotation Data Model. Apache Marmotta is one project within the ASF which has native
support for LDP and may have an interest in collaborating around implementation of the Web
Annotation Protocol.
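
As a rough illustration of the JSON-LD serialization mentioned above, the sketch below builds a single annotation as a plain JavaScript object. The `makeAnnotation` helper and the `example.org` URLs are hypothetical, and the property names reflect our reading of the Web Annotation Data Model, so they should be checked against the specification before being relied upon.

```javascript
// Minimal sketch, not project code: one Web Annotation expressed as JSON-LD.
// ASSUMPTION: makeAnnotation and the example.org URLs are invented for
// illustration; property names follow our reading of the W3C data model.
function makeAnnotation(id, sourceUrl, exact, comment) {
  return {
    '@context': 'http://www.w3.org/ns/anno.jsonld',
    id: id,
    type: 'Annotation',
    // The body carries the annotator's comment.
    body: {
      type: 'TextualBody',
      value: comment,
      format: 'text/plain'
    },
    // The target names the annotated resource and the quoted segment.
    target: {
      source: sourceUrl,
      selector: {
        type: 'TextQuoteSelector',
        exact: exact
      }
    }
  };
}

const annotation = makeAnnotation(
  'http://example.org/anno1',
  'http://example.org/page1',
  'annotation',
  'A comment on the quoted text.'
);

// JSON-LD is plain JSON, so the standard JSON API can serialize it.
const serialized = JSON.stringify(annotation, null, 2);
console.log(serialized);
```

Because the result is ordinary JSON, it could be stored in any JSON-capable store or handed to an RDF toolkit that understands JSON-LD.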

Lastly, Apache UIMA can currently generate Open Annotation Data Model annotations as an output
of its Natural Language Processing system. These annotations could be displayed via code
written within this new Apache project--which could further leverage user interaction with
those NLP-based annotations (such as confirmation, rejection, or modification of the annotations
made by Apache UIMA's NLP process). There are other NLP projects within the ASF which could
similarly benefit from these explorations and code generated here.

##### An Excessive Fascination with the Apache Brand
> Concerns have been raised in the past that some projects appear to have been proposed
just to generate positive publicity for the proposers. This is the right place to convince
everyone that is not the case.

The Annotator community acknowledges the value and recognition that the Apache brand would
bring to the Annotator project. However, the primary interest is in the community building
process and long-term stability that the Apache Software Foundation provides for its projects.

We do hope for increased recognition of and contribution to an array of annotation code projects
built within this community. However, we primarily hope for community aggregation driven by
building a core set of tools for our shared set of needs which are now scattered across various
annotation endeavors.

Integrating those developers into this new community and adding them as contributors is seen
as a much higher priority than increasing awareness through branding.

#### Documentation
> References to further reading material.



Mailing List:


Annotator plugin index:

#### Initial Source
> Describes the origin of the proposed code base. If the initial code arrives from more
than one source, this is the right place to outline the different histories.

The original Annotator code base was created by Nick Stenning while at the Open Knowledge
Foundation. The code has been in development since before 2011 with the first public release
(v0.0.1) happening on January 1st, 2011 on GitHub.

The example annotation storage system (which works with Annotator's stock Store plugin) had
its first release on February 21, 2011 and was originally built for Apache CouchDB. The contributor
list of annotator-store is similar, but the license is simply MIT (rather than MIT &
GPL). The stated copyright is 2010-2012 Open Knowledge Foundation.

Additionally, there is a growing list of forks, plugins, and related tooling created by the
community in various places--often embedded within larger projects. The Annotator Plugins
index has reference to some such possible inputs to this project's code. The W3C specifications
are also being implemented and the growing number of projects available around those specifications
would also be considered as possible inputs. Most specifically, Randall Leeds (also a contributor
to Annotator) has built a set of libraries focused on implementing the W3C selectors. These
libraries could serve as an initial foundation for a core library for browsers or JavaScript-based
server code.
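
To give a sense of what implementing a selector involves, here is a small sketch that resolves a text-quote style selector against a plain string. It is a simplified illustration under our own assumptions, not the `dom-anchor-*` API: the `resolveTextQuote` helper is hypothetical, it works on strings rather than the DOM, and it does no fuzzy matching.

```javascript
// Minimal sketch, not the dom-anchor-* API.
// ASSUMPTION: resolveTextQuote is a hypothetical helper that maps a
// {exact, prefix, suffix} selector onto character offsets in a string.
function resolveTextQuote(text, selector) {
  const prefix = selector.prefix || '';
  const suffix = selector.suffix || '';
  // Search for the quote together with its surrounding context so that
  // repeated phrases can be told apart.
  const index = text.indexOf(prefix + selector.exact + suffix);
  if (index === -1) {
    return null; // the quoted text (with this context) is not present
  }
  const start = index + prefix.length;
  return {
    type: 'TextPositionSelector',
    start: start,
    end: start + selector.exact.length
  };
}

const text = 'To be, or not to be, that is the question.';
const position = resolveTextQuote(text, {
  type: 'TextQuoteSelector',
  exact: 'to be',
  prefix: 'not ',
  suffix: ','
});

// Slicing the text with the resolved offsets recovers the quoted phrase.
console.log(position, text.slice(position.start, position.end));
```

The prefix and suffix matter here because a bare search for the quote could match an earlier occurrence of similar text; a real library would also need to cope with documents that have changed since the annotation was made.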

#### Source and Intellectual Property Submission Plan
> Complex proposals (typically involving multiple code bases) may find it useful to draw
up an initial plan for the submission of the code here. Demonstrate that the proposal is practical.

Our primary goal is to aggregate communities that center around annotation. We intend to focus
our initial work on a JavaScript-based library built from Randall Leeds' `dom-anchor-*` libraries
(single owner copyright; MIT licensed) and potentially reusing code from Annotator (mixed
owner copyright; MIT & GPL dual-licensed).

The Annotator community has a stated copyright owner of "The Annotator Community." All contributions
are believed to have been made "in kind" and the copyright owned by the various contributors.
The three primary committers have stated a willingness to donate their contributions to the
Apache Software Foundation, and the minimal parts with copyright owned by others will likely
be rewritten, though we also hope to engage these individuals to join the combined efforts
being made at the ASF.

The `annotator-store` project is under a clearer, single BSD license. The copyright holder
is stated to be the Open Knowledge Foundation with the years 2010-2012. It is likely that
this code will only be used for reference or via library inclusion and not directly developed
upon within the ASF.

An earlier process was undertaken to collect re-licensing permission from known contributors
via the existing mailing list and GitHub issues--using a model similar to Twitter's when it
relicensed Bootstrap. General agreement was reached, but no decisive actions were taken as
many contributors of smaller amounts of code were no longer reachable.

We hope to engage the various plugin and fork authors, along with similar annotation projects,
in future work under a shared license and developed within The Apache Way. The contribution
of specific code to this project or its future deliverables will be handled individually by
the community over the course of the project.

One core goal of bringing the community to the ASF is to avoid this confused licensing situation
in the future.

#### External Dependencies

Annotator depends on the following JavaScript modules from NPM:

* backbone-extend-standalone - MIT
* browserify-shim - MIT
* clean-css - MIT
* enhance-css - MIT
* es6-promise - MIT
* insert-css - MIT
* jquery - MIT
* through - MIT / Apache License 2.0
* xpath-range - MIT + GPL-3.0+ Dual License

annotator-store depends on the following Python modules:

* elasticsearch - Apache License 2.0
* iso8601 - MIT
* six - MIT

MongoServer (a Web Annotation Platform implementation) is a single owner project currently
licensed under the Apache License 2.0.

Randall Leeds' `dom-anchor-*` libraries are all licensed under the MIT license and include these dependencies:

* dom-anchor-fragment - MIT
   * no dependencies
* dom-anchor-text-position - MIT
   * node-iterator-shim - MIT
   * dom-seek - MIT
* dom-anchor-text-quote - MIT
   * dom-anchor-text-position - MIT
   * diff-match-patch - Apache License 2.0

#### Required Resources

##### Mailing Lists

* private@
* dev@
* commits@

Note: the Annotator community currently uses a single list hosted by Open Knowledge at:

##### Git Repository


Note: the Annotator community hosts its code on GitHub as part of the "openannotation" organization.
Randall Leeds also uses GitHub for his `dom-anchor-*` libraries as does Rob Sanderson for
his Web Annotation Protocol implementation. These are all potential code inputs to be considered
for reuse or continuation by this community.


##### Issue Tracking

The Annotator community would prefer to continue using GitHub Issues if that is a possibility.

##### Other Resources

* static website hosting for

#### Initial Committers

* Nick Stenning
* Randall Leeds
* Benjamin Young
* Oliver Sauter
* Andrew Magliozzi
* Aron Carroll
* Mariano Giagante
* Luke Murphy

#### Affiliations

* Nick Stenning of The Hypothesis Project
* Randall Leeds of Medal
* Benjamin Young of Wiley
* Oliver Sauter of
* Aron Carroll of Dropbox, Inc.
* Andrew Magliozzi of
* Mariano Giagante of
* Luke Murphy of

#### Sponsors

##### Champion

Daniel Gruno, aka `humbedooh`

##### Nominated Mentors


##### Sponsoring Entity
> The Sponsor is the organizational unit within Apache taking responsibility for this proposal.
The sponsoring entity can be: the Apache Board, the Incubator, another Apache project

The Incubator
