incubator-general mailing list archives

From Ian Holsman <>
Subject Re: Spatial Information Systems Proposal
Date Tue, 09 Feb 2010 21:59:41 GMT
Before calling a vote on this, it would be great if we could get some 
people who are interested in mentoring to add their names to the proposal.


On 2/7/10 1:56 AM, Bruce Snyder wrote:
> On Fri, Feb 5, 2010 at 9:31 AM, patrick o'leary<>  wrote:
>> Hi
>> On behalf of the locallucene, localsolr communities, JPL, and myself, I
>> present an Apache Spatial incubator Proposal.
>> Apache Spatial will be a toolkit, allowing spatial data to be represented
>> and queried in a multitude of implementing technologies.
>> The proposal is
>> and I have included a text version of the proposal below.
>> I appreciate any feedback and discussion.
>> Thanks
>> Patrick O'Leary / Chris Mattmann / Sean McCleese / Paul Ramirez / Ben Lewis
>> ------------------
>> Apache SIS, A toolkit for constructing spatial information systems.
>> Abstract
>> Spatial information systems (SIS) (akin to Geographic Information Systems,
>> or GIS) are rapidly growing as information has taken on a sense of location.
>> This location context has allowed people to start exploring different ways
>> of searching, clustering, and displaying information. Spatial queries such
>> as:
>>     * point-radius, e.g., show me all objects within X miles of point P,
>> typically a lat/lon;
>>     * bounding box, e.g., show me all objects within a box defined by south,
>> east, north, west bounding coordinates; and
>>     * polygon, an extension of bounding box to arbitrary shapes defined by
>> arbitrary points
>> are becoming a part of everyday life, where some combination of the above is
>> used to find a restaurant, determine sites of interest for climate research,
>> for data reduction and subsetting, or demographic profiling, social
>> networking, and a host of other applications. There exist a number of
>> libraries and frameworks written in Java, C/C++, and other programming
>> languages that deal with the aforementioned issues; however, the one thing
>> they consistently share is that most of this software does not include
>> ASF-friendly licensing. On the contrary, most of these software systems and
>> tools are LGPL licensed, as their use is primarily to produce GIS software,
>> which is then sold for a profit. What's more, even the standards
>> organization the Open Geospatial Consortium (OGC) promotes the use of LGPL
>> SIS/GIS software to implement its interfaces and specifications, leaving
>> those interested in a more
>> ASL-friendly solution with a major hole to fill, or having to deal with the
>> license implications of leveraging LGPL open source software in their
>> applications.
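
A minimal sketch of the three query types listed above, in Python for brevity (the proposal itself targets Java first). All function names here are hypothetical and are not taken from LocalLucene, LocalSolr, or any SIS code. The point-radius check assumes a spherical Earth; the box and polygon checks treat lat/lon as planar and ignore antimeridian crossings.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; spherical approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(point, center, radius_miles):
    """Point-radius query: is `point` within `radius_miles` of `center`?"""
    return haversine_miles(point[0], point[1], center[0], center[1]) <= radius_miles


def within_bbox(point, south, west, north, east):
    """Bounding-box query (assumes the box does not cross the antimeridian)."""
    lat, lon = point
    return south <= lat <= north and west <= lon <= east


def within_polygon(point, vertices):
    """Polygon query via ray casting; `vertices` is a list of (lat, lon)."""
    lat, lon = point
    inside = False
    n = len(vertices)
    for i in range(n):
        lat1, lon1 = vertices[i]
        lat2, lon2 = vertices[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside
```

For example, `within_radius((34.20, -118.17), (34.05, -118.25), 15)` asks whether a point near Pasadena lies within 15 miles of downtown Los Angeles. A production library would also need to handle datum choice, pole and antimeridian cases, and indexing, none of which this sketch attempts.
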
>> We propose to construct Apache SIS, an ASL 2.0 licensed toolkit that spatial
>> information system builders or users can leverage to support the
>> aforementioned activities, alleviating much of the software and potentially
>> legal difficulties in implementing SIS/GIS systems. This project will look
>> to expand on those concepts and serve as a place to store reference
>> implementations of spatial algorithms, utilities, services, etc. as well as
>> serve as a sandbox to explore new ideas. Further, the goal is to have Apache
>> SIS grow into a thriving Apache top-level community, where a host of SIS/GIS
>> related software (OGC datastores, REST-ful interfaces, data standards, etc.)
>> can grow from and thrive under the Apache umbrella.
>> Proposal
>> The Internet is changing to the "local world" wide web, where information no
>> longer exists in a digital vapor, but contains real world context. From news
>> stories to tweets, location is a very powerful concern, evidenced by the
>> proliferation of popular websites offering geo-referenced information for
>> all relevant content (Flickr, Twitter, Google Maps, etc). Besides the social
>> utility of spatial data, there are also national interest related uses of
>> prime importance. For example, from a national policy perspective, and
>> federal agency perspective (e.g., NASA, NOAA, DoD), global climate concerns
>> have underscored the importance of science data collected about our planet,
>> all of which is location based. So-called "operational" and "actionable"
>> data including climate models, weather forecasts as well as scientific,
>> "offline" data (measurements of CO2 in the atmosphere, measurements of sea
>> surface temperature, etc.) all provide some sense of where the data was
>> created, where it currently resides, and/or what it references. These are just
>> a sampling of the spatially relevant information available -- the list is
>> growing as scientists, policy-makers and decision makers develop new
>> downstream activities that leverage spatial data. As we move forward there
>> is also no reason to restrict the focus of SIS/GIS to just this planet as a
>> point of reference; other sciences (astrophysics, planetary science) have
>> been collecting information about our universe and other celestial bodies
>> for years, information that could be "spatial"-enabled. There has been a
>> growing recent interest in data collected about the Earth's moon as in the
>> case of NASA's Lunar Reconnaissance Orbiter, its Lunar CRater Observation
>> and Sensing Satellite (LCROSS) and its Lunar Mapping and Modeling Project
>> (LMMP), as well as Google Moon and other such projects. Spatial data can
>> offer substantial value added for consumers of data through the use of
>> location-rich metadata, as well as through the use of layering, allowing
>> users of spatial data to explore layers of data (points of interest,
>> elevation and other parameters) in an interactive fashion. What's more, the
>> algorithms that drive SIS/GIS can be leveraged to represent data which is
>> not just geographical based, such as bio-informatics, fingerprints search,
>> facial search etc., providing substantial reuse benefits if an ASF-friendly
>> software system that provided SIS/GIS functionality existed. Apache SIS will
>> provide a manner in which spatial data such as that described above can be
>> represented and used with existing technologies. The proposed founders of
>> Apache SIS all have relevant experience, either developing spatial
>> software that can easily perform the above tasks or working on the
>> domains containing the georeferenced data of interest. We will
>> leverage this experience and data expertise to deliver an Apache SIS system
>> of use to a broad community of interest, making Apache an ideal home for
>> this important software.
>> Background
>> There are several projects of different spatial capabilities available
>> today, the two most common are:
>>     * GeoTools
>>     * PostGIS
>> Apache SIS does not aim to compete with these tools but, instead, to
>> provide a spatial framework that enables better representation of
>> coordinates for searching, data clustering, archiving, or any other relevant
>> spatial needs. By developing a toolkit framework that is independent of
>> underlying implementation we hope to also reduce duplication of both
>> software and effort with a published interface which other software projects
>> can simply tie into their own frameworks. The initial concept behind
>> Apache SIS comes from LocalLucene, an extension to Apache Lucene that
>> provided a Geographical filter on top of the Lucene search library.
>> LocalLucene went on to become LocalSolr, and has since been included in many
>> frameworks from Spring to Hibernate, to Hbase, and to Compass. The
>> LocalLucene framework has also been contributed to Apache Lucene under the
>> moniker "Spatial Lucene", and currently exists as a contrib module within
>> the Lucene project, version 2.9 and later. From January 2009-Dec 2009, while
>> working on building out spatial capabilities in Apache SOLR for oceans-data
>> and lunar-data related projects at NASA JPL, Chris Mattmann stumbled across
>> LocalLucene and LocalSOLR, and eventually discussed its limitations and
>> benefits with Patrick O'Leary, along with the rest of the proposed
>> committers in this effort. The consensus was there was a significant lack of
>> a generic spatial data focused library out there in Apache land, and if
>> present, such a library would present a unique contribution to the folks who
>> were working with GIS data, that weren't only interested in search. In other
>> words, there are a host of activities besides search (visualization, data
>> reduction, statistical analysis) where a generic SIS/GIS library would be of
>> prime importance. Both Chris and Patrick, as well as the other committers,
>> had been stung by the issues in dealing with LGPL libraries, and had a
>> difficult time finding any SIS library that was both useful and ASL
>> licensed. From these conversations, Patrick and Chris approached Ian
>> Holsman, and asked for his support in championing this proposal and helping
>> to get this effort started. From there, we all agreed that the general
>> community at large would be best served by establishing a top level project
>> that focused primarily on solving spatial problems including search,
>> visualization, data reduction and the aforementioned use cases.
>>     * Apache SIS will also be the first known spatial project of this nature
>> to be licensed under Apache License v2.0; the vast majority of other GIS
>> projects are LGPL. Further, Apache SIS will be the first known (to our
>> knowledge) Apache top level project focused on implementing spatial
>> standards, and focused on building an Apache-based community in this
>> thriving area.
>> Initial Goals
>> The initial goals of the proposed project are:
>>     * Viable community around the Apache SIS codebase
>>     * Active relationships and possible cooperation with related projects
>> and communities such as OGC
>>     * Provide a geo-spatial coordinate system, with planetary plugins.
>>     * Provide a polygon and line string coordinate comparison system.
>>     * Build a Java framework to start out, but look to develop other P/L
>> support (Python, Ruby, as a start).
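
One possible reading of the "planetary plugins" goal above, sketched in Python under stated assumptions: `ReferenceBody`, `register_body`, and `BODIES` are hypothetical names, each body is modelled as a sphere with its mean radius, and nothing here reflects an actual SIS design.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceBody:
    """Hypothetical plugin: a celestial body approximated as a sphere."""
    name: str
    mean_radius_km: float

    def great_circle_km(self, lat1, lon1, lat2, lon2):
        """Haversine distance in km between two lat/lon points (degrees)."""
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = phi2 - phi1
        dlam = math.radians(lon2 - lon1)
        a = (math.sin(dphi / 2) ** 2
             + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
        return 2 * self.mean_radius_km * math.asin(math.sqrt(a))


# Hypothetical registry so new bodies can be added without touching callers.
BODIES = {}


def register_body(body):
    """Register a body under its lower-cased name and return it."""
    BODIES[body.name.lower()] = body
    return body


EARTH = register_body(ReferenceBody("Earth", 6371.0))   # mean radius, km
MOON = register_body(ReferenceBody("Moon", 1737.4))     # mean radius, km
```

With this shape, the same lat/lon pair yields different surface distances depending on the body looked up, e.g. `BODIES["moon"].great_circle_km(0, 0, 0, 90)`. A real coordinate system would use ellipsoids and datums rather than mean-radius spheres.
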
>> Current Status
>> Meritocracy
>> All the initial committers are familiar with the meritocracy principles of
>> Apache, and have already worked on the various source code bases (incl.
>> Lucene Contrib, Tika, Nutch, and SOLR), providing issue comments, patches,
>> and in some cases, committing (O'Leary & Mattmann) and participating as PMC
>> members (Mattmann). We will follow the normal meritocracy rules also with
>> other potential contributors.
>> Community
>> The Apache SIS community will be a co-mingling of several other communities
>> that depend on Spatial & Geo Spatial solutions for their projects. The
>> expectation is there will be members from the original LocalLucene project,
>> the strong LocalSolr project, as well as Compass, Lucene and Solr at very
>> early if not immediate stages. We will also look to garner support and
>> contributions from other projects that are working in spatial, e.g.,
>> PostGIS, and other OGC efforts as well. There is already a growing number of
>> folks at NASA who are also interested in spatial systems and work in the
>> area. We will approach those people as well and attempt to bring them into
>> the Apache SIS community. The idea would be for Apache SIS to grow into a
>> top-level project that allows for sub projects based on SIS focus
>> (visualization, data reduction/algorithms, OGC standards, etc.)
>> Core Developers
>> The initial developers come from a diverse set of backgrounds ranging from
>> software architecture, search, academic, research/practice, to data mining.
>> All of the proposed initial developers require the functionality of Apache
>> SIS (Ramirez - LMMP, McCleese - oceans data, Mattmann - lunar/oceans, O'Leary
>> - local search) in a compatible way.
>> Alignment
>> Existing Apache projects currently rely on the proposed starting point for
>> Apache SIS, such as Lucene and Solr. We will begin by refactoring the
>> LocalLucene contribution into a library independent of any underlying
>> substrate (e.g., independent of Lucene). We will then look to add in
>> functionality for calculating distances, functionality for persisting
>> spatial data (to DBMS'es, search indexes, key/value stores, to Hadoop/etc.)
>> We will follow by then focusing on data models and export of spatial data,
>> culminating in an initial release that includes all of the basic
>> functionality to at a minimum compute on spatial data, and store/export it.
>> Known Risks
>> Orphaned products
>> Several projects currently contain implementations of the initial code base
>> for Apache SIS. These projects can continue with the existing code base
>> without impact, or adopt Apache SIS and reap the benefits of a common code
>> base. Our goal is to provide value-added, shared ASL-licensed spatial
>> software that is easy to adapt and adopt in any of the existing Apache (and
>> external communities) developing SIS/GIS. Our initial focus will be on
>> building a Java library but we will look at means for extending the Java
>> library into additional P/Ls and frameworks.
>> Inexperience with Open Source
>> All the initial developers have worked on open source before and many are
>> committers (O'Leary, Mattmann) and PMC members (Mattmann) within other
>> Apache projects. McCleese and Ramirez are recent Apache committers on the
>> soon to be initiated OODT project that was accepted into the Incubator.
>> Homogeneous Developers
>> The initial developers come from a variety of backgrounds and with a variety
>> of needs for the proposed toolkit. Further, the developers consist of folks
>> from at least two widely diverse companies, AT&T Interactive and NASA's Jet
>> Propulsion Laboratory, spanning industry and government/research.
>> Relationships with Other Apache Products
>> Apache SIS is related to the following projects. None of the projects are
>> direct competitors, but they contain some functionality provided by Apache SIS:
>>     * Lucene Java, contains Spatial Lucene. We will look to leverage this
>> code, combined with updates present at Local Lucene at Sourceforge as a
>> starting point for the refactoring activity.
>>     * Apache Solr, uses functionality from Spatial Lucene and may have some
>> inspiration for how to perform some of the spatial computations we would
>> like to have present in Apache SIS. Once Apache SIS matures, Solr could rely
>> on SIS as a library component.
>>     * Apache HBase - can index spatial reference id's and incorporate SIS
>> query methodology to extend it to providing Spatial services once Apache SIS
>> matures.
>> Initial Source
>> Apache SIS is an amalgamation of Spatial Lucene, and LocalSolr components.
>>     * Spatial Lucene contains the original Spatial Coordinate system
>>     * LocalSolr provides polygon and line string builders and comparator
>> features.
>>     * Local Lucene at Sourceforge contains a number of updates that we will
>> merge into Apache SIS
>> The above code sources will serve as a basis for a fundamental
>> generalization and refactoring activity that will result in an Apache SIS
>> system focused on: spatial computation, and spatial data storage/export to
>> start out. Activities such as visualization, reduction, and standards will
>> occur downstream of this initial activity once the code base becomes stable.
>> Source and Intellectual Property Submission Plan
>> All seed code and other contributions will be handled through the normal
>> Apache contribution process.
>> We will also contact other related efforts for possible cooperation and
>> contributions. Local Lucene is ASL-licensed, as are the other code bases
>> (Local SOLR, and Spatial Lucene). All proposed committers have CLAs on file
>> and are familiar with the code contribution process in Apache.
>> External Dependencies
>> At the moment, we will build Apache SIS so that it has no external
>> dependencies, and is self contained. If we do require common dependencies,
>> such as libraries for computation, or for storage/persistence, we will
>> ensure that they leverage an ASL or compatible license. For example, to
>> support persistence, we may leverage other libraries (e.g., Derby, K/V
>> stores, etc.), and in these cases, we will focus on those libraries with a
>> compatible license.
>> Cryptography
>> There is no cryptography required in Apache SIS at the present time.
>> Required Resources
>>     * Mailing lists
>>     *
>>     *
>>     *
>>     *
>> Subversion Directory
>>     *
>> Issue Tracking
>>     * JIRA SIS (SIS)
>> Other Resources
>> none
>> Initial Committers
>> Name              | Email                            | Institution                    | CLA
>> Patrick O'Leary   | pjaol at apache dot org           | AT&T Interactive               | yes
>> Chris A. Mattmann | mattmann at apache dot org        | NASA Jet Propulsion Laboratory | yes
>> Sean McCleese     | smcclees at jpl dot nasa dot gov  | NASA Jet Propulsion Laboratory | yes
>> Paul Ramirez      | pramirez at jpl dot nasa dot gov  | NASA Jet Propulsion Laboratory | yes
>> Sponsors
>>     * Champion
>>     * Ian Holsman (ianh at apache dot org)
>> Nominated Mentors
>>     * Ian Holsman (ianh at apache dot org)
>> Sponsoring Entity
>>     * Apache Incubator
> +1 I'm very happy to see this come about. In 2004 I worked at a
> company where we began using the Jump Project for some spatial tasks.
> At the time, it was still somewhat limited and we wound up having to rely
> upon Oracle's spatial features in the database which were very costly
> and slooooooowwwww.
> Bruce
