incubator-general mailing list archives

From "Ramirez, Paul M (388J)" <>
Subject Re: [VOTE] Accept Apache Open Climate Workbench into the Incubator
Date Tue, 05 Feb 2013 16:48:50 GMT
+1 binding


On Feb 5, 2013, at 8:30 AM, "Andrew Hart" <> wrote:

> +1 (binding)
> -Andrew
> On 2/5/13 8:18 AM, Mattmann, Chris A (388J) wrote:
>> Hi Folks,
>> OK, now that discussion has settled down, I'd like to call a VOTE for
>> acceptance of Apache Open Climate Workbench into the Incubator.
>> I'll leave the VOTE open the rest of the week and close it out next
>> Monday, February 11th early am PT.
>> [ ]  +1 Accept Apache Open Climate Workbench into the Incubator
>> [ ]  +0 Don't care.
>> [ ]  -1 Don't accept Apache Open Climate Workbench into the Incubator
>> because...
>> Full proposal is pasted at the bottom of this email. Only VOTEs from
>> Incubator PMC members are binding, but all are welcome to express their
>> thoughts.
>> Thank you!
>> Cheers,
>> Chris
>> P.S. Here's my +1 (binding)
>> -------------
>> = Apache Open Climate Workbench, tool for scalable comparison of remote
>> sensing observations to climate model outputs, regionally and globally. =
>> === Abstract ===
>> The Apache Open Climate Workbench proposal desires to contribute an
>> existing community of software related to the analysis and evaluation of
>> climate models, and related to the use of remote sensing data in that
>> process.
>> Specifically, we will bring a fundamental software toolkit for analysis
>> and evaluation of climate model output against remote sensing data. The
>> toolkit is called the [[|Regional Climate Model
>> Evaluation System (RCMES)]]. RCMES provides two fundamental components for
>> the easy, intuitive comparison of climate model output against remote
>> sensing data. The first component called RCMED (for "Regional Climate
>> Model Evaluation Database") is a scalable cloud database that decimates
>> remote sensing data and reanalysis data related to climate using Apache
>> OODT extractors, Apache Tika, etc. These transformations make
>> traditionally heterogeneous upstream remote sensing data and climate model
>> output homogeneous and unify them into a data point model of the form
>> (lat, lng, time, value, height) on a per parameter basis. Latitude (lat)
>> and Longitude (lng) are in WGS84 format, but can be reformatted on the
>> fly. time is in ISO 8601 format, a string sortable format independent of
>> underlying store. value carries with it units, related to interpretation
>> and height allows for different values for different atmospheric vertical
>> levels. All of RCMES is built on Apache OODT, Apache Sqoop/Apache Hadoop
>> and Apache Hive, along with hooks to PostGIS and MySQL (traditional
>> relational databases). The second component of the system, RCMET (for
>> "Regional Climate Model Evaluation Toolkit") provides facilities for
>> connecting to RCMED, dynamically obtaining remote sensing data for a
>> space/time region of interest, grabbing associated model output (that the
>> user brings, or from the Earth System Grid Federation) of the same form,
>> and then regridding the remote sensing data to be on the model output
>> grid, or the model output to be on the remote sensing data grid. The
>> spatially regridded data is then temporally regridded using techniques
>> including seasonal cycle compositing (e.g., all summer months, all
>> Januaries, etc.), or by day, month, etc. The uniform model output and
>> remote sensing data are then analyzed using pluggable metrics, e.g.,
>> Probability Distribution Functions (PDFs), Root Mean Squared Error (RMSE),
>> Bias, and other (possibly user-defined) techniques, computing an analyzed
>> comparison or evaluation. This evaluation is then visualized by plugging
>> in to the NCAR NCL library for producing static plots (histograms, time
>> series, etc.).
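
A minimal Python sketch of the data point model and two of the metrics named in the paragraph above (Bias and RMSE). It is illustrative only and is not taken from the RCMES code base: the `DataPoint` class, the `bias` and `rmse` functions, and the sample values are all hypothetical names and numbers chosen for this example. It also assumes that all timestamps share one ISO 8601 layout and time zone, which is what makes plain string sorting chronological.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical record in the (lat, lng, time, value, height) form."""
    lat: float                      # latitude in degrees (WGS84)
    lng: float                      # longitude in degrees (WGS84)
    time: str                       # ISO 8601 string, e.g. "2001-01-01T00:00:00Z"
    value: float                    # the parameter value (units tracked per parameter)
    height: Optional[float] = None  # vertical level, None when not applicable


def _check(model: Sequence[float], obs: Sequence[float]) -> None:
    """Both series must be non-empty and on the same grid (same length)."""
    if not model or len(model) != len(obs):
        raise ValueError("model and obs must be non-empty and the same length")


def bias(model: Sequence[float], obs: Sequence[float]) -> float:
    """Mean of (model - observation) over all points."""
    _check(model, obs)
    return sum(m - o for m, o in zip(model, obs)) / len(model)


def rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """Root mean squared error between model and observation."""
    _check(model, obs)
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(model))


# Made-up observations at one location, deliberately out of order.
obs_points = [
    DataPoint(34.2, -118.2, "2001-02-01T00:00:00Z", 2.0),
    DataPoint(34.2, -118.2, "2001-01-01T00:00:00Z", 1.0),
    DataPoint(34.2, -118.2, "2001-03-01T00:00:00Z", 3.0),
]

# Sorting the ISO 8601 strings as plain text yields chronological order.
obs_sorted = sorted(obs_points, key=lambda p: p.time)
obs_values = [p.value for p in obs_sorted]

# Made-up model output already on the same space/time grid.
model_values = [1.5, 2.5, 2.0]

print("bias:", bias(model_values, obs_values))
print("rmse:", rmse(model_values, obs_values))
```

The metrics take plain sequences so that any regridding step can run beforehand; they only require that both inputs are already on a common grid.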
>> We also have performed a great deal of work in packaging RCMES to make the
>> system easy to deploy. We have working Virtual Machines (VMWare VMX and
>> Virtual Box OVA compatible formats) and we also have an installer built on
>> Python Buildout called "Easy RCMET" for dynamically
>> constructing the RCMET toolkit.
>> RCMES is currently supporting a number of recognized climate projects of
>> (inter-)national significance. In particular, RCMES is supporting the
>> [[|U.S. National Climate
>> Assessment (NCA) activities]] on behalf of NASA's contribution to the NCA;
>> is working with the [[|North American Regional
>> Climate Change Assessment Program (NARCCAP)]]; and is also working with
>> the International [[|Coordinated
>> Regional Downscaling Experiment (CORDEX)]].
>> === Proposal ===
>> We propose to transition the RCMES software community, which includes
>> developers of the RCMET and RCMED software, along with users of RCMES in
>> the CORDEX project across a variety of academic institutions, scientists
>> helping to improve the RCMES metrics, visualizations, and regridding
>> algorithms, packagers making RCMES easier to install, and scientists
>> helping to lead some of these international projects that are already
>> using RCMES.
>> We have been working on the RCMES project since 2009 funded initially by
>> the American Recovery and Reinvestment Act (ARRA) project out at NASA, and
>> then branching out into other sources of support and sustainability (NASA;
>> NSF, etc. -- see the
>> [[|acknowledgements]] section on
>> the RCMES website for a full list of supporting U.S. and international
>> partners).
>> With the existing RCMES community at Apache, we will also work to
>> encourage other climate software projects (e.g., Open Climate GIS, elements
>> of the Earth System Grid Federation, other NASA climate projects funded
>> under the Computational Modeling, Algorithms and Cyberinfrastructure
>> (CMAC) program) to contribute to the Open Climate Workbench here at Apache.
>> RCMED is a Big Data project that combines several underlying Apache
>> software projects -- OODT, Tika, Hadoop, HIVE, and Sqoop -- and other related data
>> management software. Its primary language is Java; RCMET, on the other
>> hand, is a Python API, associated set of classes (framework), set of
>> Python Bottle Web services, and a PHP "Wizard"-based User Interface that
>> leverages Apache OODT Balance.
>> === Background ===
>> Bringing RCMES to Apache was the brain-child of Chris Mattmann, based on
>> his solid experience with Apache OODT and bringing it to the ASF. Chris
>> worked for a year to get the support of the JPL community including
>> approvals from the Software Release authority at JPL to release the
>> software.
>> The initial code drop will include the RCMES SVN repository from JPL
>> including prior revisions. We anticipate also including a smaller package,
>> CDX, which contains some useful facilities for regridding, and command
>> line tools for manipulating large datasets, and working with OPeNDAP, etc.
>> After the code drop, we will work with our developers, users, documentors,
>> and other members of the team to teach those unfamiliar with the Apache
>> way how it works around here at Apache. 30% of the community from RCMES
>> includes those intimately familiar with Apache including 6 ASF members --
>> the other 70% include a range of scientific code developers, climate
>> scientists that use RCMES, program officers that will help make
>> documentation and slides for the code, and advocate for it in the
>> community. Their experience with Apache ranges from using various ASF
>> products, to contributing patches to them, to not using any ASF software
>> at all.
>> With this diversity, we anticipate that while everything may not just work
>> turnkey out of the box, this represents a unique opportunity to
>> demonstrate Apache to the international community and to show the benefits
>> of its community and social models. That said, we also have a lot of ASF
>> experience to make sure everyone learns the Apache way.
>> === Rationale ===
>> We are bringing RCMES to Apache for a few reasons. First, we feel that it
>> will immediately enable our collaborators across a number of institutions,
>> both nationally and internationally, to work on a
>> common software base, and to improve it with contributions from their own
>> sites. Currently these are difficult to negotiate because of varied
>> legal and contribution frameworks -- Apache allows us to simplify this to
>> a unified one. Second, using the ASF's world-wide mirroring system, we
>> will be able to deliver climate software broadly to the community as we
>> release it, rather than sneaker netting the software around or
>> establishing our own point release infrastructure.
>> Bringing this project to Apache also immediately thrusts the ASF into the
>> thriving ecosystem of the Coordinated Regional Downscaling Experiment
>> (CORDEX), the US National Climate Assessment, the North American Regional
>> Climate Change Assessment Program (the US contribution to CORDEX) and into
>> relevance for upcoming Intergovernmental Panel on Climate Change (IPCC)
>> assessment activities at a number of different institutions. We also seek
>> to help lead and encourage de facto standard development rather than top
>> down level dictating of standards for climate software and the ASF will
>> provide us a means for that.
>> === Initial Goals ===
>> The initial goals of the proposed project are:
>>  * Stand up a sustaining Apache-based community around the JPL RCMES
>> codebase.
>>  * Active relationships and possible cooperation with related projects and
>> communities, including end user and scientific communities, and CORDEX.
>>  * Active relationships and possible cooperation with existing Apache
>> communities, e.g., OODT, Hadoop/HIVE, Sqoop, Tika, SIS, etc.
>>  * Initial Apache release.
>>  * Leverage Apache Open Climate Workbench in climate activities at NASA,
>> in the international community as mentioned above, and beyond.
>>  * Vetting all software licenses and making sure IP is clear (software
>> grant from JPL forthcoming).
>> == Current Status ==
>> === Meritocracy ===
>> 30% of the proposed initial committers are familiar with the meritocracy
>> principles of Apache. As stated above this includes 6 ASF members. Of the
>> mentorship list, we have included Chris Douglas, a PMC member from Hadoop
>> and ASF member to help guide the community. Chris M. and Chris D. have
>> guided a number of projects through the Incubator over the years. The
>> other mentor is Paul Ramirez, who has experience with the Incubator
>> -- he was a mentor for Apache Any23, and also was one of the PPMC members
>> and eventual mentor for Apache SIS. The 70% of proposed initial committers
>> that aren't as familiar with Apache have a broad range of experience in
>> other open source projects, and have a deep respect and affinity for the
>> foundation and the work that gets done here. The more experienced ASF
>> mentors and project members will help to guide them.
>> === Community ===
>> There is an existing, established community of developers and users of
>> this project. This includes established communities including the
>> Coordinated Regional Downscaling Experiment, the U.S. National Climate
>> Assessment (NCA), the North American Regional Climate Change Assessment
>> Program (NARCCAP), and more. The Coordinated Regional Climate Downscaling
>> Experiment (CORDEX) is a
>> world wide effort of coordination of regional climate downscaling (RCD)
>> experiments driven by the World Climate Research Program (WCRP).
>> Recently, a large number of RCD
>> projects have been carried out on large parts of the world. To maximize
>> the benefits of these research activities the WCRP designed a framework
>> (Giorgi, WMO-Bulletin, 2009) focused on "quality-control [of] data sets of
>> RCD-based information for the recent historical past and 21st century
>> projections, covering the majority of populated land regions on the
>> globe". CORDEX defined different control domains (up to 10)
>> for almost all the populated regions of the
>> world in a way to standardize the experiments and make them comparable. A
>> key region focused on Africa was also designated as the top priority by
>> WCRP. CORDEX also provides a series of conventions and list of variables
>> that have to be followed by any project that wants to contribute to the
>> experiment. Each CORDEX region has a coordinator and regional and
>> international periodic meetings are scheduled in a way to ensure the
>> global well being. NARCCAP is the U.S. contribution to CORDEX. From the
>> [[|US
>> National Climate Assessment]] site, work is "being conducted under the
>> auspices of the Global Change Research Act of 1990. The GCRA requires a
>> report to the President and the Congress every four years that integrates,
>> evaluates, and interprets the findings of the U.S. Global Change Research
>> Program (USGCRP); analyzes the effects of global change on the natural
>> environment, agriculture, energy production and use, land and water
>> resources, transportation, human health and welfare, human social systems,
>> and biological diversity; and analyzes current trends in global change,
>> both human-induced and natural, and projects major trends for the
>> subsequent 25 to 100 years."
>> Apache Open Climate Workbench will support all of these communities above,
>> with an eye towards being a general purpose climate evaluation toolkit for
>> model output and remote sensing data.
>> === Core Developers ===
>> The initial set of developers comes from various NASA centers (JPL and
>> Goddard Space Flight Center), NASA HQ, various universities participating
>> in CORDEX (University of Cape Town, University of New South Wales, the
>> Indian Institute of Tropical Meteorology, and the Free University
>> Berlin), the University of
>> California Los Angeles, and Howard University. As mentioned previously
>> several of our developers are Apache veterans and understand how it works
>> around here and for those that don't, they will have great mentorship.
>> === Alignment ===
>> Our proposed effort aligns with the U.S. National Climate Assessment, the
>> CORDEX effort, other efforts, including the Earth System Grid Federation,
>> other climate software including the Open Climate GIS toolkit, other
>> science portals for climate including the Climate Information Portal (CIP)
>> at the University of Cape Town, and other related projects.
>> There are also a number of related Apache projects and dependencies, that
>> will be mentioned in the Relationships with Other Apache products section.
>> == Known Risks ==
>> === Orphaned products ===
>> Our project has a history of funding support from JPL, NASA (Applications
>> program/ARRA, NCA, AIST), NSF (ExArch project), international investment
>> from collaborators, and from other funding sources. These funding sources
>> all target future deliverables and activities, so there is little
>> chance this software and community will be orphaned.
>> === Inexperience with Open Source ===
>> All the initial developers have worked on open source before -- 30% of the
>> proposed initial community are experienced with the ASF, and are PMC
>> members and committers on ASF projects, including 6 ASF members. Our mentors
>> are all ASF members, and we welcome any interest from additional Apache
>> mentors in the effort. Those 70% of our project that aren't Apache
>> committers, PMC members, or members will benefit from the leadership of
>> the other 30% of the project.
>> === Homogenous Developers ===
>> The initial developers come from a variety of backgrounds and with a
>> variety of needs for the proposed framework. Everyone is used to
>> communicating on mailing lists as the project spans timezones,
>> international institutions and centers of excellence for climate science.
>> === Reliance on Salaried Developers ===
>> All of the proposed initial developers are paid to work on this or related
>> projects, but the proposed project is not the primary task for anyone.
>> === Relationships with Other Apache Products ===
>> As mentioned above, RCMES and the Apache Open Climate Workbench already
>> depend on Apache OODT for facade interfaces to underlying data warehouses
>> for storing remote sensing data; and for metadata extraction and
>> transformation. The software also uses Apache Tika for this (through a
>> transitive dependency from OODT). In addition, we have hooks to Apache
>> Hadoop/HIVE, as well as dependencies on Apache Sqoop for dumping out
>> remote sensing data from MySQL and into HIVE.
>> === An Excessive Fascination with the Apache Brand ===
>> All of us are familiar with Apache and have a respect for its brand and
>> community. Chris Mattmann is a big proponent of Apache's sustainability
>> factor -- and its ability to grow software communities, in an
>> institution, or funding source neutral manner. All of the community have
>> an extreme respect for Apache, including those in our communities who
>> aren't necessarily trained computer scientists, but are Scientists (big
>> "S", e.g., land, physical, Earth/Climate scientists).
>> == Documentation ==
>> The initial RCMES code base will come from the internal JPL Subversion
>> repository. The [[Regional Climate Model Evaluation System (RCMES)
>> project|]] at [[JPL|]]
>> has documentation on the existing software, including links to funding
>> support, communities, and other projects. We will continue to maintain
>> that site at JPL, which is part of the reason for rebranding the project
>> here at Apache with a new name to not interfere with the existing RCMES
>> one that has a following. In addition, we hope to evolve RCMES@JPL to have
>> increasing levels of dependency on Apache Open Climate Workbench, so that
>> we can incrementally transition with little impact to existing customers.
>> In addition, JPL's [[|Climate Data eXchange (CDX)]]
>> website also has documentation on the existing software.
>> == Initial Source ==
>> The project will start with seed code donated by NASA JPL via Mattmann and
>> the rest of the initial committers, which consists of the Regional Climate
>> Model Evaluation System (RCMES) toolkit, and the Climate Data eXchange
>> (CDX) software. This will include the core Python API for RCMET, the RCMED
>> OODT catalog project (which stores remote sensing data to MySQL/PostGIS,
>> and HIVE), and the RCMED extractors for various climate formats. The
>> source will also include Easy-RCMET, the Python Buildout for RCMET. In
>> addition, we will bring along the CDX toolkit, which includes a CDX client
>> package that performs subsetting, access, regridding of climate data; and
>> also includes a Python Buildout installer of its own called Uber CDX.
>> == Source and Intellectual Property Submission Plan ==
>> All seed code and other contributions will be handled through the normal
>> Apache contribution process. Mattmann has been authorized by NASA JPL to
>> lead the contribution of RCMES and CDX into the Incubator via his existing
>> Apache CLA, and a Software Grant to be provided.
>> We will also contact other related efforts for possible cooperation and
>> contributions.
>> == External Dependencies ==
>> Our project depends on a number of external libraries with various
>> licensing conditions. An initial list of such dependencies is shown below.
>> ||<tableclass="bodyTable"rowclass="b">'''Library''' ||'''License''' ||
>> ||<rowclass="b">[[|NCAR NCL]]||MIT compat||
>> ||<rowclass="a">[[|PyNIO]]||MIT compat||
>> ||<rowclass="b">[[|PyNGL]]||MIT compat||
>> ||<rowclass="a">[[|Matplotlib]]||Modified PSF license||
>> ||<rowclass="b">[[|Scipy]]||MIT compat||
>> ||<rowclass="a">[[|NumPy]]||MIT compat||
>> ||<rowclass="b">[[|HDF5]]||BSD||
>> ||<rowclass="a">[[|NetCDF]]||MIT||
>> == Cryptography ==
>> The project itself will not use cryptography, but it is possible that some
>> of the external software libraries will include cryptographic code to
>> handle features present in various science data formats. If we need to
>> provide an export control statement regarding cryptographic code per
>> Apache policy, we will follow an approach similar to the one Mattmann used in
>> [[|Apache Nutch]] and the one Jukka Zitting led
>> in Apache Tika. Mattmann is familiar with this process.
>> == Required Resources ==
>> Mailing lists
>>  *
>>  *
>>  *
>> Subversion Directory
>>  *
>> Issue Tracking
>> Other Resources
>>  * CLIMATE Wiki
>>  * Review Board instance - CLIMATE
>>  * Jenkins instance - CLIMATE
>> == Initial Committers ==
>> ||'''Name''' ||'''Email''' ||'''Affiliation''' ||'''CLA''' ||
>> ||Chris A. Mattmann ||mattmann at apache dot org ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Cameron E. Goodale ||goodale at apache dot org ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Paul Ramirez ||pramirez at apache dot org ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Andrew F. Hart ||ahart at apache dot org ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Jinwon Kim ||jkim at atmos dot ucla dot edu ||[[|UCLA Joint Institute for Regional Earth System Science and Engineering]] ||no ||
>> ||Duane Waliser ||duane dot waliser at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||no ||
>> ||Huikyo Lee ||Huikyo dot Lee at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||no ||
>> ||Paul Loikith ||Paul dot C dot Loikith at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||no ||
>> ||Daniel J. Crichton ||crichton at apache dot org ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Kim Whitehall ||Kim dot D dot Whitehall at jpl dot nasa dot gov ||[[|Howard University]] ||no ||
>> ||Paul Zimdars ||pzimdars at apache dot org ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Chris Jack ||cjack at csag dot uct dot ac dot za ||[[|University of Cape Town]] ||no ||
>> ||Bruce Hewitson ||hewitson at csag dot uct dot ac dot za ||[[|University of Cape Town]] ||no ||
>> ||Lluis Fita Borrell ||l dot fitaborrell at unsw dot edu dot au ||[[|University of New South Wales]] ||yes ||
>> ||Jason Evans ||jason dot evans at unsw dot edu dot au ||[[|University of New South Wales]] ||no ||
>> ||Estani Gonzalez ||estanislao dot gonzalez at met dot fu-berlin dot de ||[[|Free University Berlin]] ||yes ||
>> ||Luca Cinquini ||luca dot cinquini at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||J. Sanjay ||sanjay at tropmet dot res dot in ||[[|Indian Institute of Tropical Meteorology]] ||yes ||
>> ||M. V. S. Rama Rao ||ramarao at tropmet dot res dot in ||[[|Indian Institute of Tropical Meteorology]] ||yes ||
>> ||Tsengdar Lee ||tsengdar dot j dot lee at nasa dot gov ||[[|NASA HQ]] ||no ||
>> ||Laura Carriere ||laura dot carriere at nasa dot gov ||[[|NASA Goddard Space Flight Center]] ||no ||
>> ||Denis Nadeau ||denis dot nadeau at nasa dot gov ||[[|NASA Goddard Space Flight Center]] ||no ||
>> ||Michael Joyce ||Michael dot J dot Joyce at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Shakeh Khudikyan ||Shakeh dot E dot Khudikyan at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||yes ||
>> ||Maziyar Boustani ||Maziyar dot Boustani at jpl dot nasa dot gov ||[[|NASA Jet Propulsion Laboratory]] ||no ||
>> ||Suresh Marru ||smarru at apache dot org ||[[|Indiana University]] ||yes ||
>> == Sponsors ==
>> Champion
>>  * Chris Mattmann (mattmann at apache dot org)
>> Nominated Mentors
>>  * Chris A. Mattmann (mattmann at apache dot org)
>>  * Chris Douglas (cdouglas at apache dot org)
>>  * Paul Ramirez (pramirez at apache dot org)
>> Sponsoring Entity
>>  * Apache Incubator
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
> -- 
> Andrew F. Hart
