incubator-general mailing list archives

From Eric Nyberg
Subject Re: [VOTE] accept UIMA as a podling
Date Wed, 20 Sep 2006 16:03:15 GMT

I understand from the message traffic that there are some concerns about 
the current state of the UIMA proposal, but I'd like to offer my support 
(and my hope that the issues with the proposal are resolved).

Carnegie Mellon has been building and deploying text analysis programs 
using pluggable components for the last 3-4 years. Large-scale text 
analysis (e.g., for text data mining, populating a knowledge base, etc.) 
requires significant programming at many different levels of text 
representation (segmentation into sentences and tokens; recognition of 
basic entities such as organization names and person names; analysis of 
grammatical structure (parse trees); assignment of domain specific 
meaning to parse trees; etc.).

Until UIMA came along, there was no standard for how all these separate 
analysis steps could be integrated, and those of us trying to build 
end-to-end applications had to either write everything ourselves using a 
one-off proprietary design, or spend lots of time writing wrapper code 
to integrate existing components that didn't share the same underlying 
data model.

UIMA provides all the necessary ingredients to ease these issues. The 
data models used by individual components are represented by formal type 
systems; the components themselves implement (or are wrapped by 
implementations of) well-designed abstract interfaces; and tools are 
provided for creating aggregate analysis engines which integrate 
components in (possibly distributed) run-time configurations. The fact 
that IBM has made UIMA open source, and is searching for an appropriate 
open-source development venue, represents a significant opportunity. If 
things continue to move ahead, I expect that the students and staff 
working with me will be contributing cycles to the development effort.
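
To make the idea of pluggable components concrete, here is a rough sketch 
in plain Java. It is not the UIMA API: every name in it (Annotation, 
Document, Analyzer, TokenAnalyzer, CapitalizedNameAnalyzer, 
AggregateAnalyzer, PipelineSketch) is hypothetical and invented for 
illustration only. It shows two components that share one data model, 
implement one abstract interface, and are combined into an aggregate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical shared data model: a typed span over the document text.
final class Annotation {
    final String type;
    final int begin;
    final int end;

    Annotation(String type, int begin, int end) {
        this.type = type;
        this.begin = begin;
        this.end = end;
    }
}

// Hypothetical container that every component reads from and writes to.
final class Document {
    final String text;
    final List<Annotation> annotations = new ArrayList<>();

    Document(String text) {
        this.text = text;
    }

    String covered(Annotation a) {
        return text.substring(a.begin, a.end);
    }
}

// The single abstract interface all components implement (an assumption).
interface Analyzer {
    void process(Document doc);
}

// Adds one "Token" annotation per whitespace-delimited word.
final class TokenAnalyzer implements Analyzer {
    public void process(Document doc) {
        int i = 0;
        int n = doc.text.length();
        while (i < n) {
            while (i < n && Character.isWhitespace(doc.text.charAt(i))) i++;
            int start = i;
            while (i < n && !Character.isWhitespace(doc.text.charAt(i))) i++;
            if (i > start) doc.annotations.add(new Annotation("Token", start, i));
        }
    }
}

// Reads upstream "Token" annotations and marks capitalized ones as "Name".
final class CapitalizedNameAnalyzer implements Analyzer {
    public void process(Document doc) {
        List<Annotation> found = new ArrayList<>();
        for (Annotation a : doc.annotations) {
            if (a.type.equals("Token") && Character.isUpperCase(doc.text.charAt(a.begin))) {
                found.add(new Annotation("Name", a.begin, a.end));
            }
        }
        doc.annotations.addAll(found);
    }
}

// An aggregate is itself an Analyzer that runs its parts in order.
final class AggregateAnalyzer implements Analyzer {
    private final List<Analyzer> steps;

    AggregateAnalyzer(List<Analyzer> steps) {
        this.steps = steps;
    }

    public void process(Document doc) {
        for (Analyzer step : steps) step.process(doc);
    }
}

public class PipelineSketch {
    static Document run(String text) {
        Document doc = new Document(text);
        Analyzer pipeline = new AggregateAnalyzer(
                Arrays.asList(new TokenAnalyzer(), new CapitalizedNameAnalyzer()));
        pipeline.process(doc);
        return doc;
    }

    public static void main(String[] args) {
        Document doc = run("students at Carnegie Mellon use pipelines");
        for (Annotation a : doc.annotations) {
            System.out.println(a.type + ": " + doc.covered(a));
        }
    }
}
```

Because the second component depends only on the shared "Token" type and 
not on the tokenizer's code, either one could be swapped for a different 
implementation without any wrapper code.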

In addition to using UIMA on various R&D projects at CMU-LTI, we're also 
using UIMA in our Software Engineering course to teach architectural 
design for text analysis. Our students 
recently created the UIMA Component Repository, 
which we are promoting as a venue for sharing of completed components, 
type systems, and end-to-end solutions.

Eric Nyberg
Associate Professor
Language Technologies Institute
School of Computer Science
Carnegie Mellon University

Ian Holsman wrote:
> Hi,
> There has been some discussion around the UIMA proposal,
> we feel that all the issues forwarded have been addressed, and we
> would now like to officially propose UIMA to the Incubator for
> consideration.
> The proposal can be found in the Incubator wiki here:
> [ ] +1 Accept UIMA as an Incubator podling
> [ ]  0 Don't care
> [ ] -1 Reject this proposal for the following reason:
> -- 
> Ian Holsman
> -- what do parents know?
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
