incubator-general mailing list archives

From Jörn Kottmann <>
Subject Re: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
Date Tue, 22 Jan 2013 09:24:52 GMT
On 01/22/2013 12:48 AM, Benson Margulies wrote:
> On Mon, Jan 21, 2013 at 6:05 PM, Masanz, James J.<>  wrote:
>> >
>> >Many of the models are derived from data that is not publicly released. But those *models* have been contributed to Apache cTAKES.
>> >
>> >This is the primary mechanism for distribution of these models.
> This discussion has come up before with SpamAssassin and OpenNLP.
> The exception for SpamAssassin, as I have understood it, is that there
> is a place where all members of the community have access to the
> source texts behind the models. At OpenNLP, again according to my
> memory, the solution is to treat the models as not part of the release
> at all.

Yes, that's right, OpenNLP does not release any models at Apache. There is
a page over at SourceForge where people can download them, and to help
people who need to train models themselves we release the corpus-specific
code as part of the OpenNLP release.

> I fear that cTakes needs to have an interaction with LEGAL to adopt
> the SpamAssassin model, since, from a strict constructionist
> perspective, the source of the models is precisely what you cannot
> release.

As far as I understand it, part of the data is confidential and can't be
shared with the community at all. Is that right?

