lucenenet-dev mailing list archives

From "Itamar Syn-Hershko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENENET-551) Latin language Stemmer (feature request)
Date Mon, 02 Feb 2015 16:58:35 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301457#comment-14301457 ]

Itamar Syn-Hershko commented on LUCENENET-551:
----------------------------------------------

We are currently in the process of porting Lucene 4.8.0. Once we are done we will have plenty
of new languages supported:

https://github.com/apache/lucene-solr/tree/lucene_solr_4_8_0/lucene/analysis/common/src/java/org/apache/lucene/analysis
https://github.com/apache/lucene-solr/tree/lucene_solr_4_8_0/lucene/analysis/common/src/java/org/tartarus/snowball/ext

However, the Latin stemmer does not appear to be among them. Once the port reaches that
stage, I will look into it.

> Latin language Stemmer (feature request)
> ----------------------------------------
>
>                 Key: LUCENENET-551
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-551
>             Project: Lucene.Net
>          Issue Type: Improvement
>          Components: Lucene.Net Contrib
>            Reporter: Peter Halasz
>
> I would find a Latin language stemmer very helpful. The Schinke Latin stemming algorithm
> has been converted to Snowball here: http://snowball.tartarus.org/otherapps/schinke/intro.html
> I have not worked out how to compile Snowball into .cs to try it.
> There are currently 5 Romance languages supported (French, Spanish, Portuguese, Italian,
> Romanian), so if the above doesn't work, I imagine one of these could be modified to support
> Latin.
> I realise SF.Snowball is considered a contrib package rather than core, but Lucene.Net
> seems to be the main place where Snowball stemmers are provided and maintained for C# / .NET.
> Note, other language ports of Snowball support Latin (using the Schinke contribution),
> such as Ruby: https://github.com/aurelian/ruby-stemmer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
