lucenenet-dev mailing list archives

From "Peter Halasz (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENENET-551) Latin language Stemmer (feature request)
Date Sun, 25 Jan 2015 03:32:36 GMT

     [ https://issues.apache.org/jira/browse/LUCENENET-551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Halasz updated LUCENENET-551:
-----------------------------------
    Description: 
I would find a Latin language stemmer very helpful. The Schinke Latin stemming algorithm has
been converted to Snowball (see http://snowball.tartarus.org/otherapps/schinke/intro.html).
I have not worked out how to compile the Snowball source into C# (.cs) to try it.

There are currently five Romance languages supported (French, Spanish, Portuguese, Italian, Romanian),
so if the above doesn't work, I imagine one of these could be modified to support Latin.

I realise SF.Snowball is considered a contrib package rather than core, but Lucene.Net seems
to be the main place where Snowball stemmers are provided and maintained for C# / .Net.

Note that Snowball ports for other languages support Latin (using the Schinke contribution), such
as Ruby: https://github.com/aurelian/ruby-stemmer

  was:
I would find a Latin language stemmer very helpful. The Schinke Latin stemming algorithm has
been converted to Snowball (see http://snowball.tartarus.org/otherapps/schinke/intro.html).
I have not worked out how to compile the Snowball source into C# (.cs) to try it.

There are currently five Romance languages supported (French, Spanish, Portuguese, Italian, Romanian),
so if the above doesn't work, I imagine one of these could be modified to support Latin.

I realise SF.Snowball is considered a contrib package rather than core, but Lucene.Net seems
to be the main place where Snowball stemmers are provided and maintained for C# / .Net.


> Latin language Stemmer (feature request)
> ----------------------------------------
>
>                 Key: LUCENENET-551
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-551
>             Project: Lucene.Net
>          Issue Type: Improvement
>          Components: Lucene.Net Contrib
>            Reporter: Peter Halasz
>
> I would find a Latin language stemmer very helpful. The Schinke Latin stemming algorithm
> has been converted to Snowball (see http://snowball.tartarus.org/otherapps/schinke/intro.html).
> I have not worked out how to compile the Snowball source into C# (.cs) to try it.
> There are currently five Romance languages supported (French, Spanish, Portuguese, Italian,
> Romanian), so if the above doesn't work, I imagine one of these could be modified to support
> Latin.
> I realise SF.Snowball is considered a contrib package rather than core, but Lucene.Net
> seems to be the main place where Snowball stemmers are provided and maintained for C# / .Net.
> Note that Snowball ports for other languages support Latin (using the Schinke contribution),
> such as Ruby: https://github.com/aurelian/ruby-stemmer



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
