lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Christopher Currens (Closed) (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (LUCENENET-466) optimisation for the GermanStemmer.vb‏
Date Fri, 23 Mar 2012 02:18:22 GMT

     [ https://issues.apache.org/jira/browse/LUCENENET-466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Christopher Currens closed LUCENENET-466.
-----------------------------------------

    Resolution: Fixed

I've added a new stemmer in trunk called GermanDIN2Stemmer.  You can specify GermanAnalyzer
use it via some new constructors that take a bool indicating if you want to use the DIN-5007-2
stemmer instead of the default DIN-5007-1 stemmer.

This won't break compatibility with users who want to use the old default DIN1 stemmer, but
enables anyone who wants to use the other.
                
> optimisation for the GermanStemmer.vb‏
> --------------------------------------
>
>                 Key: LUCENENET-466
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-466
>             Project: Lucene.Net
>          Issue Type: Improvement
>          Components: Lucene.Net Contrib
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3
>            Reporter: Prescott Nasser
>            Priority: Minor
>             Fix For: Lucene.Net 3.0.3
>
>
> I have a little optimisation for the GermanStemmer.vb (in 
> Contrib.Analyzers) class. At the moment the function "Substitute" 
> converts the german "Umlaute" "ä" in "a", "ö" in"o" and "ü" in "u". This 
> is not the correct german translation. They must be converted to "ae", 
> "oe" and "ue". So I can write the name "Björn" or "Bjoern" but not 
> "Bjorn". With this optimization a user can search for "Björn" and also 
> find "Bjoern".
>  
> Here is the optimized code snippet:
>  
> else if ( buffer[c] == 'ä' )
>  {
>  buffer[c] = 'a';
>  buffer.Insert(c + 1, 'e');
>  }
>  else if ( buffer[c] == 'ö' )
>  {
>  buffer[c] = 'o';
>  buffer.Insert(c + 1,'e');
>  }
>  else if ( buffer[c] == 'ü' )
>  {
>  buffer[c] = 'u';
>  buffer.Insert(c + 1,'e');
>  }
>  
> Thank You
> Björn

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

Mime
View raw message