lucenenet-dev mailing list archives

From "DIGY" <digyd...@gmail.com>
Subject RE: PrefixQuery's rewrite method does not work as expected when used under .Net 2.0
Date Thu, 25 Oct 2007 18:40:29 GMT
Hi Christopher,

Can you open an issue on JIRA and attach your patch to it?

The test case below confirms your finding.

        void WriteQuery()
        {
            string sToIndex = "paraaminosyre";
            string sToSearch = "para";

            Lucene.Net.Store.RAMDirectory ramDir =
                new Lucene.Net.Store.RAMDirectory();

            //INDEX
            Lucene.Net.Index.IndexWriter writer =
                new Lucene.Net.Index.IndexWriter(ramDir,
                    new Lucene.Net.Analysis.Standard.StandardAnalyzer());
            Lucene.Net.Documents.Document doc =
                new Lucene.Net.Documents.Document();
            doc.Add(new Lucene.Net.Documents.Field("Field1", sToIndex,
                Lucene.Net.Documents.Field.Store.YES,
                Lucene.Net.Documents.Field.Index.TOKENIZED));
            writer.AddDocument(doc);
            writer.Close();

            //SEARCH
            Lucene.Net.Index.IndexReader reader =
                Lucene.Net.Index.IndexReader.Open(ramDir);
            Lucene.Net.Search.PrefixQuery pq =
                new Lucene.Net.Search.PrefixQuery(
                    new Lucene.Net.Index.Term("Field1", sToSearch));
            Lucene.Net.Search.Query q = pq.Rewrite(reader);
            Console.WriteLine("#" + q.ToString() + "#");
            reader.Close();
        }

        public void Test()
        {
            System.Threading.Thread.CurrentThread.CurrentCulture =
                System.Globalization.CultureInfo.GetCultureInfo("en-us");
            WriteQuery();

            System.Threading.Thread.CurrentThread.CurrentCulture =
                System.Globalization.CultureInfo.GetCultureInfo("nn-no");
            WriteQuery();
        }

Output:
#Field1:paraaminosyre#
##


Regards,

DIGY

-----Original Message-----
From: christopher.kolstad@gmail.com [mailto:christopher.kolstad@gmail.com]
On Behalf Of Christopher Kolstad
Sent: Thursday, October 25, 2007 12:02 PM
To: lucene-net-dev@incubator.apache.org
Subject: PrefixQuery's rewrite method does not work as expected when used
under .Net 2.0

Hi.

I'm currently using the latest version of Lucene.Net 2.1 from SVN. When
compiling it for .Net 2.0 I ran into an interesting "feature": query
expansion stops earlier than expected. After a lot of debugging, I found the
culprit to be PrefixQuery's Rewrite method, which uses
string.StartsWith(string).
In .Net 2.0 this can break on any box whose culture settings are not
English, because StartsWith(string) is culture-sensitive: it behaves like the
new overload StartsWith(string, StringComparison) called with
StringComparison.CurrentCulture.

Norwegian culture, for instance, treats 'aa' as equivalent to the single
letter 'å'.
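
To see the difference outside Lucene, here is a minimal sketch that runs the same prefix test under explicit cultures and then ordinally. The class and method names are made up for illustration. The culture-sensitive results depend on the runtime and on the culture data installed on the machine, so no output is claimed for them; only the ordinal result is independent of culture.

```csharp
using System;
using System.Globalization;

public static class PrefixCheckDemo
{
    // Culture-sensitive prefix test under an explicitly named culture,
    // so the result does not depend on the current thread's culture.
    public static bool StartsWithCulture(string text, string prefix, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return text.StartsWith(prefix, false, culture);
    }

    // Ordinal prefix test: compares raw UTF-16 code units, no culture rules.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }

    public static void Main()
    {
        foreach (string name in new string[] { "en-US", "nn-NO" })
        {
            Console.WriteLine(name + ": "
                + StartsWithCulture("paraaminosyre", "para", name));
        }
        Console.WriteLine("ordinal: "
            + StartsWithOrdinal("paraaminosyre", "para"));
    }
}
```

If the Norwegian line prints False while the ordinal line prints True, the box shows the behaviour described here.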

So when my index contains para-orange, paraaminosyre, para.... and I run the
PrefixQuery para*, the if (term.Text().StartsWith(prefixText)) test in this
method fails, because "paraaminosyre".StartsWith("para") returns false.

The fix for me was to change the line in PrefixQuery.cs::Rewrite() from

if (term != null && term.Text().StartsWith(prefixText)
    && term.Field() == prefixField)

to

if (term != null && term.Text().StartsWith(prefixText, StringComparison.Ordinal)
    && term.Field() == prefixField)
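
Why one false negative cuts the whole expansion short can be sketched with a simplified model of the loop. This is not the Lucene.Net code: Expand and the term "parabol" are hypothetical, and the sketch assumes the loop walks the terms in ordinal sort order and stops at the first term that fails the prefix test.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixExpansionSketch
{
    // Hypothetical stand-in for the term enumeration: skips terms that sort
    // before the prefix, collects matches, and stops at the first non-match.
    public static List<string> Expand(IList<string> sortedTerms, string prefix,
                                      StringComparison comparison)
    {
        List<string> matches = new List<string>();
        foreach (string term in sortedTerms)
        {
            if (string.CompareOrdinal(term, prefix) < 0)
                continue;                       // still before the prefix range
            if (!term.StartsWith(prefix, comparison))
                break;                          // first miss ends the expansion
            matches.Add(term);
        }
        return matches;
    }

    public static void Main()
    {
        // Already in ordinal order: '-' (0x2D) < 'a' (0x61) < 'b' (0x62).
        string[] terms = { "para-orange", "paraaminosyre", "parabol" };

        List<string> ordinal = Expand(terms, "para", StringComparison.Ordinal);
        Console.WriteLine("ordinal: " + string.Join(", ", ordinal.ToArray()));

        // Result depends on the current thread's culture.
        List<string> cultural = Expand(terms, "para", StringComparison.CurrentCulture);
        Console.WriteLine("current culture: " + string.Join(", ", cultural.ToArray()));
    }
}
```

With StringComparison.Ordinal all three terms are collected. Under a culture where the test on paraaminosyre fails, the loop breaks there and never reaches the terms that sort after it.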

This is a new "feature" in .Net 2.0 and did not affect my .Net 1.1 projects,
which still use the .Net 1.1 library.


Conditional compilation might be required.
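
One possible way to avoid conditional compilation, sketched here as an assumption and not part of the patch above: String.CompareOrdinal(string, int, string, int, int) should be available on both framework versions, so a small helper built on it could replace the StartsWith call. The class and method names are hypothetical, and I have not tested this against a .Net 1.1 build.

```csharp
using System;

public static class OrdinalPrefix
{
    // Ordinal prefix test that does not need the StringComparison overload.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        if (text == null || prefix == null)
            return false;
        if (prefix.Length > text.Length)
            return false;                       // a longer prefix can never match
        // Compare the first prefix.Length characters of both strings
        // by character value, ignoring culture rules.
        return String.CompareOrdinal(text, 0, prefix, 0, prefix.Length) == 0;
    }

    public static void Main()
    {
        Console.WriteLine(StartsWithOrdinal("paraaminosyre", "para"));
        Console.WriteLine(StartsWithOrdinal("par", "para"));
    }
}
```

The test in Rewrite would then read OrdinalPrefix.StartsWithOrdinal(term.Text(), prefixText) in place of term.Text().StartsWith(prefixText).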
-- 
Regards,
Christopher Kolstad
E-mail: chriswk@ifi.uio.no (University)
christopher.kolstad@gmail.com (Home)
chriswk@ovitas.no (Job)

