lucenenet-dev mailing list archives

From "Granroth, Neal V." <neal.granr...@thermofisher.com>
Subject RE: Cannot Escape Special charectors Search with Lucene.Net 2.0
Date Fri, 17 Dec 2010 16:05:57 GMT

Robert's correct: the StandardAnalyzer splits the input text at the "&&" characters,
so your index will not contain them.  This simple example shows it:

// Run a sample string through StandardAnalyzer and print each token with its offsets.
StandardAnalyzer aa = new StandardAnalyzer();

System.IO.StringReader srs = new System.IO.StringReader("aaa bbb test&&test ccc ddd");

Lucene.Net.Analysis.TokenStream ts = aa.TokenStream(srs);

// Next() returns null once the stream is exhausted.
Lucene.Net.Analysis.Token tk;
while( (tk = ts.Next()) != null )
{
   System.Console.WriteLine(String.Format("Token: \"{0}\": S:{1}, E:{2}",
      tk.TermText(),tk.StartOffset(),tk.EndOffset()));
}

The output looks like this:
Token: "aaa": S:0, E:3
Token: "bbb": S:4, E:7
Token: "test": S:8, E:12
Token: "test": S:14, E:18
Token: "ccc": S:19, E:22
Token: "ddd": S:23, E:26

You can see that the "&&" characters were treated as separators, so two "test"
tokens were emitted rather than the single "test&&test" you expected.
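
For comparison, here is a minimal sketch of the same test using WhitespaceAnalyzer, which
breaks only on whitespace and so should keep "test&&test" as a single token. Swapping the
analyzer is an assumption on my part, not something proposed earlier in this thread, and
the sketch is written against the 2.x-era API (Next(), TermText()) with the two-argument
TokenStream overload. The class name is made up, and "content" is the field name from the
original query.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;

class WhitespaceTokenDemo   // hypothetical class name, for illustration only
{
    static void Main()
    {
        // Assumption: WhitespaceAnalyzer in place of StandardAnalyzer.
        // It splits on whitespace only, so punctuation stays inside the token.
        Analyzer analyzer = new WhitespaceAnalyzer();
        TextReader reader = new StringReader("aaa bbb test&&test ccc ddd");

        // "content" is the field name used in the original query.
        TokenStream ts = analyzer.TokenStream("content", reader);

        // Next() returns null once the stream is exhausted.
        Token tk;
        while ((tk = ts.Next()) != null)
        {
            Console.WriteLine("Token: \"{0}\"", tk.TermText());
        }
    }
}
```

If you go this route, the same analyzer would also have to be used at query time, and
since "&&" is an operator in the query syntax it would likely need to be escaped in the
query string as well. Note that WhitespaceAnalyzer does not lowercase or strip other
punctuation, so it changes matching behavior for the whole field.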


- Neal

-----Original Message-----
From: Robert Jordan [mailto:robertj@gmx.net] 
Sent: Friday, December 17, 2010 6:25 AM
To: lucene-net-dev@incubator.apache.org
Subject: Re: Cannot Escape Special charectors Search with Lucene.Net 2.0

On 17.12.2010 12:29, abhilash ramachandran wrote:
> q = new global::Lucene.Net.QueryParsers.QueryParser("content", new
> StandardAnalyzer()).Parse(query);

I believe the issue has nothing to do with your query
syntax. StandardAnalyzer is skipping chars like "&" during
the indexing process, so you can't search for them.

Robert

