lucenenet-dev mailing list archives

From "Andy Berryman" <>
Subject Looking for some help with Programmatic vs Parse Query Building
Date Wed, 14 Feb 2007 21:50:30 GMT
In my application, I was previously building queries as a string, and I'm
now having to convert over to the API because I need to use the
WildcardQuery.  I'm running into a few searching issues and they all seem to
center around the fact that the field is of *TEXT* type, which means it is
Analyzed when indexed.

Assume that my field name is *Title* and it is of *TEXT* type.  Also assume
that I am using the StandardAnalyzer.

I have a document stored in the index that had the original text of "I was
on the cat-walk".  During the index process, I know that the stop words are
removed, the text is lowercased, and certain characters are stripped.  So
basically, the end result was that the terms ... "i", "cat", and "walk" ...
were stored in the index.

My previous code was doing the simplest case to get the Query by just
building ... *Title:"I was on the cat-walk"* ... and passing that into the
Parse method.  Since the analyzer is part of that method call, it was doing
all of the necessary stripping within the query for me and thus the search
was working just fine.  It was returning the Query ... *Title:"i cat walk"*.

With the new code, I'm now building the query like this ... TermQuery tq =
new TermQuery(new Term("Title", "I was on the cat-walk")) ... And this is
NOT working.  The reason is that no analysis is being done on the string
being searched, so the whole string is looked up as a single term.  I can
certainly write a loop pretty simply to do the stripping of the stop words,
but I don't really know what to do about the special characters.
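
To make the mismatch concrete, here is a rough, self-contained sketch in
plain Java with no Lucene dependency.  The class name AnalysisSketch, the
analyze helper, and the three-word stop list are all hypothetical stand-ins
chosen for illustration; they only approximate what the StandardAnalyzer does
and are not its actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnalysisSketch {
    // Hypothetical, abbreviated stop list; a real analyzer's list is longer.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("was", "on", "the"));

    // Toy stand-in for analysis: lowercase the text, split it on anything
    // that is not a letter or digit, and drop stop words.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        String lowered = text.toLowerCase(Locale.ROOT);
        for (String piece : lowered.split("[^a-z0-9]+")) {
            if (!piece.isEmpty() && !STOP_WORDS.contains(piece)) {
                terms.add(piece);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Terms that end up "in the index" for the example document.
        Set<String> indexedTerms =
                new HashSet<>(analyze("I was on the cat-walk"));

        // Unanalyzed lookup: the raw string is treated as one term.
        System.out.println(indexedTerms.contains("cat-walk"));             // false

        // Analyzed lookup: the query text goes through the same steps first.
        System.out.println(indexedTerms.containsAll(analyze("cat-walk"))); // true
    }
}
```

The sketch ignores term positions, so it only shows why the terms line up
once the query text is analyzed the same way as the indexed text; it does not
model phrase matching.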

The main problem I'm looking into is that my end-users are unable to search
for just "cat-walk" and get results.  But if they search for "cat walk",
they get the result you would expect.

Hopefully someone out there has tackled this issue before and can show me an
example of how to do this without having to re-invent the wheel.

