lucenenet-dev mailing list archives

From "Timothy Januario" <>
Subject RE: lucene Search
Date Tue, 23 Dec 2008 17:01:46 GMT
The QueryParser has a SetAllowLeadingWildcard method. Call it with True before you call Parse, and the leading "*" will be accepted.

-----Original Message-----
From: tony njedeh [] 
Sent: Tuesday, December 23, 2008 11:55 AM
Subject: RE: lucene Search

Hi Timothy,
We wanted to give the users the option of searching for the words before and after. I know
it would be a slower search, but the users insisted on that type of search.
Presently I am using a query parser. As I mentioned to Neal and Jokin, I tried adding a * before
the word and I got this error:
Lexical error at line 1, column 1.  Encountered: "*" (42), after : "" 
This is how I call the Lucene search:
Dim searcher As IndexSearcher = New IndexSearcher(path)
Dim queryParser As QueryParser = New QueryParser("texts", New StandardAnalyzer)
Dim query As Query = queryParser.Parse("*man")
Dim hits As Hits = searcher.Search(query)
If I put the * after the word, it works perfectly. I got the Lucene in Action book, but it's in
Java and that's not my strongest suit, but I will try to figure it out.

--- On Tue, 12/23/08, Timothy Januario <> wrote:

From: Timothy Januario <>
Subject: RE: lucene Search
Date: Tuesday, December 23, 2008, 10:05 AM

What Neal said is correct about the Wildcard search.  It will work but will be
slow if you put a wildcard at the beginning of your search string because it
will need to enumerate all of the terms in the index.  To give you an example of
why this is, think about using a phone book.  The index of a phone book is on
last name - first name so if you search for somebody by last name you would be
able to find them quickly (a computer obviously even quicker), but if you
search for somebody by first name only, you would have to look at every entry in
the phone book in order to find all valid entries.  Obviously, this brute force
approach is going to give you slower results.  If your index is relatively
small, this may not be so bad, but if you have a large index that needs to scale
to many users, this is probably not the best approach.  Regardless, if you are
to take this approach to performing this type of search, I would make it the
exception rather than the rule by putting some type of logic into your search
which would allow the user to explicitly select that they would like to look for
all instances containing a substring rather than just appending an "*"
to the front and back of their search string.  
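
The phone book analogy can be made concrete with a small sketch. This is plain Python, not Lucene.Net code, and it assumes the term dictionary behaves like a sorted list of strings; the function names and sample terms are made up for illustration. A trailing wildcard can jump straight to the first candidate with a binary search, while a leading wildcard has to look at every term.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Terms starting with `prefix` (like man*).

    Binary-search to the first possible match, then walk forward
    until a term no longer starts with the prefix.
    """
    i = bisect_left(sorted_terms, prefix)
    matches = []
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


def suffix_matches(sorted_terms, suffix):
    """Terms ending with `suffix` (like *man).

    The sort order gives no help here, so every term is checked.
    """
    return [term for term in sorted_terms if term.endswith(suffix)]


# Hypothetical term dictionary, kept in sorted order.
terms = sorted(["manager", "mango", "human", "seaman", "woman", "zebra", "man"])

print(prefix_matches(terms, "man"))  # ['man', 'manager', 'mango']
print(suffix_matches(terms, "man"))  # ['human', 'man', 'seaman', 'woman']
```

The prefix lookup touches only the matching terms plus one extra, whereas the suffix lookup touches all of them, which is why the cost of a leading wildcard grows with the size of the index.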

On a further note, if you are using the queryparser, you can already do this
type of search by adding the wildcard since the queryparser will find them and
create the appropriate queries.

-----Original Message-----
From: tony njedeh [] 
Sent: Monday, December 22, 2008 8:09 PM
Subject: RE: lucene Search

Thanks for the replies, guys. I will get the book.

--- On Mon, 12/22/08, Granroth, Neal V. <>

From: Granroth, Neal V. <>
Subject: RE: lucene Search
Date: Monday, December 22, 2008, 6:46 PM

Splitting words or other meta-data into searchable terms is the best approach,
taking full advantage of the Lucene data structure and offering the best search
speed.  But it is not always possible to know, ahead of time, how the words or
meta-data should be parsed and split, or from what terms the end-user will want
to construct their search.  In those cases wildcards provide a slower, but
workable solution.

I too highly recommend "Lucene in Action", though its Java-based
unit-test examples might be confusing for less experienced developers.

-- Neal

-----Original Message-----
From: Jokin Cuadrado []
Sent: Monday, December 22, 2008 4:46 PM
Subject: Re: lucene Search

Lucene is a full-text search engine, so it indexes whole words. If
you want to search within words, you must index all the possible search
terms with the proper analyzer.
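
One way to read "index all the possible search terms" is to index every suffix of every word, so that a search inside a word becomes an ordinary prefix lookup. The sketch below is plain Python rather than a Lucene.Net analyzer, and the helper names (`build_suffix_index`, `contains_search`) are hypothetical; it only illustrates the idea and is not necessarily the approach meant above.

```python
from bisect import bisect_left
from collections import defaultdict


def build_suffix_index(words):
    """Map every lowercased suffix of every word back to the original words."""
    suffix_to_words = defaultdict(set)
    for word in words:
        lowered = word.lower()
        for i in range(len(lowered)):
            suffix_to_words[lowered[i:]].add(word)
    # Keep the suffixes sorted so they can be binary-searched.
    return sorted(suffix_to_words), suffix_to_words


def contains_search(index, fragment):
    """Words containing `fragment`, found by a prefix lookup over the suffixes."""
    sorted_suffixes, suffix_to_words = index
    fragment = fragment.lower()
    found = set()
    i = bisect_left(sorted_suffixes, fragment)
    while i < len(sorted_suffixes) and sorted_suffixes[i].startswith(fragment):
        found |= suffix_to_words[sorted_suffixes[i]]
        i += 1
    return sorted(found)


index = build_suffix_index(["Woman", "SeaMan", "manager", "zebra"])
print(contains_search(index, "man"))  # ['SeaMan', 'Woman', 'manager']
```

The trade-off is index size: a word of length n contributes n entries, so this buys search speed at the cost of a larger index and slower indexing.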

I suggest you take a look at the Lucene in Action book to
understand how full-text index searches work.

On Mon, Dec 22, 2008 at 8:56 PM, tony njedeh <> wrote:
> I know everyone is into the more complex stages of Lucene, but I was
> wondering if anyone could help me with an answer to this.
> I am presently using Lucene in an 2.0 framework. If I do a search
> for the word man, it finds everything that starts with man, but it doesn't
> find anything that ends with man, like Woman or SeaMan. Can anyone tell me how
> to get Lucene to search within words? If it's not possible for Lucene to do this
> search, could someone please let me know?
> Thank you for your time
