lucenenet-dev mailing list archives

From "Timothy Januario" <Timothy.Janua...@BDMetrics.com>
Subject RE: lucene Search
Date Tue, 23 Dec 2008 15:05:33 GMT
Tony,
What Neal said about wildcard searches is correct.  A wildcard at the beginning of your search
string will work, but it will be slow because Lucene needs to enumerate all of the terms in
the index.  To see why, think about using a phone book.  A phone book is ordered by last name,
then first name, so if you search for somebody by last name you can find them quickly (and a
computer obviously even quicker).  If you search for somebody by first name only, you have to
look at every entry in the phone book to find all valid entries.  This brute-force approach is
going to give you slower results.  If your index is relatively small, this may not be so bad,
but if you have a large index that needs to scale to many users, it is probably not the best
approach.  Regardless, if you do take this approach, I would make it the exception rather than
the rule: put some logic into your search that lets the user explicitly choose to look for all
instances containing a substring, rather than always adding an "*" to the front and back of
their search string.
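
The phone book analogy can be sketched in a few lines of Python.  This is only an illustration
of the idea, not Lucene's actual implementation: the term list and both helper functions are
made up for the example.  A prefix search can seek into the sorted terms and stop early, while
a "contains" search has to look at every term.

```python
import bisect

# Hypothetical sorted term dictionary, standing in for the terms of an index.
TERMS = sorted(["man", "manual", "many", "seaman", "tony", "woman"])

def prefix_search(terms, prefix):
    """Return terms starting with `prefix` by seeking into the sorted list."""
    i = bisect.bisect_left(terms, prefix)  # binary search for the first candidate
    matches = []
    while i < len(terms) and terms[i].startswith(prefix):
        matches.append(terms[i])
        i += 1  # sorted order: once a term fails to match, no later term can
    return matches

def contains_search(terms, fragment):
    """Return terms containing `fragment`; every term has to be examined."""
    return [term for term in terms if fragment in term]

print(prefix_search(TERMS, "man"))
print(contains_search(TERMS, "man"))
```

The first call touches only the matching terms plus one more; the second walks the whole list,
which is the cost a leading wildcard pays on a large index.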

On a further note, if you are using the QueryParser, you can already do this type of search
by adding the wildcard to the search string, since the QueryParser will detect it and create
the appropriate query.
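
As a rough sketch of what "create the appropriate query" means, the Python below classifies a
single search term the way a parser might.  The function name, the returned labels, and the
`allow_leading_wildcard` flag are assumptions for illustration and are not the Lucene.Net API.
The flag is there because, as far as I know, the QueryParser rejects a leading "*" or "?" unless
leading wildcards are explicitly enabled, so check that setting in your version.

```python
WILDCARDS = ("*", "?")

def classify_query(text, allow_leading_wildcard=False):
    """Return which kind of query a single search term would map to (illustrative only)."""
    if not any(w in text for w in WILDCARDS):
        return "term"      # plain word: exact term lookup
    if text[0] in WILDCARDS and not allow_leading_wildcard:
        raise ValueError("leading wildcard not allowed: %r" % text)
    if text.endswith("*") and not any(w in text[:-1] for w in WILDCARDS):
        return "prefix"    # only a trailing '*': cheap prefix lookup
    return "wildcard"      # anything else: general (slower) wildcard match

print(classify_query("man"))
print(classify_query("man*"))
print(classify_query("*man*", allow_leading_wildcard=True))
```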
-tim

-----Original Message-----
From: tony njedeh [mailto:njedeh@yahoo.com] 
Sent: Monday, December 22, 2008 8:09 PM
To: lucene-net-dev@incubator.apache.org
Subject: RE: lucene Search

Thanks for the replies, guys. I will get the book.
 
Anthony

--- On Mon, 12/22/08, Granroth, Neal V. <neal.granroth@thermofisher.com> wrote:

From: Granroth, Neal V. <neal.granroth@thermofisher.com>
Subject: RE: lucene Search
To: "lucene-net-dev@incubator.apache.org" <lucene-net-dev@incubator.apache.org>, "njedeh@yahoo.com"
<njedeh@yahoo.com>
Date: Monday, December 22, 2008, 6:46 PM

Splitting words or other meta-data into searchable terms is the best approach,
taking full advantage of the Lucene data structure and offering the best search
speed.  But it is not always possible to know, ahead of time, how the words or
meta-data should be parsed and split, or from what terms the end-user will want
to construct their search.  In those cases wildcards provide a slower, but
workable solution.

I too highly recommend "Lucene in Action", though its Java-based
unit-test examples might be confusing for less experienced developers.

-- Neal


-----Original Message-----
From: Jokin Cuadrado [mailto:jokin.c@gmail.com]
Sent: Monday, December 22, 2008 4:46 PM
To: lucene-net-dev@incubator.apache.org; njedeh@yahoo.com
Subject: Re: lucene Search

Lucene is a full-text search engine, so it indexes whole words. If
you want to search within words, you must index all the possible search
terms with the proper analyzer.
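
One way to picture "index all the possible search terms" is to store every suffix of each
word, so that a search inside a word becomes an ordinary prefix lookup.  The Python below is a
minimal sketch of that idea only; the function names are hypothetical, and in Lucene this work
would be done by an analyzer at indexing time rather than by code like this.

```python
import bisect

def suffixes(word):
    """All suffixes of the lowercased word: 'Woman' -> woman, oman, man, an, n."""
    word = word.lower()
    return [word[i:] for i in range(len(word))]

def build_suffix_index(words):
    """Sorted list of (suffix, original word) pairs."""
    return sorted({(s, w) for w in words for s in suffixes(w)})

def search_within(index, fragment):
    """Find words containing `fragment` via a prefix lookup on the stored suffixes."""
    fragment = fragment.lower()
    i = bisect.bisect_left(index, (fragment,))  # seek to the first suffix >= fragment
    found = set()
    while i < len(index) and index[i][0].startswith(fragment):
        found.add(index[i][1])
        i += 1
    return sorted(found)

index = build_suffix_index(["Woman", "SeaMan", "Manual", "Tony"])
print(search_within(index, "man"))
```

The trade-off is index size: each word adds as many entries as it has letters, in exchange for
lookups that no longer need to scan every term.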

I suggest you take a look at the "Lucene in Action" book to
understand how full-text index searches work.

On Mon, Dec 22, 2008 at 8:56 PM, tony njedeh <njedeh@yahoo.com> wrote:
> I know everyone is into the more complex stages of lucene, but I was
> wondering if anyone could help me with an answer to this.
>
> I am presently using lucene in an asp.net 2.0 framework. If I do a search
> for the word man, it finds everything that starts with man but it doesn't
> find anything that ends with Man, like Woman, or SeaMan. Can anyone tell me how
> to get lucene to search within words? If it's not possible for lucene to do this
> search, please can someone let me know.
>
> Thank you for your time
>
>
