lucenenet-dev mailing list archives

From Sergey Shiroky <>
Subject Lucene.Net search speed
Date Sat, 19 Sep 2015 14:36:29 GMT

Sorry to bother you, but I hope to get some help from you as the Lucene.Net developers.

Our application currently uses Lucene.Net 3.0.3 to index and search about 2,500,000 items.
Each entity contains 27 searchable fields, each of which is added to the index like this:
new Field(key, value, Field.Store.YES, Field.Index.ANALYZED)

Now we have two search options:

1.       Search only by 4 fields using fuzzy search

2.       Search by 4-27 fields using exact search

We have a search service that automatically searches for about 53,000 people every week, with
names such as "Bob Huston", "Sara Conor", "Sujan Hong Uin Ho", etc.
Option 1) is slow: searcher.Search takes 4-8 seconds on average, and this is our major
problem.

Search sample code:

    var index = FSDirectory.Open(indexPath);
    // Open the searcher in read-only mode.
    var searcher = new IndexSearcher(index, true);
    // Empty stop-word set, so no terms are dropped as stop words.
    this.analyzer = new StandardAnalyzer(Version.LUCENE_30, new HashSet<string>());
    // The original line was cut off after queryFields; the analyzer argument is assumed.
    var queryParser = new MultiFieldQueryParser(Version.LUCENE_30, queryFields, this.analyzer);
    queryParser.AllowLeadingWildcard = false;
    Query query = queryParser.Parse(token);
    var results = searcher.Search(query, NumberOfResults); // NumberOfResults == 500

Our fuzzy search query to find "bob cong hong" in 4 fields:

(((PersonFirstName:bob~0.6) OR (PersonLastName:bob~0.6) OR (PersonAliases:bob~0.6) OR (PersonAlternativeSpellings:bob~0.6))
AND ((PersonFirstName:cong~0.6) OR (PersonLastName:cong~0.6) OR (PersonAliases:cong~0.6) OR
(PersonAlternativeSpellings:cong~0.6)) AND ((PersonFirstName:hong~0.6) OR (PersonLastName:hong~0.6)
OR (PersonAliases:hong~0.6) OR (PersonAlternativeSpellings:hong~0.6)))
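
To make the ~0.6 threshold concrete, here is a minimal sketch of how the classic (pre-4.0) fuzzy similarity is understood to be computed when the prefix length is 0: a candidate term matches when 1 - editDistance / min(query term length, candidate length) is greater than the minimum similarity. This is standalone C# that does not call Lucene.Net; the class and method names (FuzzySimilarity, EditDistance, Similarity, Matches) are made up for illustration.

```csharp
using System;

public static class FuzzySimilarity
{
    // Classic dynamic-programming Levenshtein edit distance (two rolling rows).
    public static int EditDistance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int insertOrDelete = Math.Min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.Min(insertOrDelete, prev[j - 1] + cost);
            }
            var tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Similarity with prefix length 0 (assumption): 1 - distance / shorter length.
    public static double Similarity(string queryTerm, string indexedTerm)
    {
        int shorter = Math.Min(queryTerm.Length, indexedTerm.Length);
        if (shorter == 0) return 0.0;
        return 1.0 - (double)EditDistance(queryTerm, indexedTerm) / shorter;
    }

    // A term matches only if its similarity is strictly above the threshold.
    public static bool Matches(string queryTerm, string indexedTerm, double minSimilarity)
    {
        return Similarity(queryTerm, indexedTerm) > minSimilarity;
    }
}
```

Under this assumption, Matches("bob", "rob", 0.6) is true (one edit over three characters gives about 0.67), while "hong" against "huang" needs two edits, scores 0.5, and does not match. So for names of 3-4 letters, ~0.6 allows a single edit.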

Improvements so far:
1.    We combined these 4 fields into 1 search field
2.    We use a single IndexSearcher in the service instead of opening one for every search request
3.    We set MergeFactor=2
Together, these improvements give a speedup of about 30-40%.
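
The first two improvements can be sketched roughly as follows. This is a rough sketch against Lucene.Net 3.0.3, not our production code: the combined field name (PersonAllNames), the index path, and the class and method names are hypothetical, and the fuzzy query is built directly from FuzzyQuery and BooleanQuery instead of going through the query parser.

```csharp
using System;
using System.IO;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

public static class PersonSearch
{
    // Hypothetical name of the single field that replaces the 4 name fields.
    private const string CombinedField = "PersonAllNames";

    // Hypothetical index location.
    private const string IndexPath = @"C:\index";

    // One read-only IndexSearcher, created on first use and shared by all requests.
    private static readonly Lazy<IndexSearcher> Searcher = new Lazy<IndexSearcher>(
        () => new IndexSearcher(FSDirectory.Open(new DirectoryInfo(IndexPath)), true));

    // Index time: call once per name value (first name, last name, alias, ...)
    // so that every value ends up in the same combined field.
    public static void AddCombinedName(Document doc, string value)
    {
        doc.Add(new Field(CombinedField, value, Field.Store.NO, Field.Index.ANALYZED));
    }

    // Search time: every token must fuzzily match some term in the combined field.
    public static TopDocs FuzzySearch(string[] tokens, int maxResults)
    {
        var query = new BooleanQuery();
        foreach (var token in tokens)
        {
            // FuzzyQuery bypasses the analyzer, so lowercase the token by hand
            // to match what StandardAnalyzer put into the index.
            var term = new Term(CombinedField, token.ToLowerInvariant());
            query.Add(new FuzzyQuery(term, 0.6f), Occur.MUST);
        }
        return Searcher.Value.Search(query, maxResults);
    }
}
```

With this shape, "bob cong hong" becomes three fuzzy clauses against one field instead of twelve clauses against four fields.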

Following this article, we've made most of the possible optimizations:
*         The index is placed on a SAS drive, which is quite fast.
*         We have enough RAM.
*         MergeFactor is set to 2.
*         We tried moving the index to a RAMDirectory, but the test results aren't stable;
sometimes the speed is the same.

We found an explanation of why performance is bad with fuzzy search in Lucene (every request
takes 6-8 seconds):

Do you know whether this issue is fixed in the latest releases of Lucene, and can we use your
not-yet-stable Lucene.Net 4.8 version to solve our problem?

Stackoverflow topic<>

Thank you.

Phone (cell): +375 29 881 44 10
Skype: shiroky.sergey
Be professional & be profitable
