lucenenet-dev mailing list archives

From Sergey Shiroky <Sergey.Shir...@itechart-group.com>
Subject Lucene.Net search speed
Date Sat, 19 Sep 2015 14:36:29 GMT
Hi,

Sorry to bother you, but I hope to get some help from you as the Lucene.Net developers.

We currently use Lucene.Net 3.0.3 in our application to index and search ~2,500,000 items.
Each entity contains 27 searchable fields, each of which is added to the index in this way:
new Field(key, value, Field.Store.YES, Field.Index.ANALYZED)

Now we have two search options:

1. Search only 4 fields using fuzzy search

2. Search 4-27 fields using exact search

We have a search service that automatically searches for about 53,000 people every week, with
names such as "Bob Huston", "Sara Conor", "Sujan Hong Uin Ho", etc.
Search option 1) is slow: searcher.Search takes 4-8 seconds on average, and this is our
major problem.

Search sample code:

                    // Open the on-disk index and a read-only searcher over it.
                    var index = FSDirectory.Open(indexPath);
                    var searcher = new IndexSearcher(index, true);
                    // StandardAnalyzer with an empty stop-word set.
                    this.analyzer = new StandardAnalyzer(Version.LUCENE_30, new HashSet<string>());
                    var queryParser = new MultiFieldQueryParser(Version.LUCENE_30, queryFields, this.analyzer);
                    queryParser.AllowLeadingWildcard = false;
                    // token holds the query string, e.g. the fuzzy query shown below.
                    Query query = queryParser.Parse(token);
                    var results = searcher.Search(query, NumberOfResults); // NumberOfResults == 500


Our fuzzy search query to find "bob cong hong" in 4 fields:

(((PersonFirstName:bob~0.6) OR (PersonLastName:bob~0.6) OR (PersonAliases:bob~0.6) OR (PersonAlternativeSpellings:bob~0.6))
AND ((PersonFirstName:cong~0.6) OR (PersonLastName:cong~0.6) OR (PersonAliases:cong~0.6) OR (PersonAlternativeSpellings:cong~0.6))
AND ((PersonFirstName:hong~0.6) OR (PersonLastName:hong~0.6) OR (PersonAliases:hong~0.6) OR (PersonAlternativeSpellings:hong~0.6)))
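
To make the shape of that query explicit: each word of the name becomes one OR group over the searched fields, and the groups are joined with AND. The sketch below shows one way such a string could be assembled. It is illustrative only: FuzzyQueryText.Build is a hypothetical helper and not the actual application code, and it assumes the words contain no query-syntax characters that would need escaping. The similarity is passed as text so that the decimal separator does not depend on the current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuzzyQueryText
{
    // Hypothetical helper (not the actual application code): builds the
    // query string that is later handed to queryParser.Parse.
    public static string Build(string input, IEnumerable<string> fields, string similarity)
    {
        var fieldList = fields.ToList();
        var words = input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // One OR group per word: the word may match in any of the fields.
        var groups = words.Select(word =>
            "(" + string.Join(" OR ",
                fieldList.Select(f => "(" + f + ":" + word + "~" + similarity + ")")) + ")");

        // Every word is required, so the groups are joined with AND.
        return "(" + string.Join(" AND ", groups) + ")";
    }
}
```

Called as FuzzyQueryText.Build("bob cong hong", queryFields, "0.6") with the four person fields, this yields 12 fuzzy clauses: one per word per field.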

Current improvements:
1. We combined these 4 fields into 1 search field
2. We decided to use a single IndexSearcher in the service instead of opening one on every search request
3. MergeFactor=2
Together, these improvements produce a speed increase of about 30-40%.

Following this article<http://wiki.apache.org/lucene-java/ImproveSearchingSpeed> we've
made most of the possible optimizations:
* The index is placed on a SAS drive, which is quite fast: http://accessories.euro.dell.com/sna/productdetail.aspx?c=ie&l=en&s=dhs&cs=iedhs1&sku=400-AHWT#Overview
* We have enough RAM
* MergeFactor 2
* We tried moving the index to a RAMDirectory, but the test results aren't stable; sometimes the
speed is the same

We found an explanation of why fuzzy search performance is bad in Lucene (every request
takes 6-8 seconds): http://stackoverflow.com/questions/10880976/solr-lucene-fuzzy-search-too-slow

Do you know whether this issue is fixed in the latest releases of Lucene, and can we use your
unstable version of Lucene.Net 4.8 to solve our issue?

Stackoverflow topic<http://stackoverflow.com/questions/32668049/lucene-net-fuzzy-search-speed>

Thank you.



Phone (cell): +375 29 881 44 10
Skype: shiroky.sergey
Email: Sergey.Shiroky@itechart-group.com<mailto:Sergey.Shiroky@itechart-group.com>
Web: www.itechart.com<http://www.itechart.com/>
Be professional & be profitable


