lucenenet-dev mailing list archives

From "Digy (JIRA)" <j...@apache.org>
Subject [Lucene.Net] [jira] [Issue Comment Edited] (LUCENENET-415) Contrib/Faceted Search
Date Sat, 21 May 2011 23:25:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037478#comment-13037478 ]

Digy edited comment on LUCENENET-415 at 5/21/11 11:24 PM:
----------------------------------------------------------

Hi Ben,
About the performance test:

- One of the costly operations in this faceted search is the creation of SimpleFacetedSearch,
which builds the bit sets for all of the group members. Since it only needs to be created once
for each newly opened IndexReader (that is, after documents have been added or deleted), its
creation time should be excluded from the test.
- Another costly operation is fetching data from the index. After each search, some data has
to be read from the returned documents, and this duration should be included in the test.
E.g.:
{code}
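        // Plain search: read a field from every hit so that document loading is timed too.
        // Assumption: "searcher" is an IndexSearcher opened on the same reader.
        TopDocs topDocs = searcher.Search(q, 100);
        for (int j = 0; j < topDocs.ScoreDocs.Length; j++)
        {
            Document doc = reader.Document(topDocs.ScoreDocs[j].doc);
            Fieldable f = doc.GetField("title");
        }

        // Faceted search: read the same field from the documents of every group.
        SimpleFacetedSearch.Hits hits = sfs.Search(q, maxDocPerGroup);
        foreach (var h in hits.HitsPerGroup)
        {
            foreach (Document doc in h.Documents)
            {
                Fieldable f = doc.GetField("title");
            }
        }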
{code}
- Hits is a deprecated class and it repeats the search every N (AFAIK 100) document accesses.
It is not a "normal" search and should be excluded from the test.
- Since you don't delete the index after the test, every run adds a new set of "groupByField"
to the index.
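
A minimal sketch of how the timing could be structured, assuming a plain Stopwatch-based
harness. The Measure helper and the stand-in delegate are hypothetical and are not taken from
the attached PerformanceTest.cs; in a real test the delegate would run the search and read
the fields, while the one-time setup (opening the reader, creating SimpleFacetedSearch) stays
outside the timed region.

```csharp
using System;
using System.Diagnostics;

static class FacetTimingSketch
{
    // Hypothetical helper (not part of Lucene.Net): runs searchAndFetch
    // `iterations` times and returns the elapsed wall-clock time.
    // One-time setup is done by the caller beforehand, so it is never timed.
    public static TimeSpan Measure(Action searchAndFetch, int iterations)
    {
        searchAndFetch(); // warm-up run (JIT, caches); not timed

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            searchAndFetch();
        }
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        // One-time setup would go here, outside the timed region
        // (e.g. opening the reader and constructing SimpleFacetedSearch).
        int calls = 0;

        // Stand-in for "search, then read a field from every returned document".
        Action searchAndFetch = () => { calls++; };

        TimeSpan elapsed = Measure(searchAndFetch, 1000);
        Console.WriteLine("calls={0}, elapsed={1} ms", calls, elapsed.TotalMilliseconds);
    }
}
```

Passing the plain search loop and the faceted search loop to the same helper keeps the two
measurements comparable, since both then include the document fetch and exclude the setup.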

Thanks,
DIGY

      was (Author: digydigy):
    Hi Ben,
About the performance test:

- One of the costly operations in this faceted search is the creation of SimpleFacetedSearch,
which builds the bit sets for all of the group members. Since it only needs to be created once
for each newly opened IndexReader (that is, after documents have been added or deleted), its
creation time should be excluded from the test.
- Another costly operation is fetching data from the index. After each search, some data has
to be read from the returned documents, and this duration should be included in the test.
E.g.:
{code}
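        // Plain search: read a field from every hit so that document loading is timed too.
        // Assumption: "searcher" is an IndexSearcher opened on the same reader.
        TopDocs topDocs = searcher.Search(q, 100);
        for (int j = 0; j < topDocs.ScoreDocs.Length; j++)
        {
            Document doc = reader.Document(topDocs.ScoreDocs[j].doc);
            Fieldable f = doc.GetField("title");
        }

        // Faceted search: read the same field from the documents of every group.
        SimpleFacetedSearch.Hits hits = sfs.Search(q, maxDocPerGroup);
        foreach (var h in hits.HitsPerGroup)
        {
            foreach (Document doc in h.Documents)
            {
                Fieldable f = doc.GetField("title");
            }
        }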
{code}
- Hits is a deprecated class and it repeats the search every N (AFAIK 100) document accesses.
It is not a "normal" search and should be excluded from the test.

Thanks,
DIGY
  
> Contrib/Faceted Search
> ----------------------
>
>                 Key: LUCENENET-415
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-415
>             Project: Lucene.Net
>          Issue Type: New Feature
>    Affects Versions: Lucene.Net 2.9.4
>            Reporter: Digy
>            Priority: Minor
>         Attachments: PerformanceTest.cs, SimpleFacetedSearch.cs, TestSimpleFacetedSearch.cs
>
>
> Since I see a lot of questions about faceted search these days, I plan to add a Faceted-Search
> project to contrib.
> DIGY

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
