lucenenet-dev mailing list archives

From Scott Zhang <getyourconta...@gmail.com>
Subject Re: Hi. Does anyone know how to solve the OutOfMemory Exception during Search?
Date Thu, 02 Jul 2009 18:40:34 GMT
Hi Wagner Junior,
Thanks for your message.

I thought this mailing list was dead. lol.

I copied the sample code from the test/demo application distributed with
Lucene.Net:

Hits hits = searcher.Search(rootQuery);
iResultCount = hits.Length();

// Compute the index range of the requested page.
int start = pageNum * pageSize;
int end = System.Math.Min(hits.Length(), start + pageSize);
List<string> bookIdList = new List<string>();
for (int i = start; i < end; i++)
{
    // Only the documents on the current page are requested.
    Document doc = hits.Doc(i);
}
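
The paging arithmetic above can be isolated into a small helper. This is only a minimal sketch, assuming pageNum is zero-based; PageWindow and Bounds are hypothetical names for illustration and are not part of Lucene.Net.

```csharp
using System;

// Hypothetical helper, not part of Lucene.Net.
public static class PageWindow
{
    // Returns the half-open range [Start, End) of hit indexes for a
    // zero-based page. A page past the last hit yields an empty range.
    public static (int Start, int End) Bounds(int totalHits, int pageNum, int pageSize)
    {
        if (totalHits < 0) throw new ArgumentOutOfRangeException(nameof(totalHits));
        if (pageNum < 0) throw new ArgumentOutOfRangeException(nameof(pageNum));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        // Use long arithmetic so pageNum * pageSize cannot overflow int.
        long start = Math.Min((long)pageNum * pageSize, totalHits);
        long end = Math.Min(start + pageSize, totalHits);
        return ((int)start, (int)end);
    }
}
```

Since End never exceeds (pageNum + 1) * pageSize, rendering a page needs at most that many of the top-ranked hits, regardless of how many documents match in total.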


But when I checked the Lucene.Net code, I found this in Hits.cs, L105:

TopDocs topDocs = (sort == null) ? searcher.Search(weight, filter, n) :
    searcher.Search(weight, filter, n, sort);

length = topDocs.totalHits;

Then in IndexSearcher.cs, L179, there is this statement:

scorer.Score(collector);

The implementation of the Score function is (Scorer.cs, L64):
public virtual void Score(HitCollector hc)
{
    while (Next())
    {
        hc.Collect(Doc(), Score());
    }
}

Or, in BooleanScorer2.cs, L411:
public override void Score(HitCollector hc)
{
    if (allowDocsOutOfOrder && requiredScorers.Count == 0 &&
        prohibitedScorers.Count < 32)
    {
        // fall back to BooleanScorer, scores documents somewhat out of order
        BooleanScorer bs = new BooleanScorer(GetSimilarity(), minNrShouldMatch);
        System.Collections.IEnumerator si = optionalScorers.GetEnumerator();
        while (si.MoveNext())
        {
            bs.Add((Scorer) si.Current, false, false);
        }
        si = prohibitedScorers.GetEnumerator();
        while (si.MoveNext())
        {
            bs.Add((Scorer) si.Current, false, true);
        }
        bs.Score(hc);
    }
    else
    {
        if (countingSumScorer == null)
        {
            InitCountingSumScorer();
        }
        while (countingSumScorer.Next())
        {
            hc.Collect(countingSumScorer.Doc(), Score());
        }
    }
}

So it seems that no matter which API I use, the Lucene.Net implementation
always goes through a "HitCollector". Is that right?
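
On the HitCollector question: passing every match through a collector does not by itself mean every hit is held in memory. A minimal sketch of the idea, assuming a collector that counts all matches but retains only the best n (TopNCollector and its members are hypothetical names for illustration, not the Lucene.Net classes):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a collector that keeps only the best n hits.
// Illustrative only; this is not Lucene.Net code.
public sealed class TopNCollector
{
    private readonly int capacity;

    // Ordered by (score, doc), so Min is always the weakest retained hit.
    // The doc id is part of the key only to keep entries unique.
    private readonly SortedSet<(float Score, int Doc)> best =
        new SortedSet<(float Score, int Doc)>();

    // Number of matches seen, whether or not they were retained.
    public int TotalHits { get; private set; }

    public TopNCollector(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    // Called once per matching document, in the spirit of
    // HitCollector.Collect(doc, score).
    public void Collect(int doc, float score)
    {
        TotalHits++;
        best.Add((score, doc));
        if (best.Count > capacity)
        {
            best.Remove(best.Min); // evict the weakest hit
        }
    }

    // Retained hits, highest score first.
    public List<(float Score, int Doc)> TopDocs()
    {
        var result = new List<(float Score, int Doc)>(best);
        result.Reverse();
        return result;
    }
}
```

With this shape, the total count still reflects every match, while the memory held by the collector is bounded by the capacity (at most capacity + 1 entries at any moment) rather than by the number of matching documents.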


Another thing: I recompiled Lucene.Net and re-uploaded the DLL to my server.
Now, when I search for the keyword "book", which gives me a count of 30M
records, I see that w3wp.exe consumes 1.1 GB of memory, which is somewhat
abnormal. But Lucene.Net doesn't throw OutOfMemory anymore. It is weird.


Thanks.
Regards.
Scott
On Thu, Jul 2, 2009 at 2:20 AM, Wagner Ignacio Pinto Junior <
wagneripjr@hotmail.com> wrote:

>
> Hi Scott,
>
> I was reading Lucene in Action and it warns us about reading all hits at
> once.
>
> Do you use Hits or HitCollector?
>
> If you use HitCollector or parse all hits, that's the problem.
>
> Try to page through the hits; it uses lazy loading.
>
> I'm new to Lucene, so sorry if I made any mistake here ;)
>
> Wagner Junior
>
> > Date: Wed, 1 Jul 2009 01:09:55 +0800
> > Subject: Hi. Does anyone know how to solve the OutOfMemory Exception during Search?
> > From: getyourcontacts@gmail.com
> > To: lucene-net-dev@incubator.apache.org
> >
> > Hi. I have created an index with Lucene.Net which contains 30M documents.
> > The resulting index file is ~4G.
> > Now the problem is, when I search for a keyword which matches very many
> > results, Lucene.Net throws an OutOfMemory exception.
> >
> > I think limiting the results, e.g. to 20K at most, could solve
> > this problem.
> >
> > Any solution is welcome.
> >
> > Thanks.
> > Regards.
> > Scott
>
