lucenenet-dev mailing list archives

From "George Aroush" <geo...@aroush.net>
Subject RE: Question about query performance degredation
Date Mon, 06 Nov 2006 01:01:26 GMT
Hi Andy,

I am glad to see you got this solved.  How long did it take to optimize the
index?  I think you are trying to keep your searcher fresh with the index
within 10 minutes, right?  So if the optimization took longer than 10
minutes, you may have a new problem.  (Let's discuss this in the other email
thread.)

Regards,

-- George Aroush

-----Original Message-----
From: Andy Berryman [mailto:topdev1@gmail.com] 
Sent: Thursday, November 02, 2006 12:55 PM
To: lucene-net-dev@incubator.apache.org
Subject: Re: Question about query performance degredation

I don't really have the ability to just perform the operation in my
"production" environment.  So what I did to test this theory out was copy
one of my indexes from "production" down to my local machine.  I then set up
a test app to do the following:

- run 10 identical searches against the index and output the hit count and
search time for each
- optimize the index
- run the same 10 searches over again

And what I saw was pretty astounding.  The results improved by almost 60%
and the size of the index shrank by about 50%.
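
The test procedure above can be sketched as a small timing harness. This is only an illustration in Python, not the actual test app from this thread: `search` and `optimize` are hypothetical callables standing in for whatever your application uses to query and optimize the index, and the function names are made up for this example. It also records how long the optimize step took, since that matters if the searcher has to be refreshed on a fixed interval.

```python
import time
from statistics import mean


def time_searches(search, query, runs=10):
    """Run the same query `runs` times; return a (hit_count, seconds) pair per run."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        hits = search(query)  # hypothetical: returns a sequence of hits
        elapsed = time.perf_counter() - start
        results.append((len(hits), elapsed))
    return results


def compare_before_after(search, optimize, query, runs=10):
    """Time identical searches, optimize the index, then time them again."""
    before = time_searches(search, query, runs)

    start = time.perf_counter()
    optimize()  # hypothetical: wraps the index optimize call
    optimize_seconds = time.perf_counter() - start

    after = time_searches(search, query, runs)
    return {
        "before_mean": mean(t for _, t in before),
        "after_mean": mean(t for _, t in after),
        "optimize_seconds": optimize_seconds,
        "hits_before": [h for h, _ in before],
        "hits_after": [h for h, _ in after],
    }
```

The hit counts before and after should match; if they differ, the two runs are not comparing the same index contents.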

So I'm gonna guess that fragmentation is the key factor here.  So what I
think that I'm going to end up doing is adding a step into my indexing
process to optimize the index once every couple of days.  That should give
me some pretty nice results without adding too much overall load to the
system.
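
For readers wondering why optimizing can shrink an index that is updated by delete-then-add, here is a toy model in Python. It is a deliberately simplified sketch, not Lucene's actual segment format or merge policy: it assumes every add lands in its own segment and that a delete only marks a document, with the space reclaimed when segments are merged. The class and method names are invented for the illustration.

```python
class ToyIndex:
    """Toy model: each add creates a new segment; deletes only mark documents."""

    def __init__(self):
        self.segments = []    # each segment is a dict of doc_id -> text
        self.deleted = set()  # (segment_number, doc_id) pairs marked as deleted

    def add(self, doc_id, text):
        self.segments.append({doc_id: text})

    def delete(self, doc_id):
        # Mark every stored copy as deleted; nothing is physically removed.
        for i, seg in enumerate(self.segments):
            if doc_id in seg:
                self.deleted.add((i, doc_id))

    def update(self, doc_id, text):
        # The delete-then-add pattern described in this thread.
        self.delete(doc_id)
        self.add(doc_id, text)

    def search(self, word):
        # A search has to visit every segment and skip deleted entries.
        hits = []
        for i, seg in enumerate(self.segments):
            for doc_id, text in seg.items():
                if (i, doc_id) not in self.deleted and word in text.split():
                    hits.append(doc_id)
        return hits

    def stored_docs(self):
        return sum(len(seg) for seg in self.segments)

    def optimize(self):
        # Merge all live documents into one segment and drop deleted ones.
        merged = {}
        for i, seg in enumerate(self.segments):
            for doc_id, text in seg.items():
                if (i, doc_id) not in self.deleted:
                    merged[doc_id] = text
        self.segments = [merged]
        self.deleted = set()


index = ToyIndex()
index.add("doc1", "lucene index")
for rev in range(4):
    index.update("doc1", f"lucene index rev{rev}")
```

In this model, one add followed by four updates leaves five segments holding five stored copies of a single live document; calling `index.optimize()` collapses that to one segment with one stored copy, while `index.search("lucene")` returns the same hit either way.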

Thanks for the guidance
Andy

On 11/1/06, George Aroush <george@aroush.net> wrote:
>
> Hi Andy,
>
> Yes, please, let us know how it goes when you optimize.  If that 
> doesn't help, after optimizing, stop indexing for a bit.  Even 
> better, stop the indexer application and restart the searcher, i.e., 
> reboot your application with the indexer out of the way.
>
> Regards,
>
> -- George Aroush
>
> -----Original Message-----
> From: Andy Berryman [mailto:topdev1@gmail.com]
> Sent: Wednesday, November 01, 2006 9:29 AM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: Question about query performance degredation
>
> I'm maintaining the index at a pretty constant rate throughout the day.
> Right now it's possible that at least 1 document is getting updated 
> every 10 minutes.  (The background process I am using runs every 10 
> minutes to look for changes that need to be indexed.)
>
> In my specific case ... For a document that I need to "update" in the 
> index ... I make a call to delete the document first and then I create 
> a new document (with the updated info from the database) and add it 
> into the index.
>
> As for optimizing ... Currently I am not making any calls to "Optimize()".
>
> So I guess your first suggestion would be to optimize the index and 
> check the query performance after that?
>
> Thanks
> Andy
>
>
> On 10/31/06, George Aroush <george@aroush.net> wrote:
> >
> > Hi Andy,
> >
> > I believe you are on the right track; index fragmentation may be your 
> > issue.
> >
> > How frequently are you updating the index, vs. how frequently are 
> > you optimizing it?  Is the update adding new documents vs. modifying 
> > existing documents?
> >
> > If after optimizing you still don't get back the original 
> > performance, stop indexing for a bit and see if search gets better.
> >
> > If fragmentation is your issue, I have some suggestions that may 
> > work for you.
> >
> > Regards,
> >
> > -- George
> >
> > -----Original Message-----
> > From: Andy Berryman [mailto:topdev1@gmail.com]
> > Sent: Tuesday, October 31, 2006 1:25 PM
> > To: lucene-net-user@incubator.apache.org;
> > lucene-net-dev@incubator.apache.org
> > Subject: Question about query performance degredation
> >
> > I have a scenario where I'm seeing the performance (specifically 
> > time) of searches against my index degrade on a daily basis.  The 
> > amount of time it is taking to load the index is staying fairly 
> > constant however.  This is a fairly large index.  It has over a 
> > million documents in it.
> >
> > The scenario I have is that I'm maintaining the index from data in 
> > the database ... and I'm doing so on a constant basis.  So essentially 
> > as changes are made in the database I have a background task that 
> > updates the index.
> > So I'm supporting concurrent readers and writers on a constant basis 
> > throughout the day.  I'm NOT using compound files.  During my 
> > development and testing, the use of compound files caused a 
> > significant increase in Disk I/O usage and caused the maintenance of 
> > the index to take much longer.  As such ... I decided against them.
> >
> > My thoughts are that the reason the search is taking longer is 
> > because the index files are getting more and more "fragmented" over 
> > time because I'm not using the compound files.  And that's why the 
> > searches are taking longer.
> >
> > Thoughts?
> >
> > Thanks
> > Andy
> >
> >
>
>

