lucenenet-dev mailing list archives

From Andrei Alecu <and...@tachyon-labs.com>
Subject Re: Performance improvements - BitArrays
Date Wed, 17 Dec 2008 00:50:17 GMT
Michael,

Unsafe code is not strictly required; it's just an extra bit of
performance I squeeze out for myself. If you need to stay away from
unsafe code, that's fine, but using pointers instead of accessing the
array in a managed way gives a pretty nice performance boost in tight
loops.

You can look at the assembly code that the JIT generates for an array
lookup versus accessing the same memory location through a pointer.
You'll see that the pointer version is a bit more efficient, typically
because it skips the array bounds check.

But, like I said, all BitArray needs is a more efficient "next set bit"
implementation and access to the underlying memory store it uses (in
.NET BitArray's case, an array of ints).
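
As a rough illustration of what a word-at-a-time "next set bit" can look
like, here is a minimal sketch in Java (the language of the original
Lucene and Solr code). It is not the OpenBitSet or BitArray source. The
class name NextSetBitSketch, the helper setBits, and the long[] layout
(bit i stored in words[i / 64]) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the Solr OpenBitSet or .NET BitArray source. */
final class NextSetBitSketch {

    /**
     * Returns the index of the first set bit at or after fromIndex,
     * or -1 if there is none. Assumed layout: bit i is bit (i % 64)
     * of words[i / 64].
     */
    static int nextSetBit(long[] words, int fromIndex) {
        if (fromIndex < 0) {
            return -1;
        }
        int wordIndex = fromIndex >>> 6;              // fromIndex / 64
        if (wordIndex >= words.length) {
            return -1;
        }
        // Clear the bits below fromIndex in the first word examined.
        long word = words[wordIndex] & (-1L << (fromIndex & 63));
        while (true) {
            if (word != 0) {
                // Jump straight to the lowest set bit of this word.
                return (wordIndex << 6) + Long.numberOfTrailingZeros(word);
            }
            if (++wordIndex == words.length) {
                return -1;
            }
            word = words[wordIndex];                  // empty words cost one compare
        }
    }

    /** Collects every set bit by repeatedly asking for the next one. */
    static List<Integer> setBits(long[] words) {
        List<Integer> result = new ArrayList<>();
        for (int i = nextSetBit(words, 0); i >= 0; i = nextSetBit(words, i + 1)) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] words = new long[4];   // 256 bits
        words[0] = 1L << 3;           // bit 3
        words[1] = 1L << 6;           // bit 64 + 6 = 70
        words[3] = 1L << 63;          // bit 192 + 63 = 255
        System.out.println(setBits(words));   // prints [3, 70, 255]
    }
}
```

The point of the sketch is that an all-zero word is skipped with a
single comparison instead of 64 individual bit tests, which is where
the gain on sparse bit sets comes from. It also shows why direct access
to the backing array matters.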

Andrei

Michael Garski wrote:
> In 2.3, the document id is checked against the filter after it is
> scored and before it is passed to the hit collector, which can result
> in a poorly performing search when it is executed with a common term
> and a sparsely populated filter.  I created my own filter
> implementation based on the DocSet/OpenBitSet classes that are in
> Solr, where the implementation of getting the next set bit is very
> efficient and does not use unsafe code.  With my own filter
> implementation I was also able to work around the memory leak issue
> with the cached BitArrays that Digy noted earlier.
>
> Filter implementation in Lucene 2.4 is overhauled to allow you to create
> your own filter implementation, defaulting to the OpenBitSet.
> Additionally, I believe the filter is enumerated along with the
> termdocs, leading to faster searches with sparsely populated filters.
>
>
> Michael
>
>   
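
To make the difference Michael describes concrete, here is a minimal
sketch in Java contrasting the two strategies: testing the filter after
each matching document, versus stepping the filter and the term's
documents together. This is not Lucene's actual code. The class
FilterLeapfrogSketch, the sorted int[] standing in for TermDocs, and the
binary search standing in for a skip operation are all assumptions made
for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch; termDocs must be sorted ascending. */
final class FilterLeapfrogSketch {

    /** Visit every doc of the term, then test the filter bit. */
    static List<Integer> checkAfterMatch(int[] termDocs, BitSet filter) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : termDocs) {          // cost grows with the term's frequency
            if (filter.get(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Advance filter and term docs together, each skipping to the other. */
    static List<Integer> leapfrog(int[] termDocs, BitSet filter) {
        List<Integer> hits = new ArrayList<>();
        int pos = 0;
        int doc = filter.nextSetBit(0);
        while (doc >= 0 && pos < termDocs.length) {
            // Binary search stands in for skipping the term docs to 'doc'.
            int found = Arrays.binarySearch(termDocs, pos, termDocs.length, doc);
            if (found >= 0) {
                hits.add(doc);
                pos = found + 1;
                doc = filter.nextSetBit(doc + 1);
            } else {
                pos = -found - 1;           // first term doc greater than 'doc'
                if (pos < termDocs.length) {
                    // Let the filter skip ahead to the term's current doc.
                    doc = filter.nextSetBit(termDocs[pos]);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] termDocs = {2, 4, 6, 8};
        BitSet filter = new BitSet();
        for (int d : new int[] {1, 4, 7, 8, 20}) {
            filter.set(d);
        }
        System.out.println(checkAfterMatch(termDocs, filter));
        System.out.println(leapfrog(termDocs, filter));
    }
}
```

Both methods return the same hits. With a common term and a sparse
filter, the second does work roughly in proportion to the number of set
filter bits rather than the number of documents containing the term.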

