lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [lucenenet] NightOwl888 commented on issue #377: Use pooling on BooleanScorer.BucketTable
Date Thu, 15 Apr 2021 03:49:51 GMT

NightOwl888 commented on issue #377:
URL: https://github.com/apache/lucenenet/issues/377#issuecomment-820050358


   Sorry about the late feedback on this. I wanted to research more of the history of this
design choice before I answered, but never found the time.
   
   There are several reasons why making this change may not be a good idea, among them:
   
   1. There is a [specific comment](https://github.com/apache/lucenenet/blob/master/src/Lucene.Net/Search/BooleanScorer.cs#L144-L145)
suggesting that lazy initialization has already been attempted, and this design was chosen
over that in 4.8.0.
   2. In the current version of Lucene, the design of `BooleanScorer` has been changed but
it still relies on a `Bucket` object. Changing the design will make upgrading more difficult.
   3. We would be going from a proven design to an unproven one, which requires extra testing
and verification that will delay the release.
   
   That being said, it would be worth considering as a contribution if:
   
   1. It were implemented as a non-default experimental feature that users can enable if they
(perhaps) prefer scalability over performance
   2. It doesn't break any existing public APIs (additive changes only)
   3. There are full tests
   4. There is adequate documentation
   5. It is 100% a community contribution (we won't delay the release to work on this)
   
   Ideally, we would also have a performance profile of the bucket table design vs the object
pool so users could easily understand which is preferred in a given scenario.
   
   If there is a way to make `BooleanScorer` pluggable it is also worth discussing even if
it might introduce breaking changes.
   
   Frankly, according to [The most misunderstood design pattern – Object Pool](https://blog.aspiresys.pl/technology/how-to-fool-garbage-collector/),
this doesn't sound like the ideal situation to use an object pool in.
   
   > When using the Object Pool pattern you declare one thing and one thing only: “I,
as a developer, know how lifetime of these objects should be handled better than garbage collector.”
And if you actually do, that is fine. If you are not 100% sure about where to use the Object
Pool, here are some indicators:
   >
   > -   Make sure that the cost of creating an instance of the class you desire to pool
is much higher than resetting some of its internal properties.
   > -  The frequency of creating the target class is also high.
   > -  The number of concurrent instances of the target class is relatively small.
   
   And from Microsoft:
   
   > Object pooling doesn't always improve performance:
   
   > -  Unless the initialization cost of an object is high, it's usually slower to get
the object from the pool.
   > -  Objects managed by the pool aren't de-allocated until the pool is de-allocated.
   
   But if presented with data showing a scalability or performance trend is that is better
with this design it would probably be a reasonable option for some people.
   
   
   Also, implementing `IDisposable` on `BulkScorer` doesn't sound like the right choice as
it is possible to use a `Release()` method that doesn't introduce a leaky abstraction. We
aren't dealing with unmanaged resources in this scenario so if they are not disposed at the
end, the GC will clean them up anyway.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



Mime
View raw message