lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vanlerberghe, Luc" <>
Subject Problems with patch for LUCENENET-106
Date Wed, 31 Dec 2008 09:48:04 GMT

I reviewed the fix for LUCENENET-106 and I see a couple of issues:

1. The implementation of WeakHashMap in SupportClass.cs has a bug in that it assumes all keys
will have a unique HashCode.  Although the chances of this occurring are *very* small, it
cannot be guaranteed. A collision would mean that the cached value for one indexReader could
be returned as cached value for another indexReader (using the way this class is used in Lucene
as example)

2. This class attempts to be a complete port of the java.util.WeakHashMap, but:
- A correct implementation is difficult to implement, and .Net does not have an equivalent
of java.lang.ref.ReferenceQueue to accelerate the cleanup process.
- Since the behaviour of the WeakHashMap is closely tied to the behaviour of the garbage collector,
it is very difficult to test properly, if at all.
- Lucene only uses the get(Object key) and put(K key, V value) methods
- In Lucene, the keys are IndexReader instances.  In normal use cases there will be only 1
IndexReader live, and perhaps another one if 'warming up' is used before switching IndexSearchers
after an index update.
- If this class is presented as a complete port of java.util.WeakHashMap, users might assume
this is production quality code and copy/paste its code in their own projects.  Any further
bugs found might harm the credibility of the Lucene project, even if those sections are never
used in Lucene. (e.g.: The collection returned by Keys might contain null values.  There's
no guarantee the GC won't intervene between the call to Cleanup and the calls to keys.Add(w.Key.Target).

3. Most support classes that are needed for the conversion of java to .Net are in the anonymous
namespace, but this one is in Lucene.Net.Util.  I would propose to keep the namespaces corresponding
to the original java packages clean and either put all support classes in the anonymous namespace
or in a Lucene.Net specific one (Lucene.Net.DotNetPort ?) (I see that George moved it already
to SupportClass, thanks George!)

I have been using the java version of Lucene in a project for a long time.
For a new project that is currently under development, we will be using the .Net version.
For now we are using the 2.1 version using a custom patch to avoid the memory leak problem.

I'll post an up to date version of this patch as a replacement for the current implementation
of Lucene.Net.Util.WeakHashMap sometime next week...
- It doesn't try to be a complete implementation of WeakHashMap: only the methods strictly
necessary are implemented, all other methods throw a NotImplementedException. (It should probably
be renamed to SimplifiedWeakHashMap or something)
- It's optimized for the 'low number of keys' case.
- It's simple code, but I want to test it thoroughly first.  The original patch was put directly
in the FieldCacheImpl code, I want to make sure I didn't make some stupid mistake while converting

I have an account on Jira, but I don't have rights to reopen the issue...



-----Original Message-----
From: Digy (JIRA) [] 
Sent: zaterdag 27 december 2008 19:47
Subject: [jira] Closed: () Lucene.NET (Revision: 603121) is leaking memory


Digy closed LUCENENET-106.

    Resolution: Fixed
      Assignee: Digy

Patches are applied.


> Lucene.NET (Revision: 603121) is leaking memory
> -----------------------------------------------
>                 Key: LUCENENET-106
>                 URL:
>             Project: Lucene.Net
>          Issue Type: Bug
>         Environment: .NET 2.0
>            Reporter: Anton K.
>            Assignee: Digy
>            Priority: Critical
>         Attachments: DIGY-FieldCacheImpl.patch, Digy.rar, luceneSrc_memUsage.patch, Paches
for v2.3.1.rar, WeakHashTable+FieldCacheImpl.rar, WeakReferences.rar
> readerCache Hashtable field (see FieldCacheImpl.cs) never releases some hash items that
have closed IndexReader object as a key. So a lot of Term instances are never released.
> Java version of Lucene uses WeakHashMap and therefore doesn't have this problem.
> This bug can be reproduced only when Sort functionality used during search. 
> See following link for additional information.
> Steps to reproduce:
> 1)Create index
> 2) Modify index by IndexWiter; Close IndexWriter
> 3) Use IndexSearcher for searching with Sort; Close InexSearcher
> 4) Go to step 2
> You'll get OutOfMemoryException after some time of running this algorithm.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message