lucenenet-dev mailing list archives

From "Doug Sale" <dougs...@gmail.com>
Subject Re: Exception thrown in MultiPhraseQuery.ExtractTerms
Date Wed, 10 Sep 2008 21:36:56 GMT

Thanks, Jeroen.

This is indeed a bug in Lucene.Net.  System.Collections.Hashtable.Add
diverges from java.util.HashSet.add when given a duplicate: Hashtable.Add
throws an ArgumentException, whereas HashSet.add leaves the set unchanged
and returns false.  This, then, is not a bug in Lucene Java.  I will create
a JIRA entry containing your patch.
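
For reference, the Java side of this comparison can be sketched as below.
This is a minimal illustration, not Lucene code: the class DuplicateAddDemo
and the method addAll are hypothetical names, and String stands in for Term.
It only shows that adding a duplicate to a java.util.HashSet does not throw.

```java
import java.util.HashSet;
import java.util.Set;

public class DuplicateAddDemo {
    // Adds every element of arr to the set, the way an extractTerms-style
    // loop would, and returns how many add() calls hit a duplicate.
    // (Hypothetical helper; String stands in for Lucene's Term.)
    static int addAll(Set<String> terms, String[] arr) {
        int duplicates = 0;
        for (String t : arr) {
            // Set.add returns false and leaves the set unchanged when an
            // equal element is already present; it does not throw.
            if (!terms.add(t)) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>();
        int duplicates = addAll(terms, new String[] {"lucene", "net", "lucene"});
        // prints "2 distinct, 1 duplicate"
        System.out.println(terms.size() + " distinct, " + duplicates + " duplicate");
    }
}
```

On the .NET side, Hashtable.Add throws when the key already exists, so the
port needs either the Contains check from the patch or an equivalent guard.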

-Doug

On Tue, Sep 9, 2008 at 3:12 AM, Jeroen Lauwers <Jeroen.Lauwers@ctlo.net> wrote:

> Hi,
>
> I think I may have found a bug in MultiPhraseQuery.ExtractTerms().
>
> If the same word occurs twice, a "System.ArgumentException: Item has
> already been added." is thrown.
>
> Original code:
> public override void ExtractTerms(System.Collections.Hashtable terms)
> {
>      for (System.Collections.IEnumerator iter = termArrays.GetEnumerator();
>           iter.MoveNext(); )
>      {
>            Term[] arr = (Term[]) iter.Current;
>            for (int i = 0; i < arr.Length; i++)
>            {
>                  terms.Add(arr[i], arr[i]);
>            }
>      }
> }
>
> Possible patch:
> public override void ExtractTerms(System.Collections.Hashtable terms)
> {
>      for (System.Collections.IEnumerator iter = termArrays.GetEnumerator();
>           iter.MoveNext(); )
>      {
>            Term[] arr = (Term[]) iter.Current;
>            for (int i = 0; i < arr.Length; i++)
>            {
>                  if(!terms.Contains(arr[i]))
>                      terms.Add(arr[i], arr[i]);
>            }
>      }
> }
>
>
> It looks like this is a bug in the Java version too. (Or does a Java
> Hashtable behave differently?)
> Perhaps we should notify them.
>
> Jeroen
>
