lucenenet-dev mailing list archives

From "Michael Garski (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENENET-80) MultiSearcher with BooleanQuery
Date Thu, 16 Aug 2007 23:27:30 GMT
MultiSearcher with BooleanQuery 
--------------------------------

                 Key: LUCENENET-80
                 URL: https://issues.apache.org/jira/browse/LUCENENET-80
             Project: Lucene.Net
          Issue Type: Bug
         Environment: Windows Server 2003, Lucene.Net 2.0
            Reporter: Michael Garski


When using a MultiSearcher with a HitCollector and a BooleanQuery that contains two identical
terms, an ArgumentException is thrown.  This does not happen when using Hits.  Sample code to
reproduce:

		public void TestBoolean(string index)
		{
			IndexSearcher searcher = new IndexSearcher(index);
			SimpleCollector sc = new SimpleCollector();
			QueryParser qp = new QueryParser("Body", new StandardAnalyzer());
			// Search the IndexSearcher directly with the collector.
			searcher.Search(qp.Parse("test AND test"), null, sc);

			sc.Hits = 0;
			// Run the same query through a MultiSearcher; per the stack trace below,
			// the exception is raised from MultiSearcher.CreateWeight.
			MultiSearcher ms = new MultiSearcher(new Searchable[] { searcher });
			ms.Search(qp.Parse("test AND test"), null, sc);
		}

	public class SimpleCollector : HitCollector
	{
		public int Hits = 0;
		public override void Collect(int doc, float score)
		{
			Hits++;
		}
	}

The stack trace on the exception is:
System.ArgumentException: Item has already been added. Key in dictionary: 'Body:test'  Key
being added: 'Body:test'
   at System.Collections.Hashtable.Insert(Object key, Object nvalue, Boolean add)
   at System.Collections.Hashtable.Add(Object key, Object value)
   at Lucene.Net.Search.TermQuery.ExtractTerms(Hashtable terms)
   at Lucene.Net.Search.BooleanQuery.ExtractTerms(Hashtable terms)
   at Lucene.Net.Search.BooleanQuery.ExtractTerms(Hashtable terms)
   at Lucene.Net.Search.MultiSearcher.CreateWeight(Query original)

The issue is within TermQuery.ExtractTerms(Hashtable terms), which calls Hashtable.Add for a
term whose key already exists in the Hashtable; Add throws an ArgumentException in that case.
The Java version uses a Set, whose add method silently ignores an element that is already
present in the collection.
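
The Hashtable behavior described above can be reproduced outside Lucene.Net. The following
is a minimal sketch, not code from the library: the class and method names are hypothetical,
and a plain string stands in for the Term object that TermQuery would add.

```csharp
using System;
using System.Collections;

// Hypothetical demo class; not part of Lucene.Net.
public static class DuplicateAddDemo
{
    // Returns true if adding the same key twice with Add throws ArgumentException.
    public static bool AddThrowsOnDuplicate()
    {
        Hashtable terms = new Hashtable();
        // A string is used here as a stand-in for a Lucene.Net Term.
        terms.Add("Body:test", "Body:test");
        try
        {
            terms.Add("Body:test", "Body:test");
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddThrowsOnDuplicate());
    }
}
```

This mirrors what happens when "test AND test" yields two TermQuery clauses for the same
term and each one calls ExtractTerms on the shared Hashtable.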

A change in TermQuery from:

		public override void ExtractTerms(System.Collections.Hashtable terms)
		{
			Term term = GetTerm();
			terms.Add(term, term);
		}

to:

		public override void ExtractTerms(System.Collections.Hashtable terms)
		{
			Term term = GetTerm();
			terms[term] = term;
		}

This change will correct the issue.  You could instead check the Hashtable using ContainsKey
before adding, but this case should be rare and overwriting the previous term in the collection
would probably be better.
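
The two options mentioned above can be compared side by side. This is a sketch under the
same assumptions as the report: the class and method names are hypothetical, and a string
stands in for the Term. Both approaches leave a single entry in the Hashtable and neither
throws.

```csharp
using System;
using System.Collections;

// Hypothetical demo class; not part of Lucene.Net.
public static class DuplicateKeyOptions
{
    // Option 1: assign through the indexer, overwriting any existing entry.
    public static int CountWithIndexer()
    {
        Hashtable terms = new Hashtable();
        terms["Body:test"] = "Body:test";
        terms["Body:test"] = "Body:test"; // overwrites, does not throw
        return terms.Count;
    }

    // Option 2: guard Add with ContainsKey so a duplicate is skipped.
    public static int CountWithContainsKey()
    {
        Hashtable terms = new Hashtable();
        for (int i = 0; i < 2; i++)
        {
            if (!terms.ContainsKey("Body:test"))
            {
                terms.Add("Body:test", "Body:test");
            }
        }
        return terms.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithIndexer());
        Console.WriteLine(CountWithContainsKey());
    }
}
```

The indexer version needs one Hashtable operation per term, while the ContainsKey version
performs an extra lookup on every call, which is one reason to prefer the overwrite.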





-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

