lucenenet-dev mailing list archives

From Robert Stewart <Robert_Stew...@epam.com>
Subject Re: [Lucene.Net] Test case for: possible infinite loop bug in portuguese snowball stemmer?
Date Tue, 13 Sep 2011 14:55:19 GMT
Here is a test case:

// Requires: using System; using System.IO;
//           using Lucene.Net.Analysis; using Lucene.Net.Analysis.Snowball;
string text = @"Califórnia";

// Emit the whole input as a single token, then run it through the Portuguese stemmer.
KeywordTokenizer tokenizer = new KeywordTokenizer(new StringReader(text));
SnowballFilter stemmer = new SnowballFilter(tokenizer, "Portuguese");

Token token;

// The call to Next() never returns for this input.
while ((token = stemmer.Next()) != null)
{
    Console.WriteLine(token.Term());
}

This seems to go into an infinite loop: the call to stemmer.Next() never returns. I am not
sure whether this is the only stemmer affected, but it happens to us on a near-daily basis.
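
Until the cause is found, one way to at least notice the hang quickly (rather than days later) might be to run the analysis under a watchdog. The following is only a minimal sketch, not a fix: StemWatchdog and TryRun are hypothetical names, and it uses nothing beyond System.Threading. It only detects that the work did not finish in time; the stuck thread keeps spinning in the background.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of Lucene.Net): runs a piece of work on a
// separate thread and reports whether it finished within the timeout.
static class StemWatchdog
{
    public static bool TryRun(Action work, TimeSpan timeout)
    {
        Exception failure = null;

        var worker = new Thread(() =>
        {
            try { work(); }
            catch (Exception ex) { failure = ex; }
        });

        // A background thread does not keep the process alive if it never returns.
        worker.IsBackground = true;
        worker.Start();

        // Join returns false if the worker is still running after the timeout.
        if (!worker.Join(timeout))
        {
            return false;
        }

        // Surface any exception thrown by the work itself.
        if (failure != null)
        {
            throw new InvalidOperationException("Work failed.", failure);
        }

        return true;
    }
}
```

With the test case above, the loop could be wrapped as StemWatchdog.TryRun(() => { while (stemmer.Next() != null) { } }, TimeSpan.FromSeconds(5)), logging the offending text whenever it returns false.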
 

Thanks,
Bob


On Sep 13, 2011, at 9:37 AM, Robert Stewart wrote:

> Are there any known issues with snowball stemmers (Portuguese in particular) going into
> an infinite loop?  I have a problem that happens on a recurring basis where IndexWriter
> locks up on AddDocument and never returns (it has taken up to 3 days before we noticed),
> requiring the process to be killed manually.  It seems to happen only on Portuguese documents
> from what I can tell so far, and the stack trace when the thread is aborted is always as follows:
> 
> System.Threading.ThreadAbortException: Thread was being aborted.
>   at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
>   at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
>   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
>   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
>   at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
> System.SystemException: System.Threading.ThreadAbortException: Thread was being aborted.
>   at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct& sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
>   at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner)
>   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks)
>   at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
>   at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
>   at Lucene.Net.Analysis.Snowball.SnowballFilter.Next()
>   at Lucene.Net.Analysis.TokenStream.IncrementToken()
>   at Lucene.Net.Index.DocInverterPerField.ProcessFields(Fieldable[] fields, Int32 count)
>   at Lucene.Net.Index.DocFieldProcessorPerThread.ProcessDocument()
>   at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer analyzer, Term delTerm)
>   at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer)
> 
> 
> Is there another list of contrib/snowball issues?  I have not yet been able to reproduce
> this in a small test case, however.  Have there been any such issues with stemmers in the past?
> 
> Thanks,
> Bob

