lucenenet-dev mailing list archives

From "Khindikaynen Aleksey (JIRA)" <j...@apache.org>
Subject [jira] [Closed] (LUCENENET-596) QueryParser produces a wrong query if KeywordRepeatFilter is used in analyzer
Date Wed, 11 Oct 2017 12:46:00 GMT

     [ https://issues.apache.org/jira/browse/LUCENENET-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khindikaynen Aleksey closed LUCENENET-596.
------------------------------------------
    Resolution: Not A Problem

> QueryParser produces a wrong query if KeywordRepeatFilter is used in analyzer
> -----------------------------------------------------------------------------
>
>                 Key: LUCENENET-596
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-596
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net.Analysis.Common
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Khindikaynen Aleksey
>
> Below is a code sample illustrating how to reproduce the issue:
> {code:java}
>             var query = "+FieldName:Value_0";
>             var parser = new QueryParser(LuceneVersion.LUCENE_48, "FieldName", new CustomAnalyzer());
>             var res = parser.Parse(query); 
>     class CustomAnalyzer : Analyzer
>     {
>         protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
>         {
>             var tokenizer = new LetterOrDigitTokenizer(LuceneVersion.LUCENE_48, reader);
>            
>             TokenStream stream = new StandardFilter(LuceneVersion.LUCENE_48, tokenizer);
>           
>             stream = new KeywordRepeatFilter(stream);
>            
>             return new TokenStreamComponents(tokenizer, stream);
>         }
>     }
>     class LetterOrDigitTokenizer : CharTokenizer
>     {
>         public LetterOrDigitTokenizer(LuceneVersion matchVersion, TextReader input) : base(matchVersion, input)
>         {
>         }
>         protected override bool IsTokenChar(int c)
>         {
>             return char.IsLetterOrDigit((char)c);
>         }
>     }
> {code}
> The resulting query differs between versions 3.0.3 and 4.8:
> Lucene 3.0.3
> +FieldName:"(value value) 0"
> Lucene 4.8 beta 4
> +((FieldName:value FieldName:valu) FieldName:0)
> So if we have a document with FieldName == "0" (without the word "value"), it would still be found with Lucene 4.8.
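
The message does not say why the issue was closed as Not A Problem. One possible explanation, offered here as an assumption and not taken from the report, is that since Lucene 3.1 the classic query parser no longer builds a phrase query automatically when a single term analyzes to several tokens; it combines the tokens as optional clauses unless told otherwise. The sketch below reuses the analyzer from the report and sets `AutoGeneratePhraseQueries`, which is assumed to be the Lucene.Net 4.8 counterpart of Java's `setAutoGeneratePhraseQueries`. If that assumption holds, the parsed query should have the phrase-like shape of the 3.0.3 output instead of the boolean one.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Miscellaneous;   // KeywordRepeatFilter
using Lucene.Net.Analysis.Standard;        // StandardFilter
using Lucene.Net.Analysis.Util;            // CharTokenizer
using Lucene.Net.QueryParsers.Classic;     // QueryParser
using Lucene.Net.Util;                     // LuceneVersion

// Analyzer copied from the report: split on non-alphanumerics, then
// emit every token twice at the same position via KeywordRepeatFilter.
class CustomAnalyzer : Analyzer
{
    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        var tokenizer = new LetterOrDigitTokenizer(LuceneVersion.LUCENE_48, reader);
        TokenStream stream = new StandardFilter(LuceneVersion.LUCENE_48, tokenizer);
        stream = new KeywordRepeatFilter(stream);
        return new TokenStreamComponents(tokenizer, stream);
    }
}

// Tokenizer copied from the report: a token is a run of letters or digits.
class LetterOrDigitTokenizer : CharTokenizer
{
    public LetterOrDigitTokenizer(LuceneVersion matchVersion, TextReader input)
        : base(matchVersion, input)
    {
    }

    protected override bool IsTokenChar(int c)
    {
        return char.IsLetterOrDigit((char)c);
    }
}

static class Program
{
    static void Main()
    {
        var parser = new QueryParser(LuceneVersion.LUCENE_48, "FieldName", new CustomAnalyzer());

        // Assumption: this property exists in Lucene.Net 4.8 and mirrors
        // Java's setAutoGeneratePhraseQueries(true). It asks the parser to
        // treat "Value_0" as one phrase rather than as independent
        // optional terms.
        parser.AutoGeneratePhraseQueries = true;

        var res = parser.Parse("+FieldName:Value_0");

        // Print the parsed query so it can be compared with the two
        // outputs quoted in the report.
        Console.WriteLine(res);
    }
}
```

Whether this is the behavior the reporter wanted is not confirmed by the thread; the setting only changes how one multi-token term is combined, not how the analyzer tokenizes it.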



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
