lucenenet-dev mailing list archives

From Björn Kremer <...@patorg.de>
Subject Re: Why does the FastVectorHighlighter crop words?
Date Thu, 01 Nov 2012 13:05:04 GMT
Hello,

I don't think the analyzer is the problem (I'm using the German
analyzer). My sample was inaccurate: the searched word is not cropped,
but its context is. So "Herr Karl Theodor Müller" becomes "eodor Müller"
if you search for the word "Müller". I think this is because of a
hardcoded "margin" in "SimpleFragListBuilder.cs" (function
CreateFieldFragList), so the FastVectorHighlighter only tries to get the
previous 6 chars, not the complete word. The mail server removed my
attachment, so I uploaded it here: http://pastehtml.com/view/cgw5kbc79.html
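
To illustrate the difference, here is a minimal sketch in Python that works on plain strings only. It is not the Lucene.NET code and does not use its API: the function names are hypothetical, and the margin of 6 is simply the value mentioned above. The first function mimics the fixed-margin substring; the second moves the start forward to the next complete whitespace-delimited token, as proposed in my first mail.

```python
def crop_with_margin(text, match_start, margin=6):
    """Start the fragment a fixed number of chars before the match.

    Assumption: this only imitates the behaviour described in this mail,
    it is not a copy of SimpleFragListBuilder.
    """
    start = max(0, match_start - margin)
    return text[start:]


def snap_to_word_start(text, match_start, margin=6):
    """Like crop_with_margin, but never start in the middle of a word."""
    start = max(0, match_start - margin)
    # If the char before `start` is not whitespace, we landed inside a word:
    # skip the rest of that partial word.
    if start > 0 and not text[start - 1].isspace():
        while start < match_start and not text[start].isspace():
            start += 1
    # Skip the whitespace in front of the next complete token.
    while start < match_start and text[start].isspace():
        start += 1
    return text[start:]


text = "Herr Karl Theodor Müller"
match_start = text.index("Müller")
print(crop_with_margin(text, match_start))    # eodor Müller
print(snap_to_word_start(text, match_start))  # Müller
```

With this variant the partial word "eodor" is dropped instead of shown cropped. Moving the start backward to the beginning of "Theodor" would be the other obvious choice, at the cost of a slightly longer fragment.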

Greetings
Björn


Am 01.11.2012 12:56, schrieb Itamar Syn-Hershko:
> What analyzer do you use on this field?
>
>
> On Thu, Nov 1, 2012 at 1:39 PM, Björn Kremer <bkr@patorg.de> wrote:
>
>> Hello,
>>
>> why does the FastVectorHighlighter crop the result words? At the moment it
>> simply does a substring at a given start index. I think it's better to do a
>> simple whitespace tokenization and take the next complete token instead of
>> a cropped word.
>>
>> For example. If the real value is "Hans Müller" the FastVectorHighlighter
>> may crop the word to "ns Müller". At the moment I'm using the attached code
>> to solve my problem.
>>
>>
>> Greetings
>> Björn
>>

