lucenenet-dev mailing list archives

From Ben Tregenna
Subject Re: What is the status of "DotLucene National Language Support Pack"?
Date Tue, 13 Jun 2006 23:16:05 GMT
George Aroush wrote:
> ...For the bug that you found in CJK
> (which I am confused why CJK is not in Snowball, even the Java version)
> please either submit it as a JIRA item or post the bug and fix here...
Submitted bug and fix to JIRA as suggested. It's keyed as LUCENENET-5.

Short version:
CJKTokenizer fails to stop at EOS - change the terminating condition,
because the buffer overload of .NET's TextReader.Read() returns 0, not -1,
at end of stream (-1 being what Java's Reader.read() returns).

Suggested diff for fix:

CJKTokenizer.cs: 162c162
< if (dataLen == -1)
---
> if (dataLen == 0)
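
To illustrate why the condition matters, here is a minimal sketch of a buffered read loop, not the actual CJKTokenizer code. The class EndOfStreamDemo and the method CountChars are hypothetical names made up for this example; the names ioBuffer and dataLen are likewise assumed for illustration, apart from dataLen appearing in the diff above. The only library behaviour relied on is that TextReader.Read(char[], int, int) returns 0 at end of stream.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class EndOfStreamDemo
{
    // Counts the characters available from `input` by repeatedly filling a
    // buffer, the same read pattern a tokenizer would use.
    public static int CountChars(TextReader input, int bufferSize)
    {
        char[] ioBuffer = new char[bufferSize];
        int total = 0;

        while (true)
        {
            int dataLen = input.Read(ioBuffer, 0, ioBuffer.Length);

            // .NET signals end of stream with 0 here. A condition ported
            // straight from Java (dataLen == -1) would never be true, so
            // the loop would not terminate.
            if (dataLen == 0)
            {
                break;
            }

            total += dataLen;
        }

        return total;
    }
}
```

Calling EndOfStreamDemo.CountChars(new StringReader("hello world"), 4) returns 11. Note that the parameterless TextReader.Read() overload does return -1 at end of stream, so the difference only applies to the overload that fills a buffer.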

Long Version:

