lucenenet-dev mailing list archives

From NightOwl888 <...@git.apache.org>
Subject [GitHub] lucenenet issue #191: Migrating Lucene.Net to .NET Core
Date Tue, 13 Dec 2016 22:13:04 GMT
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/191
  
    @conniey 
    
    I have nearly finished Highlighter on this branch
    
    https://github.com/NightOwl888/lucenenet/tree/netcoremigration-highlighter
    
    There is still 1 failing test, which doesn't appear to be related to BreakIterator. I was able to make [hacky solutions for the first 3 issues](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Highlighter/IcuBreakIterator.cs#L289-L306) I mentioned [above](https://github.com/apache/lucenenet/pull/191#issuecomment-266510336), but after reading the link you provided I am convinced that none of them will suffice for production, and we will need a `RuleBasedBreakIterator` to be sure we have all of the breaking rules set up correctly for international support.
    
    You can pull this now if you wish - let me know if you need me to work on that failing test.
    
    icu-dotnet
    ----
    
    Some classes that I have already ported that you may wish to migrate into icu-dotnet:
    
    - [BreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/BreakIterator.cs) - Note that there is one change from the original: the Text property returns string instead of CharacterIterator. In hindsight, this change may not have been necessary.
    - [CharacterIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/CharacterIterator.cs) - A GetTextAsString() method was added, primarily so we can get the text to pass on to icu-dotnet. The documentation hasn't yet been migrated from Java.
    - [StringCharacterIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Core/Support/StringCharacterIterator.cs) - Exactly like Java (but documentation not yet migrated).
    - [IcuBreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Highlighter/IcuBreakIterator.cs) - If you remove the hacks I put in to emulate the RuleBasedBreakIterator, it would probably be a useful addition to icu-dotnet.
    - [TestBreakIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Tests.Highlighter/TestBreakIterator.cs) - A partial set of tests that verify the sentence- and word-breaking rules of RuleBasedBreakIterator.
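
For reference, here is a rough sketch of what a "get the whole text as a string" helper has to do when written against the original Java `java.text.CharacterIterator` API that these ports mirror. It is Java rather than the C# port, and `getTextAsString` and the class name `CharacterIteratorText` are hypothetical names used only for illustration; this is not the code from the linked branch.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

public class CharacterIteratorText {

    // Hypothetical helper (not the code from the linked branch): copies the
    // iterator's whole range into a String and restores its position afterwards.
    static String getTextAsString(CharacterIterator it) {
        int saved = it.getIndex();
        StringBuilder sb = new StringBuilder(it.getEndIndex() - it.getBeginIndex());
        // Loop on the index rather than on DONE, since '\uFFFF' may be real text.
        for (char c = it.first(); it.getIndex() < it.getEndIndex(); c = it.next()) {
            sb.append(c);
        }
        it.setIndex(saved);
        return sb.toString();
    }

    public static void main(String[] args) {
        CharacterIterator it = new StringCharacterIterator("Hello world. How are you?");
        it.setIndex(6);
        System.out.println(getTextAsString(it));
        System.out.println(it.getIndex());
    }
}
```

Restoring the saved index matters because a break iterator that shares the same CharacterIterator would otherwise see its position change as a side effect of reading the text.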
    
    Note that CharacterIterator is a dependency of [CharArrayIterator](https://github.com/NightOwl888/lucenenet/blob/netcoremigration-highlighter/src/Lucene.Net.Analysis.Common/Analysis/Util/CharArrayIterator.cs), but since that is in Analysis.Common (which already depends on icu-dotnet), we can completely remove all of these classes from Lucene.Net once the functionality is available in icu-dotnet.


