lucenenet-dev mailing list archives

From laimis <...@git.apache.org>
Subject [GitHub] lucenenet pull request: Port CharArrayIterator
Date Fri, 18 Dec 2015 13:53:21 GMT
Github user laimis commented on the pull request:

    https://github.com/apache/lucenenet/pull/157#issuecomment-165783222
  
    I had similar concerns to yours, @synhershko, yet at the same time felt like it was the best
option for us. Some loose thoughts:
    
    - We all want to see some progress with the Lucene.Net port, and porting icu4j feels
like a roadblock to that initiative. I would rather use icu4net and make headway with the Lucene port
than take on another porting project at this stage. Perhaps we live with icu4net for as long as
we can but keep thinking about what will replace it down the road?
    
    - Just from taking a very rough look at icu4j, and BreakIterator for instance, it does
not feel like a straightforward port. I could be wrong though. I am also not sure whether we can
do that from the licensing perspective, since all of their classes carry a comment with an IBM copyright notice.
I have no clue about those kinds of things (see example here: http://source.icu-project.org/repos/icu/icu4j/trunk/main/classes/core/src/com/ibm/icu/text/BreakIterator.java)
    
    - icu4net is not active, true, but it is just a wrapper around the very active ICU4C library.
I was able to take ICU4NET, compile it, etc. So we could fork that project and create our
own nuget packages and wrapper classes if we wanted to. That would also include building a 64-bit
version of ICU4NET by packaging the 64-bit ICU4C libs. I have no experience with any of
this, so it is a bit of an unknown.
    
    - The best part of ICU4C is that the classes it exposes, so far at least, have been a
perfect match for Analysis at the API level. Porting CharArrayIterator was straightforward.
I started porting SegmentingTokenizerBase in Utils just to see if I would run into any issues, and
again, from the API perspective, I did not run into anything. It was straightforward to get to
the point where it compiles (tests are failing, will figure out why :) ).
    
    It just feels like, for the sake of making progress, ICU4NET is the way to go. It allows
us to port the Lucene code, and then we can yank ICU4NET out once we feel like we have a good alternative
for it.
    
    What do you think?
    
    While you consider it, I am continuing to figure out what to do with ICU4NET on the CI machine.
It depends on the init.ps1 script being run, but nuget restore has a bug where it does not do that
(http://jeffhandley.com/archive/2013/12/09/nuget-package-restore-misconceptions.aspx, search
for the "init.ps1" paragraph)
    
    And @jpsullivan, thanks for your input, appreciated. We don't have a lot of people working on
this, so it is good to hear the opinion of others and have someone review the code, etc.


