lucenenet-dev mailing list archives

From NightOwl888 <...@git.apache.org>
Subject [GitHub] lucenenet issue #179: Analysis work - Standard and Core namespaces (mostly)
Date Mon, 22 Aug 2016 21:00:34 GMT
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/179
  
    Okay, this is ready for review/merge now. I have already synced it with the master branch,
so it includes some commits you have already reviewed. You can merge the master branch
into your analysis-work branch (you might want to make a backup first) to filter them out of the
review.
    
    About 95% of Analysis.Common is passing all tests. Of the 45 failing tests (out of 1405),
the Synonym and Th (Thai) namespaces account for most of the failures. But there is enough working
functionality here that people will find it useful.
    
    Hunspell took quite a bit of time to get working right, but in the end I used
the pure 4.8.0 implementation without any of the enhancements from more recent versions of Lucene.
I discovered that the dictionary files are easy to find if you search for the file names on
http://www.filewatcher.com/. There are also [several FTP sites](https://github.com/NightOwl888/lucenenet/blob/4d7b23c4269f0348a37fd470a3339befc64332ec/src/Lucene.Net.Tests.Analysis.Common/Analysis/Hunspell/TestAllDictionaries.cs#L34-L36)
where you can grab the OpenOffice dictionaries. As in the Java version, the dictionary binaries
are not part of the repository, so if you want to test Hunspell with real dictionaries you
must enable the tests manually and download the dictionaries yourself.
    



