lucenenet-dev mailing list archives

From NightOwl888 <...@git.apache.org>
Subject [GitHub] lucenenet pull request #179: Analysis work - Standard and Core namespaces (m...
Date Mon, 08 Aug 2016 19:23:19 GMT
GitHub user NightOwl888 opened a pull request:

    https://github.com/apache/lucenenet/pull/179

    Analysis work - Standard and Core namespaces (mostly)

    I have completed most of the Standard and Core namespaces and managed to get the number
    of failing tests down to 11 (out of 251).

    There is still some work to be done on CharArrayMap and CharArraySet. At least one of the
    failing tests (and I suspect more) fails because we are not passing an object reference
    into CharArraySet, so it gets out of sync with CharArrayMap. Unfortunately, this isn't
    easy to fix because CharArrayMap is generic and CharArraySet is not. Ideas I have been
    kicking around:
    
    1. Since in Java you would just call CharArrayMap.Copy(), CharArrayMap.UnmodifiableMap(),
    etc. without specifying the generic type of CharArrayMap, perhaps there should be a static
    CharArrayMap class that contains these methods (making the methods themselves generic, as
    well as extension methods for CharArrayMap instances).
    2. One option for fixing the referencing issue would be to create a non-generic
    ICharArrayMap interface to pass to CharArraySet. The members would just access the generic
    internal type as an object member of the interface (which is really no worse than what we
    have now). I did this in BoboBrowse.Net, which had *many* issues with generics such as
    this, and it worked well.
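
    A minimal sketch of how the two ideas could fit together, under stated assumptions: the
    types below are simplified stand-ins rather than the actual Lucene.Net classes, keys are
    plain strings instead of char[], a Dictionary stands in for the real hash table, and the
    member names on ICharArrayMap are hypothetical. The code avoids C# 6 features.

```csharp
using System;
using System.Collections.Generic;

// Idea 2: a non-generic view of the map (hypothetical members).
// Values are exposed as object so a non-generic consumer can hold the reference.
public interface ICharArrayMap
{
    int Count { get; }
    bool ContainsKey(string key);
    object Put(string key, object value);
}

// Simplified stand-in for the generic map.
public class CharArrayMap<TValue> : ICharArrayMap
{
    private readonly Dictionary<string, TValue> items = new Dictionary<string, TValue>();

    public int Count { get { return items.Count; } }

    public IEnumerable<KeyValuePair<string, TValue>> Entries { get { return items; } }

    public bool ContainsKey(string key) { return items.ContainsKey(key); }

    // Returns the value stored before this call, or default(TValue) if there was none.
    public TValue Put(string key, TValue value)
    {
        TValue prior;
        items.TryGetValue(key, out prior);
        items[key] = value;
        return prior;
    }

    // Explicit implementation: the non-generic member forwards to the typed one.
    object ICharArrayMap.Put(string key, object value) { return Put(key, (TValue)value); }
}

// Idea 1: a static, non-generic class with generic methods, so callers
// can write CharArrayMap.Copy(map) and let the compiler infer TValue.
public static class CharArrayMap
{
    public static CharArrayMap<TValue> Copy<TValue>(CharArrayMap<TValue> source)
    {
        var copy = new CharArrayMap<TValue>();
        foreach (var entry in source.Entries)
        {
            copy.Put(entry.Key, entry.Value);
        }
        return copy;
    }
}

// The non-generic set keeps a reference to the map it was given,
// so changes made through the set are visible through the map.
public class CharArraySet
{
    private static readonly object Placeholder = new object();
    private readonly ICharArrayMap map;

    public CharArraySet(ICharArrayMap map) { this.map = map; }

    public int Count { get { return map.Count; } }

    // True when the text was not already present (Put returned no prior value).
    public bool Add(string text) { return map.Put(text, Placeholder) == null; }

    public bool Contains(string text) { return map.ContainsKey(text); }
}

public static class Demo
{
    public static void Main()
    {
        var map = new CharArrayMap<object>();
        var set = new CharArraySet(map);
        set.Add("the");
        Console.WriteLine(map.ContainsKey("the")); // prints "True"

        var copy = CharArrayMap.Copy(map);          // TValue inferred as object
        set.Add("and");
        Console.WriteLine(copy.Count);              // prints "1"
    }
}
```

    C# allows CharArrayMap and CharArrayMap<TValue> to coexist because they differ in generic
    arity, which is what would make the Java-style call syntax from idea 1 possible.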
    
    Unfortunately, I didn't do as much cleanup work as I had hoped to here, but I hope you
    will be lenient since this is a WIP branch and I suspect you are chomping at the bit to
    make some more progress on this. I have some urgent business to attend to and might not
    be available for a few days, so I am leaving you with what I have so far.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NightOwl888/lucenenet analysis-work

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucenenet/pull/179.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #179
    
----
commit 32ecd80d4d1f9d20760b41d2b7685f10d49f7f0c
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-03T07:10:24Z

    Implemented IsReadOnly property and removed C# 6 features because the project doesn't
    compile with VS 2012.

commit 58e4bd8fb4a87bc665e398adfa06c6f04c1a81a4
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-03T07:16:03Z

    Ported WordlistLoader + tests.

commit 09bcbbc4fa4ec73efcf0a2147251350ff84dd453
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-03T10:07:06Z

    Ported StopFilter + tests (except those that depend on StopAnalyzer).

commit b97c804c7a40dc271fa78070ffc20605eae1dd12
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-03T10:20:00Z

    Stop filter cleanup

commit 1dd68c344cce2161af37ff380027b37fc0c737b0
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-03T17:18:37Z

    Ported Analysis.Core.StopAnalyzer + tests.

commit 2aad15489a08805e3d20639c881d081507b62e17
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-04T11:18:11Z

    Fixed infinite recursion issue and several other bugs with CharArrayMap and CharArraySet.

commit 0bea87f5aa7b7ed607fd912e4c26be8510cf5c58
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-04T22:35:45Z

    Fixed bug in Core.Util.SPIClassIterator - it was ignoring types that have constructor
    parameters, but the AnalysisSPILoader uses parameters.

commit c8aefca23903f0fdc3d45ad83da10ef9df64ae6c
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T00:07:01Z

    Added TestStandardFactories and all dependencies so the tests pass.

commit 6b42627ca6bede6fb6fdc430a89f2ac72deb6a1b
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T14:58:29Z

    Ported URLEmailTokenizer classes and tests.

commit 52f3b5e4d56576704b3cb69a61abcafa4e5e3152
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T17:48:36Z

    Reset all packed chars to Lucene 4.8.0 state to be 100% sure that no errors were made.

commit 3deb0bd487ed3e26af6044e283ff5030d0c654c5
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T19:10:12Z

    Ported backward compatibility for StandardTokenizer.

commit f0d32c31ad44d97d42c18b8e8c787e9742e53167
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T20:24:25Z

    Ported the ClassicAnalyzer to complete the Analysis.Standard namespace.

commit 3facf5eb23e0d4130204ce3bc4d113472ff1b90e
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T20:27:58Z

    Added the LongRunningTest attribute to TestSegmentingTokenizerBase.TestRandomStrings()
    and TestRollingCharBuffer.Test().

commit 35ebd5447c7438413662a8fb08567aa97aeb9405
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-05T23:24:08Z

    Added Analysis.Core.StopFilterFactory + tests. Fixed some bugs in the
    ClasspathResourceLoader and CharArraySet in the process.

commit 4114f55e9e2930af448d555464f6eb950fd15f5d
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T04:59:35Z

    Core.Util.NamedSPILoader bug fix: Added logic to check for a default constructor before
    attempting to instantiate (previously removed from Core.Util.NamedSPIIterator to fix
    another bug).

commit 14a8bce634b0e5fd6293d956ac295fdb965ae8b3
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T12:44:54Z

    Ported TestAnalyzers.

commit 6d122a64e1a6da32456a79fe8f895a3521525855
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T14:17:31Z

    Ported TestClassicAnalyzer

commit 03610c9e37585ff3009798159f868113e4c5d1d2
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T15:44:44Z

    Ported TestStandardAnalyzer and fixed a bug in StandardTokenizerImpl40

commit 4838ea4da147b4d678b0787eac7428e4e936018d
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T17:01:02Z

    Ported TestFactories and fixed bugs in MockTokenizer

commit fdb922cf32482c3559d7bc671f53dd94b8c03257
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T20:17:26Z

    Ported TestUAX29URLEmailAnalyzer and TestUAX29URLEmailTokenizer and added extensions to
    assist with reading embedded resources in test projects.

commit 81c7f72c9d464deac8447cc4e4d3c5f9f641a75d
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T20:21:21Z

    Ported TestKeywordAnalyzer

commit 96b9f942bcc6d7aea33993b85401056b11a5032d
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T20:24:04Z

    Ported TestDuelingAnalyzers

commit bd6dc633a459346659e57751e3b5fdbdb24ac80c
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-06T21:36:55Z

    Ported TypeTokenFilter and TypeTokenFilterFactory + tests.

commit 11edb46dcff9f2ef5be046eaf71a77f847e9dbb4
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-07T22:16:46Z

    Fixed both the CharArrayMap and the test to make the TestCharArrayMap test pass. Both
    made the assumption that the return value of Put is the current value, when it is
    actually the prior value before setting it.

commit 560aad7ad8e82b3114dae71c25e90a4b8f81c7d8
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-07T22:33:35Z

    Revert "Trying to fix tests" - already mostly completed
    
    This reverts commit 76c4a537df45fe52aca8f305927ef368a895308f.

commit 03b463980b4fd5cd2071fbd24a0d1ed4730e1299
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-07T22:35:28Z

    Merge branch 'analysis-work-stale' into analysis-work

commit f8bbf1295159a476bfe03c04b15f247b5030a210
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-08T00:43:47Z

    Fixed bug in Core.Support.Character.IsLetter(). Char.ConvertFromUtf32() throws an
    exception for characters from 0x00d800 to 0x00dfff inclusive, so there is no way to check
    if they are a letter. Passing the char directly to Char.GetUnicodeCategory() resolves
    several test failures including TestCrossPlatformNormalization (& 2), TestLetterUnicode,
    and TestLetterUnicodeHuge.
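
    The behavior described in this commit message can be illustrated with a short sketch.
    This is not the actual Core.Support.Character code: the CharacterSketch class and its
    IsLetter(int) signature are assumptions made for illustration, and code points outside
    the range 0 to 0x10FFFF are not handled.

```csharp
using System;
using System.Globalization;

public static class CharacterSketch
{
    // Hypothetical helper; only the two framework calls are real APIs.
    public static bool IsLetter(int codePoint)
    {
        UnicodeCategory category;
        if (codePoint >= 0xD800 && codePoint <= 0xDFFF)
        {
            // Char.ConvertFromUtf32 throws ArgumentOutOfRangeException for this
            // range, so ask for the category of the UTF-16 code unit directly.
            category = Char.GetUnicodeCategory((char)codePoint);
        }
        else
        {
            // Supplementary code points become a surrogate pair, which
            // CharUnicodeInfo.GetUnicodeCategory(string, int) treats as one character.
            string text = Char.ConvertFromUtf32(codePoint);
            category = CharUnicodeInfo.GetUnicodeCategory(text, 0);
        }

        switch (category)
        {
            case UnicodeCategory.UppercaseLetter:
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.TitlecaseLetter:
            case UnicodeCategory.ModifierLetter:
            case UnicodeCategory.OtherLetter:
                return true;
            default:
                return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsLetter('a'));    // prints "True"
        Console.WriteLine(IsLetter(0xD800)); // prints "False" (category is Surrogate)
    }
}
```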

commit 58db5a397e3d49b19a3bebfbb79286560edfb856
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-08T10:23:55Z

    Cleanup comments.

commit 4a4278cde2978dc1c5ed9e2a6b549bc189e3a5ec
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-08T15:41:11Z

    Changed WordlistLoader to automatically close TextReaders as was done in the original.

commit c4c62196d89909438604b57741a80f97e9deec92
Author: Shad Storhaug <shad@shadstorhaug.com>
Date:   2016-08-08T16:01:41Z

    Fixed TextReader bug in RollingCharBuffer

----


