lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shad Storhaug <s...@shadstorhaug.com>
Subject Remaining Work/Priorities
Date Sun, 02 Oct 2016 19:01:02 GMT
Hello,

I just wanted to open this discussion to talk about the work remaining to be done on Lucene.Net
version 4.8.0. We are nearly there, but that doesn't mean we don't still need help!


FAILING TESTS
-------------------

We now have over 5000 passing tests and as soon as pull request #188 (https://github.com/apache/lucenenet/pull/188)
is merged, by my count we have only 20 (actual) failing tests. Here is the breakdown by project:

Lucene.Net (Core) - 15 failing / 1989 total
Lucene.Net.Analysis.Common - 0 failing / 1445 total
Lucene.Net.Classification - 0 failing / 9 total
Lucene.Net.Expressions - 0 failing / 94 total
Lucene.Net.Facet - (including #188 will be) 0 failing / 152 total
Lucene.Net.Join - 0 failing / 27 total
Lucene.Net.Memory - 0 failing / 10 total
Lucene.Net.Misc - 2 failing / 42 total
Lucene.Net.Queries - 2 failing / 96 total
Lucene.Net.QueryParser - 1 failing / 203 total
Lucene.Net.Suggest - 0 failing / 142 total

The reason why I said ACTUAL tests above is because I recently discovered that many of the
"failures" that are being reported are false negatives (in fact, the VS2015 NUnit test runner
shows there are 135 failing tests total and 902 tests total that don't belong to any project).
Most NUnit 2.6 test runners do not correctly run tests in shared abstract classes with the
correct context (test setup) to make them pass. These out-of-context runs add several additional
minutes to the test run.

As an experiment, I upgraded to NUnit 3.4.1 and it helped the situation somewhat - that is,
it ran the tests in the correct context and I was able to determine that we have more tests
than the numbers above and they are all succeeding. However, it also ran the tests in an invalid
context (that is, the context of the abstract class without any setup) and some of them still
showed as failures.

I know @conniey is currently working on porting the tests over to xUnit. Hopefully, swapping
test frameworks alone (or using some of the new fancy test attributes) is enough to fix this
issue. If not, we need to find another solution - preferably one that can be applied to all
of the tests in abstract classes without too much effort or changing them so they are too
different from their Java counterpart.

Remaining Pieces to Port
---------------------------------

I took an inventory of the remaining pieces left to port a few days ago and here is what that
looks like (alphabetical order):

1. Analysis.ICU (Depends on ICU4j)
2. Analysis.Kuromoji
3. Analysis.Morfologik (Depends on Morfologik)
4. Analysis.Phonetic (Depends on Apache Commons)
5. Analysis.SmartCN
6. Analysis.Stempel (currently in progress)
7. Analysis.UIMA (Depends on Tagger, uimaj-core, WhiteSpaceTokenizer)
8. Benchmark (many dependencies)
9. Demo
10. Highlighter (Depends on Collator (which is still being ported) and BreakIterator (which
we don't have a solution that works in .NET core yet))
11. Replicator (many dependencies)
12. Sandbox (Depends on Apache Jakarta)
13. Spatial (Already ported in #174 (https://github.com/apache/lucenenet/pull/174), needs
a recent version of spatial4n)
14. QueryParser.Flexible

Itamar, it would be helpful if you would be so kind as to organize this list in terms of priority.
It also couldn't hurt to update the contributing documents (https://github.com/apache/lucenenet/blob/master/CONTRIBUTING.md,
and https://cwiki.apache.org/confluence/display/LUCENENET/Current+Status with the latest information
so anyone who wants to help out knows the current status.

Of course, it is the known status of dependencies that we need clarification on. Which of
these dependencies is known to be ported? Which of them are ported but are not up to date?
Which of them are known not to be ported, and which of them are unknown?


Public API Inconsistencies
---------------------------------

One thing that I have had my eye on for a while now is the .NETification/consistency of the
core API (that is, in the Lucene.Net project). There are several issues that I would like
to address including:


1.       Method names that are still camelCase

2.       Properties that should be methods (because they do a lot of processing or because
they are non-deterministic)

3.       Methods that should be properties

4.       .Size() vs .Size vs .Count - should generally all be .Count in .NET

5.       Interfaces should begin with "I"

6.       Classes should not begin with "I" followed by another capital letter (for some reason
some of them were named that way)

7.       .CharAt() should probably be this[]

8.       Generic types nested within generic types (which cause Visual Studio to crash when
Intellisense tries to read them)

... and so on. The only thing is these are all sweeping changes that will affect everyone
helping out on Lucene.Net and anyone who is currently using the beta. So, I just wanted to
gather some input on when the most appropriate time to begin working on these sweeping changes
would be?


Thanks,
Shad Storhaug (NightOwl888)







Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message