lucenenet-dev mailing list archives

From laimis <...@git.apache.org>
Subject [GitHub] lucenenet pull request: init bytesref on each iteration
Date Sat, 11 Apr 2015 13:16:22 GMT
GitHub user laimis opened a pull request:

    https://github.com/apache/lucenenet/pull/127

    init bytesref on each iteration

    Fixes a bug where bytes were written to the index incorrectly if Lucene40Codec was used.
The issue can be observed by following the data flow starting here:
    
    https://github.com/apache/lucenenet/blob/master/src/Lucene.Net.TestFramework/Codecs/lucene40/Lucene40DocValuesWriter.cs#L176
    
    Instead of returning an array of distinct values, enumerating via values.ToArray() returns
an array whose members all point to the same BytesRef instance. This broke the logic that detects
min / max lengths further downstream, and as a result incorrect bytes were written to the index.
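
The aliasing described above can be sketched in a language-neutral way. This is a minimal illustration in Python, not the Lucene.NET code: the BytesRef class and reused_iterator function here are hypothetical stand-ins, and the generator plays the role of an iterator that mutates one shared instance.

```python
from dataclasses import dataclass


@dataclass
class BytesRef:
    # Hypothetical stand-in for a mutable bytes holder; not the Lucene.NET type.
    data: bytes = b""


def reused_iterator(values):
    # Assumption: mirrors an iterator that allocates one instance up front
    # and overwrites its contents on every step.
    ref = BytesRef()
    for v in values:
        ref.data = v
        yield ref


values = [b"a", b"bbb", b"cc"]

# Lazy consumption: each item is inspected before the next overwrite happens.
streamed = [len(r.data) for r in reused_iterator(values)]

# Materializing first (the ToArray()-style usage): every element is the same
# object, so all of them show whatever was written last.
materialized = list(reused_iterator(values))
lengths = [len(r.data) for r in materialized]

print(streamed)  # [1, 3, 2]
print(lengths)   # [2, 2, 2]
```

With lazy consumption the minimum and maximum lengths are 1 and 3; after materializing they both come out as 2, which is the kind of wrong min / max detection described above.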
    
    This bug did not show up until Lucene40DocValuesWriter was rewritten to call ToArray().
Here is a gist illustrating the difference between the two usages with the bug in place: https://gist.github.com/laimis/d49291d2f02a7f4830bf
    
    The fix removes the BytesRef value = new BytesRef(); initialization and instead creates
a new BytesRef instance each time one is returned.
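
The shape of such a fix can be sketched as follows. This is again a hedged Python illustration rather than the actual patch: BytesRef and fresh_iterator are hypothetical names, and the only point is that a new instance is created on each iteration.

```python
class BytesRef:
    # Hypothetical stand-in for a mutable bytes holder; not the Lucene.NET type.
    def __init__(self, data=b""):
        self.data = data


def fresh_iterator(values):
    # Assumption: mirrors the fixed iterator, which creates a new instance
    # for every value instead of overwriting a shared one.
    for v in values:
        yield BytesRef(v)


values = [b"a", b"bbb", b"cc"]

# Materializing the sequence is now safe: each element is its own object.
materialized = list(fresh_iterator(values))
lengths = [len(r.data) for r in materialized]
distinct = len({id(r) for r in materialized})

print(lengths)   # [1, 3, 2]
print(distinct)  # 3
```

The trade-off is one extra allocation per value, in exchange for results that stay correct whether the caller consumes the sequence lazily or copies it into an array first.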
    
    There might be more places in the code base that suffer from this; the iterators that
use a similar approach will need to be reviewed.
    
    With the fix in place all Lucene40Codec tests pass.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/laimis/lucenenet binarydocvalueswriter_fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucenenet/pull/127.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #127
    
----
commit 07648a3b20ea9055bbae91cc5cf529c8fe1fc3f3
Author: Laimonas Simutis <laimis@gmail.com>
Date:   2015-04-11T13:01:53Z

    init bytesref on each iteration

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
