lucenenet-dev mailing list archives

From: Shad Storhaug <s...@shadstorhaug.com>
Subject: RE: [Vote] Apache Lucene.Net 4.8.0-beta00002
Date: Sat, 13 May 2017 16:55:32 GMT
Well, I am going to have to be the naysayer this time.

-1

It turns out the "facet package issue" is actually a bug that causes index corruption
with many of the codecs, including the default Lucene46Codec. It happens rarely and
seemingly at random when reading or writing binary doc values, because of a rounding bug;
when it occurs during a write, the index is corrupted. Since the bug is in the
MonotonicBlockPacked* classes, several codecs are affected.

The main issue is that 32-bit and 64-bit applications end up reading and writing indexes
differently, and I have confirmed with Lucene that this is not supposed to happen.
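
A minimal sketch of how such a divergence can arise. It assumes (this is not stated above) that the two runtimes keep different intermediate precision for a float multiplication before the result is truncated to long. The class name, method names, and sample values are hypothetical.

```csharp
using System;

// Hypothetical illustration; not code from Lucene.Net.
public static class FloatTruncationSketch
{
    // Product rounded to 32-bit float precision before truncation to long.
    // The explicit (float) cast is there to discard any extra intermediate precision.
    public static long ViaFloat(float avg, long idx)
    {
        return (long)(float)(avg * (float)idx);
    }

    // Same product carried out in double precision before truncation to long.
    public static long ViaDouble(float avg, long idx)
    {
        return (long)((double)avg * (double)idx);
    }
}
```

With avg = 0.7f and idx = 10, ViaFloat returns 7 and ViaDouble returns 6: 0.7f is stored slightly below 0.7, so the product falls just short of 7 unless it is rounded back to float precision first. A reader and a writer that disagree in this way would disagree by one on the expected value.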

The fix I have come up with is to use System.Numerics.BigInteger in order to work around the
loss of precision when doing mathematical operations on a float data type. For example, this
line (https://github.com/apache/lucenenet/blob/14c3e7b1262a62727b532ea22f093d7d99cb2311/src/Lucene.Net/Util/Packed/MonotonicBlockPackedReader.cs#L84)
becomes

    return minValues[block] + (long)BigInteger.Multiply(new BigInteger(idx), new BigInteger(averages[block]))
        + BlockPackedReaderIterator.ZigZagDecode(subReaders[block].Get(idx));

and this line (https://github.com/apache/lucenenet/blob/14c3e7b1262a62727b532ea22f093d7d99cb2311/src/Lucene.Net/Util/Packed/MonotonicBlockPackedWriter.cs#L82)
becomes

    m_values[i] = ZigZagEncode(m_values[i] - min - (long)(BigInteger.Multiply(new BigInteger(avg),
        new BigInteger(i))));
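
To see why a small difference in that product matters, here is a rough sketch of the round trip implied by the two lines above: the writer stores only the deviation from min plus an expected value, and the reader adds the same terms back. The class and method names are hypothetical, `expected` stands in for the (long) result of the average-times-index product, and the zig-zag helpers follow the usual definition rather than being copied from Lucene.Net.

```csharp
using System;

// Hypothetical illustration of the monotonic round trip; not code from Lucene.Net.
public static class MonotonicRoundTripSketch
{
    // Usual zig-zag mapping: small negative and positive deltas become small non-negative values.
    public static long ZigZagEncode(long n)
    {
        return (n >> 63) ^ (n << 1);
    }

    public static long ZigZagDecode(long n)
    {
        return (long)((ulong)n >> 1) ^ -(n & 1);
    }

    // Writer side: keep only the deviation from (min + expected).
    public static long Encode(long value, long min, long expected)
    {
        return ZigZagEncode(value - min - expected);
    }

    // Reader side: rebuild the value from the stored deviation.
    // This is only correct if 'expected' matches what the writer used.
    public static long Decode(long stored, long min, long expected)
    {
        return min + expected + ZigZagDecode(stored);
    }
}
```

For example, Encode(107, 100, 7) stores 0. Decoding that with the same expected value 7 gives back 107, but decoding it with 6 gives 106, so any disagreement between reader and writer on the expected value silently changes the data.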

If anyone has other ideas for how this can be done without adding a dependency on System.Numerics,
I would appreciate the suggestion. I have tried several other approaches, including casting
to double/decimal before the multiplication, to no avail. The math is correct until it is cast
to a long, at which point it loses precision. This is the solution I ended up with for the
SimpleText codec and it seems to work here, too. According to MSDN, a common solution
is to use a BCD library to maintain precision (https://msdn.microsoft.com/en-us/library/c151dt3s.aspx).



Thanks,
Shad Storhaug (NightOwl888)



-----Original Message-----
From: itamar.synhershko@gmail.com [mailto:itamar.synhershko@gmail.com] On Behalf Of Itamar
Syn-Hershko
Sent: Friday, May 12, 2017 7:54 PM
To: dev@lucenenet.apache.org
Subject: Re: [Vote] Apache Lucene.Net 4.8.0-beta00002

+1. Thanks again for great work.

There is the Facet package issue I just forwarded to the list, plus some small tweaking that
I'd like us to add to the next release (still beta), but that's not a blocker for a first
official beta version.

Yay!

--

Itamar Syn-Hershko
Freelance Developer & Consultant
Elasticsearch Partner
Microsoft MVP | Lucene.NET PMC
http://code972.com | @synhershko <https://twitter.com/synhershko> http://BigDataBoutique.co.il/

On Wed, May 10, 2017 at 2:57 PM, Shad Storhaug <shad@shadstorhaug.com>
wrote:

> Due to a severe concurrency bug in 4.8.0-beta00001 we are rolling 
> another release. Note this is not the same release that we just voted 
> on, but a patch for it with a new version number. We've decided to 
> postpone the official announcement until this patch is live.
>
>
>
> The source and binary packages are available for inspection at:
> https://dist.apache.org/repos/dist/dev/lucenenet/.
>
>
>
> There is a MyGet feed that can be accessed at:
>
> V2: https://www.myget.org/F/lucene-net-nuget/api/v2 (VS2012+)
>
> V3: https://www.myget.org/F/lucene-net-nuget/api/v3/index.json 
> (VS2015+)
>
>
>
> The tag is: https://github.com/apache/lucenenet/releases/tag/Lucene.Net_4_8_0_beta00002
>
>
>
>
>
> Please review the beta and vote (build and test instructions to follow).
>
> This vote will close no sooner than 72 hours from now, i.e. sometime
> after 12:00 UTC 13-May 2017
>
>
>
> +1 - release it, already!
>
> 0 - indifferent
>
> -1 - Not ready, because...
>
>
>
> --------------------------------------------------
>
>
>
> Building and Testing (Windows Only)
>
>
>
> --------------------------------------------------
>
>
>
> CLI - Prerequisites
>
>
>
> 1.       Powershell 3.0 or higher
>
> 2.       .NET Framework 4.5.1 Developer Pack
> (https://www.microsoft.com/en-us/download/details.aspx?id=40772)
>
>
>
> Command (from the project root)
>
>
>
> build     (run build and create NuGet packages in the
> release/NugetPackages directory)
>
> build --test     (run build and create NuGet packages in the
> release/NugetPackages directory, run all tests and put results in 
> release/TestResults directory)
>
> build -t     (same as above with shorter syntax)
>
>
>
> --------------------------------------------------
>
>
>
> Visual Studio - Prerequisites
>
>
>
> 1.       Visual Studio 2015 Update 3
>
> 2.       .NET Framework 4.5.1 Developer Pack
> (https://www.microsoft.com/en-us/download/details.aspx?id=40772)
>
> 3.       1.1 with SDK Preview 2.1 build 3177
> (https://github.com/dotnet/core/blob/master/release-notes/download-archive.md)
>
> 4.       NUnit3 Test Adapter
> (https://marketplace.visualstudio.com/items?itemName=NUnitDevelopers.NUnit3TestAdapter)
>
>
>
> Use Lucene.Net.sln for .NET Framework 4.5.1. Use 
> Lucene.Net.Portable.sln for .NET Standard/.NET Core.
>
>
>
> NOTE: To compile in .NET Core, you may need to first run "dotnet restore"
> from the CLI before opening the solution in Visual Studio 2015. Visual 
> Studio 2017 is not supported.
>
>
>
> --------------------------------------------------
>
>
>
>
>
> Thanks,
>
> Shad Storhaug (NightOwl888)
>
>
>