lucenenet-dev mailing list archives

From "Shad Storhaug (Jira)" <j...@apache.org>
Subject [jira] [Resolved] (LUCENENET-618) Need a Cross-OS NativeFSLock Implementation
Date Sat, 26 Oct 2019 19:09:00 GMT

     [ https://issues.apache.org/jira/browse/LUCENENET-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Shad Storhaug resolved LUCENENET-618.
-------------------------------------
    Resolution: Fixed

The resolution to this issue involved 2 parts:
 # On non-Windows platforms, deliberately provoke the {{IOException}}s by creating a temp file,
and cache the HResult value reported for each known scenario. The cached value is then compared
against the HResult of any {{IOException}} caught at runtime.
 # Refactor the {{SharingAwareNativeFSLock}} into 2 different classes, {{NativeFSLock}}
and {{SharingNativeFSLock}}. If the platform supports it (Windows), use {{NativeFSLock}}.
Use {{SharingNativeFSLock}} if provoking the exception succeeded in #1; otherwise fall back
to {{FallbackNativeFSLock}}, the original non-thread-safe implementation.

Fallback should almost never happen in practice. It is only there to catch scenarios such
as a zero-valued HResult (i.e. the underlying platform does not specify one) or a temp
file that cannot be created because the system is out of disk space or temp file names.
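
The selection logic described above can be sketched roughly as follows. This is an illustration only, not the code that was committed: the names {{SharingViolationProbe}}, {{LockKind}} and {{LockSelector}} are hypothetical, and the sketch assumes that opening the same file twice with {{FileShare.None}} raises the sharing-violation {{IOException}} on the current platform.
{code:c#}
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical names throughout; a sketch of the approach, not the actual Lucene.NET code.
internal enum LockKind { Native, Sharing, Fallback }

internal static class SharingViolationProbe
{
    // Determined once per process and cached. Zero means "could not be determined".
    public static readonly int HResult = Provoke();

    private static int Provoke()
    {
        string path;
        try
        {
            path = Path.GetTempFileName();
        }
        catch (IOException)
        {
            return 0; // e.g. out of disk space or temp file names
        }

        try
        {
            // Assumption: the second open fails with a sharing violation
            // while the first handle is still held.
            using (new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.None))
            using (new FileStream(path, FileMode.Open, FileAccess.Write, FileShare.None))
            {
                return 0; // no exception was provoked
            }
        }
        catch (IOException ex)
        {
            return ex.HResult; // may still be zero if the platform does not specify one
        }
        finally
        {
            try { File.Delete(path); } catch (Exception) { /* best-effort cleanup */ }
        }
    }
}

internal static class LockSelector
{
    // Pure decision function so the three outcomes are easy to reason about.
    public static LockKind Choose(bool isWindows, int sharingViolationHResult)
    {
        if (isWindows) return LockKind.Native;
        return sharingViolationHResult != 0 ? LockKind.Sharing : LockKind.Fallback;
    }

    public static LockKind ForCurrentPlatform() =>
        Choose(RuntimeInformation.IsOSPlatform(OSPlatform.Windows), SharingViolationProbe.HResult);
}
{code}
At runtime, a lock built this way would compare the HResult of a caught {{IOException}} against the cached value to decide whether the failure means "already locked" or is some unrelated I/O error.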

Testing has shown that macOS tests complete about 15% faster than before the change on netcoreapp2.1;
on netcoreapp1.1, however, performance decreased by about the same amount. Given that
official support for .NET Core 1.1 has ended, this is a reasonable tradeoff.

> Need a Cross-OS NativeFSLock Implementation
> -------------------------------------------
>
>                 Key: LUCENENET-618
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-618
>             Project: Lucene.Net
>          Issue Type: Task
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Shad Storhaug
>            Assignee: Shad Storhaug
>            Priority: Critical
>              Labels: cross-platform, linux, macOS
>             Fix For: Lucene.Net 4.8.0
>
>
> Now that we have (finally) begun testing on platforms other than Windows, it has been
discovered that the {{NativeFSLock}} doesn't work reliably on other platforms.
> Due to differences in how Java and .NET handle file system locking, we have switched
from the Lucene 4.8.0 implementation to [one that is based on {{FileStream.Lock}}|https://lucene.markmail.org/search/?q=lucenenet%20An%20alternative%20NativeFSLockFactory#query:lucenenet%20An%20alternative%20NativeFSLockFactory+page:1+mid:pdx6nwciw4rn75l6+state:results].
While this implementation does include support for .NET Framework/.NET Standard 1.x/.NET Standard
2.x, it relies on {{HResult}} error codes that exist only on Windows.
> I have done some research, and it turns out that [using {{HResult}} error codes cannot
be made reliably portable across platforms|https://stackoverflow.com/a/46381756]. So, I used
the original implementation we had as a fallback when the OS is not Windows.
> {code:c#}
>         internal virtual Lock NewLock(string path)
>         {
>             if (Constants.WINDOWS)
>                 return new WindowsNativeFSLock(this, m_lockDir, path);
>             // Fallback implementation for unknown platforms that don't rely on HResult
>             return new NativeFSLock(this, m_lockDir, path);
>         }
> {code}
> Unfortunately, we are now seeing error messages when testing on macOS and Linux, which
are a clear symptom of resource locking failures:
> {quote}{{System.IO.IOException : The process cannot access the file '/tmp/LuceneTemp/index-MMapDirectory-i0xc4fz2/write.lock'
because it is being used by another process.}}{quote}
> {quote}{{at Lucene.Net.Util.IOUtils.DisposeWhileHandlingException(Exception priorException, IDisposable[] objects) in D:\a\1\s\src\Lucene.Net\Util\IOUtils.cs:line 197}}
> {{at Lucene.Net.Store.Directory.Copy(Directory to, String src, String dest, IOContext context) in D:\a\1\s\src\Lucene.Net\Store\Directory.cs:line 214}}
> {{at Lucene.Net.Store.MockDirectoryWrapper.Copy(Directory to, String src, String dest, IOContext context) in D:\a\1\s\src\Lucene.Net.TestFramework\Store\MockDirectoryWrapper.cs:line 1320}}
> {{at Lucene.Net.Store.RAMDirectory..ctor(Directory dir, Boolean closeDir, IOContext context) in D:\a\1\s\src\Lucene.Net\Store\RAMDirectory.cs:line 106}}
> {{at Lucene.Net.Index.TestTermVectorsWriter.TestTermVectorCorruption() in D:\a\1\s\src\Lucene.Net.Tests\Index\TestTermVectorsWriter.cs:line 446}}{quote}
> It turns out the implementation that was used as a fallback had been [contributed back
when .NET Framework was the only target framework in Lucene.Net|https://github.com/apache/lucenenet/pull/70#issuecomment-72958293]
and it sounds like running on non-Windows platforms was not being considered in its implementation.
> As a result, we only have 2 implementations that work on Windows and none that work reliably
on other platforms.
> The {{NativeFSLock}} implementation does prevent the "file in use" error from happening
frequently on non-Windows platforms, but doesn't prevent it from happening completely. However,
the {{WindowsNativeFSLock}} is reliably making all of the tests pass on Windows across several
dozen test runs.
> IMO, the approach being used on Windows is fine, but for other operating systems we should
fall back to an implementation that is more reliable than the one we currently have.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
