lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Van Den Berghe, Vincent" <Vincent.VanDenBer...@bvdinfo.com>
Subject RE: CreateTempFile is not thread-safe
Date Mon, 02 Jan 2017 13:54:25 GMT
Hello Shad,

Thank you for replying. However, I fail to understand both points of your reasoning. Given
my zero expertise regarding porting code from Java to C#, this may be normal.
Regarding point, 1, the statement:

	File.WriteAllText(fileName, string.Empty, new System.Text.UTF8Encoding(false) /* No BOM */);

... creates a file with 0 (zero) bytes. Unless I'm missing something, this is *exactly* the
same as this:

                           using (new FileStream(fileName, FileMode.CreateNew))
                           {
                                  // do nothing
                           }

... so why there is a problem with the byte order mark issue is beyond me. The net effect
of both statements (if they succeed) is the same.

Regarding point 2, your remark about " this approach seems to keep the file open too long,
causing a "file already in use"" is exactly the intended behavior: the exception signals that
another thread already (successfully) created a temporary file with the same name, and another
attempt is required to create a unique name for this thread. The current code is broken exactly
because File.WriteAllText fails to report such an error. I therefore do not see why you mention
this behavior as being problematic.
However, you do have a point: I failed to notice that the exception handler is ill-defined.
Instead of:

                catch (IOException e)
                {
                    if (!e.Message.Contains("already exists"))
                    {
                        throw e;
                    }

                    attempt++;
                }

... one should simply write:

                catch (IOException e)
                {
	if (!File.Exists(fileName))
                        throw e;

                    attempt++;
                }

There are two reasons why this should be done:
1.  The " e.Message.Contains("already exists")" test doesn't work on localized versions of
the .NET runtime. In general, relying on textual representations of exception messages to
control behavior is not advisable. Sadly, IOException doesn't provide sufficient information
to disambiguate its individual errors.
2. The only statement in the try-clause that can cause an IOException where the file does
in fact exists is  the "new FileStream" statement, which is exactly what we want to check.
When an IOException is thrown and the file doesn't exist, we want to report it. There is no
ambiguity here.

I apologize for the omission.

I agree that relying on exceptions being thrown to create something in a thread safe way is
less than ideal, but unless you're willing to use non-portable P/Invoke calls (the GetTempFileName
Win32 API), this appears to be the only option.

If you want to mimic Java (not always a good idea, but that's my personal opinion), you can
investigate "Path.GetRandomFileName()" which generates a filename based on RNGCryptoServiceProvider
but on many systems this generate filenames in the 8.3 format. 
Another option is to use Guid.NewGuid().ToString("N") to generate a unique name based on a
GUID.
In any case, even though collisions are almost impossible, you still need to perform the above
to reduce it to zero.


Vincent


-----Original Message-----
From: Shad Storhaug [mailto:shad@shadstorhaug.com] 
Sent: Monday, January 02, 2017 12:35 PM
To: dev@lucenenet.apache.org
Subject: RE: CreateTempFile is not thread-safe

Hi Vincent,

Thanks for reporting this. 

Clearly, this is something that needs to be addressed, but your proposed fix will certainly
not work, since it effectively reverses 2 other fixes:

1. In Java, the files are created with no byte order mark, and some parts of Lucene.Net cannot
read back the files when they have one.
2. Despite the using block, this approach seems to keep the file open too long, causing a
"file already in use" error. Using WriteAllText is the only approach I could find to work
around that issue.

I think that a more appropriate solution would be to mimic what is being done in Java. Namely,
it doesn't use sequential file names, they are based on a random hash of characters.

Thanks,
Shad Storhaug (NightOwl888)


-----Original Message-----
From: Van Den Berghe, Vincent [mailto:Vincent.VanDenBerghe@bvdinfo.com] 
Sent: Monday, January 2, 2017 3:38 PM
To: dev@lucenenet.apache.org
Subject: CreateTempFile is not thread-safe

Hello,

Creating AnalyzingSuggester in parallel sometimes fails with an IOException "The file is already
used by another process".
This is due to a flaw in FileSupport.CreateTempFile, most notably this:

                    if (File.Exists(fileName))
                    {
                        attempt++;
                        continue;
                    }
                    // Create the file
                    File.WriteAllText(fileName, string.Empty, new UTF8Encoding(false) /* No
BOM */);


If 2 or more threads execute File.Exists very close one after another, and the file isn't
found, the first call to File.WriteAllText will succeed and the other ones will succeed as
well since an existing file is simply overwritten.
Dependent code will attempt to use the file as if it were unique, which will then trigger
the exception mentioned above. Yes, it happens!
Consider replacing the File.WriteAllText call (which just creates an empty file) with:

                           using (new FileStream(fileName, FileMode.CreateNew))
                           {
                                  // do nothing
                           }

... which will fail if the file doesn't exist, and the exception will trigger another attempt.
 This will work better as intended.


Happy new year,

Vincent Van Den Berghe


Mime
View raw message