lucenenet-dev mailing list archives

From "Shad Storhaug (Jira)" <>
Subject [jira] [Commented] (LUCENENET-640) Sequential IndexWriter performance in concurrent environments.
Date Tue, 14 Jan 2020 11:23:00 GMT


Shad Storhaug commented on LUCENENET-640:

Hi Mathias,

Thanks for the report and the PR.

Correct me if I am wrong, but wouldn't a better fix for this be to replace [{{WeakIdentityMap}}|]
with the thread-safe [{{ConditionalWeakTable}}|]?
We may still need to utilize {{IdentityWeakReference}}, but it would need to be a class since
{{ConditionalWeakTable}} has a class constraint on {{TKey}}.
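
To illustrate the idea, here is a minimal sketch of a weak, identity-keyed cache built on {{ConditionalWeakTable}}. It is not the Lucene.NET implementation: {{WeakIdentityCacheSketch}}, {{Box}} and {{GetOrCompute}} are hypothetical names used only for this example. It relies on the documented behavior that {{ConditionalWeakTable}} compares keys by reference and is safe for concurrent use; whether that makes {{IdentityWeakReference}} unnecessary in practice would still need to be verified.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical sketch, not Lucene.NET code.
public static class WeakIdentityCacheSketch
{
    // ConditionalWeakTable constrains both TKey and TValue to reference types,
    // so the bool result is boxed in a small class.
    private sealed class Box
    {
        public readonly bool Value;
        public Box(bool value) { Value = value; }
    }

    // Keys are held weakly and compared by reference, not by Equals/GetHashCode.
    private static readonly ConditionalWeakTable<object, Box> Table =
        new ConditionalWeakTable<object, Box>();

    // Returns the cached result for this exact instance, computing it on first use.
    // Under a race the callback may run more than once, but only one result is stored.
    public static bool GetOrCompute(object key, Func<object, bool> compute)
    {
        return Table.GetValue(key, k => new Box(compute(k))).Value;
    }
}
```

Two distinct instances that are equal according to {{Equals}} get separate entries here, which is the identity semantics that {{WeakIdentityMap}} is meant to provide.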

Do note that Microsoft didn't expose the enumerator or the [AddOrUpdate|]
method of {{ConditionalWeakTable}} until .NET Standard 2.1. However, Lucene requires one or
the other in every (other) place where {{ConditionalWeakTable}} would be useful (specifically,
as a replacement for [{{Lucene.Net.Support.WeakDictionary}}|]).
An effort to port {{ConditionalWeakTable}} from .NET Standard 2.1 back to .NET Standard 2.0
(LUCENENET-636) is currently underway, but stalled on [this J2N branch|].
Unfortunately, the port depends on unmanaged resources that somehow need to be re-mapped or embedded
in order to make it functional. Perhaps there is also a way to cut through at a lower level
and make a {{ConditionalWeakIdentityTable}} that could be used as a direct replacement for
{{WeakIdentityMap}} instead of using {{IdentityWeakReference}}.
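
As a rough illustration of the {{AddOrUpdate}} gap, the sketch below falls back to {{Remove}} followed by {{Add}} on targets where the method is missing. It is an assumption-laden stopgap, not the planned port: {{ConditionalWeakTableCompat}}, {{AddOrUpdateCompat}} and the {{FEATURE_CWT_ADDORUPDATE}} compilation symbol are hypothetical names, and the fallback is only atomic with respect to other callers of this helper (a concurrent {{TryGetValue}} can briefly miss the key). It also does nothing for the missing enumerator.

```csharp
using System.Runtime.CompilerServices;

// Hypothetical sketch, not part of Lucene.NET or J2N.
public static class ConditionalWeakTableCompat
{
    // Serializes the Remove + Add fallback across callers of this helper only.
    private static readonly object SyncRoot = new object();

    public static void AddOrUpdateCompat<TKey, TValue>(
        this ConditionalWeakTable<TKey, TValue> table, TKey key, TValue value)
        where TKey : class
        where TValue : class
    {
#if FEATURE_CWT_ADDORUPDATE
        // Hypothetical symbol: define it only for targets that expose AddOrUpdate
        // (for example, .NET Standard 2.1).
        table.AddOrUpdate(key, value);
#else
        lock (SyncRoot)
        {
            table.Remove(key);      // returns false if the key is absent; that is fine
            table.Add(key, value);  // cannot throw for a duplicate key while the lock is held
        }
#endif
    }
}
```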

If you could take a look at using {{ConditionalWeakTable}} to solve this issue, it would be
much appreciated. Since there appears to be one place where the enumerator is required [here|],
the best approach would be to first check for compatibility on .NET Standard 2.1 and if that
works, help us to complete the port of {{ConditionalWeakTable}} for .NET Framework 4.5 and
.NET Standard 2.0 by submitting a PR to [the J2N project|]
so the same fix can also be applied to those platforms.

> Sequential IndexWriter performance in concurrent environments.
> --------------------------------------------------------------
>                 Key: LUCENENET-640
>                 URL:
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Mathias Henriksen
>            Priority: Major
>              Labels: performance
>             Fix For: Lucene.Net 4.8.0
>         Attachments: AssertFinalBug.jpg, IdentityWeakReferenceBug.jpg, Program.cs, overviewBug.jpg
>          Time Spent: 20m
>  Remaining Estimate: 0h
> When creating Lucene.Net indices in parallel, sequential-like performance is experienced.
> Profiling 8 concurrent IndexWriter instances writing in parallel shows that, in my preliminary
> tests, WeakIdentityMap::IdentityWeakReference::Equals (94.91%) and TokenStream::AssertFinal
> (87.09%) spend most of their time garbage collecting (see screenshots).
> The [WeakIdentityMap|] implementation uses an IdentityWeakReference as key, which is
> implemented as a class. By inspection, this class is merely a wrapper around
> System.Runtime.InteropServices.GCHandle, as can be seen in the mono project. Manually
> wrapping this struct in a struct rather than a class will eliminate some of the immense
> amount of garbage collection.

This message was sent by Atlassian Jira