lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Digy" <digyd...@gmail.com>
Subject RE: [jira] Updated: (LUCENENET-190) 2.4.0 Performance in TermInfosReader term caching (New implementation of SimpleLRUCache)
Date Tue, 18 Aug 2009 18:10:24 GMT
Thanks for the code. I made some (primitive) performance tests. I seems to be ~20-25% slower.

DIGY

-----Original Message-----
From: xzxz@mail.ru [mailto:xzxz@mail.ru] 
Sent: Tuesday, August 18, 2009 8:19 PM
To: lucene-net-dev@incubator.apache.org
Subject: Re: [jira] Updated: (LUCENENET-190) 2.4.0 Performance in TermInfosReader term caching
(New implementation of SimpleLRUCache)

Digy :
> I couldn't find a method like Keys.First(). 
>   
I found what is the problem. Method First()  exists only for Framework 
3.5. For Framework 2.0 some work around requied :

                    SortedDictionary<long, object>.Enumerator 
EnumTimeStamps = TimeStamps.GetEnumerator(); // This method is an O(1) 
operation
                    EnumTimeStamps.MoveNext(); // Shoud be an O(1) 
operation.
                    long key = 
EnumTimeStamps.Current.Key;                  

Ugly, but it works.

> Why don't you send your own working copy of LRUCache. If it is faster, we can use it.
>   
I modified SimpleLRUCache_LUCENENET_190 to use a SortedDictionary.

Only few lines of code have to be modified:
1) Declaration:
-  SortedList<long, object> TimeStamps = new SortedList<long, object>();
+ System.Collections.Generic.SortedDictionary<long, object> TimeStamps = 
new SortedDictionary<long, object>();
2) Put method:
                            
                if (Data.Count > Capacity)
                {
                    SortedDictionary<long, object>.Enumerator 
EnumTimeStamps = TimeStamps.GetEnumerator();
                    EnumTimeStamps.MoveNext();
                    long key = 
EnumTimeStamps.Current.Key;                  
                    Data.Remove(TimeStamps[key]);
                    TimeStamps.Remove(key);
                }


It passed Nunit test. I didn't make any performance test yet.

----------
Andrei
> DIGY
>
>
> -----Original Message-----
> From: xzxz@mail.ru [mailto:xzxz@mail.ru] 
> Sent: Tuesday, August 18, 2009 5:58 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: [jira] Updated: (LUCENENET-190) 2.4.0 Performance in TermInfosReader term
caching (New implementation of SimpleLRUCache)
>
> Digy пишет:
>   
>> This time you will get an error with "TimeStamps.Keys[0]". To get the key at index
0, you have to copy the "keys" to a temporary array.
>>
>>   
>>     
> Ok  i missed that Keys returns KeyCollection instead of  an array. But  
> you can use Keys.First() instead of Keys[0]. It should be fast.
>
>
>   
>> DIGY.
>>
>>
>> -----Original Message-----
>> From: xzxz@mail.ru [mailto:xzxz@mail.ru] 
>> Sent: Tuesday, August 18, 2009 5:24 PM
>> To: lucene-net-dev@incubator.apache.org
>> Subject: Re: [jira] Updated: (LUCENENET-190) 2.4.0 Performance in TermInfosReader
term caching (New implementation of SimpleLRUCache)
>>
>> Digy wrote:
>>   
>>     
>>> SortedDictionary does not have a getter with index (like Data[Index]). In a SortedList,
Data[0] is always the LRU item. 
>>>
>>>   
>>>     
>>>       
>> I didn't find any use of index access for SortedList (TimeStamps) except 
>> of : 
>>                 if (Data.Count > Capacity)
>>                 {
>>                     long key = TimeStamps.Keys[0];
>>                     Data.Remove(TimeStamps[key]);
>>                     TimeStamps.RemoveAt(0); <-------------
>>                 }
>>
>>  In case of  SortedDictionary it can be replaced by:
>>                 if (Data.Count > Capacity)
>>                 {
>>                     long key = TimeStamps.Keys[0];
>>                     Data.Remove(TimeStamps[key]);
>>                     TimeStamps.Remove(key);
>>                 }
>>
>> Am I missed something?
>>
>> ---
>> Andrei
>>   
>>     
>>> I got much more better results with RedBlackCS.RedBlack ( http://www.codeproject.com/KB/recipes/redblackcs.aspx
) but it is huge and there may be licensing problems.
>>>
>>>
>>> DIGY.
>>>
>>>
>>> -----Original Message-----
>>> From: xzxz@mail.ru [mailto:xzxz@mail.ru] 
>>> Sent: Tuesday, August 18, 2009 4:51 PM
>>> To: lucene-net-dev@incubator.apache.org
>>> Subject: Re: [jira] Updated: (LUCENENET-190) 2.4.0 Performance in TermInfosReader
term caching (New implementation of SimpleLRUCache)
>>>
>>> SimpleLRUCache_LUCENENET_190 uses SortedList<long, object> collection.
>>>
>>> Performance of SortedList (see http://msdn.microsoft.com/en-us/library/ms132339.aspx):

>>> 1) Add method is an O(n) operation for unsorted data. It is an O(log n) operation
if the new element is added at the end of the list. 
>>> If insertion causes a resize, the operation is O(n).
>>> 2) Remove method method is an O(n) operation
>>> 3) RemoveAt method is an O(n) operation 
>>> 4) Keys property is an O(1) operation
>>>
>>> Why not to use SortedDictionary<>? It has better performance for Remove
and Add:
>>>
>>> 1) Add method is an O(log n) operation
>>>
>>> 2) Remove method is an O(log n) operation
>>>
>>> 3) Keys property is an O(1) operation
>>>
>>>
>>>   
>>>     
>>>       
>> <http://www.garweb.ru>
>>
>>
>>   
>>     
>
>
>
>   



Mime
View raw message