lucenenet-dev mailing list archives

From "Shad Storhaug (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>
Date Wed, 09 Aug 2017 05:44:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119439#comment-16119439 ]

Shad Storhaug commented on LUCENENET-469:
-----------------------------------------

The types I am aware of are {{TermsEnum}} and {{DocIdSetIterator}} and their subclasses
(which include the {{DocsEnum}} and {{DocsAndPositionsEnum}} classes). There are probably a few
others. Anything that is not public facing isn't worth the effort of refactoring, so there
isn't much beyond these that hasn't already been converted.

BTW - there is no longer a {{TermEnum}} class in Lucene 4+ - it has been replaced with {{TermsEnum}}.

The issue isn't so much with implementing {{IEnumerator<T>}} as with what happens once you
use a foreach loop: it calls {{GetEnumerator()}} for you, so reaching any functionality beyond
{{IEnumerator<T>}} would require a cast. And inside of a foreach loop you
don't have access to the {{IEnumerator<T>}} instance, so there is no way to cast to
get the rest of the functionality.
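
To make the problem concrete, here is a minimal sketch. {{DemoTerms}}, {{DemoTermEnumerator}} and its {{DocFreq}} member are made-up stand-ins for illustration, not the real Lucene.Net types:

{code:borderStyle=solid}
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator with one extra member beyond IEnumerator<T>.
public sealed class DemoTermEnumerator : IEnumerator<string>
{
    private readonly string[] terms;
    private readonly int[] docFreqs;
    private int index = -1;

    public DemoTermEnumerator(string[] terms, int[] docFreqs)
    {
        this.terms = terms;
        this.docFreqs = docFreqs;
    }

    public string Current => terms[index];
    object IEnumerator.Current => Current;
    public bool MoveNext() => ++index < terms.Length;
    public void Reset() => index = -1;
    public void Dispose() { }

    // Extra member that is not part of the IEnumerator<T> contract.
    public int DocFreq => docFreqs[index];
}

// Hypothetical enumerable that hands out the enumerator above.
public sealed class DemoTerms : IEnumerable<string>
{
    private readonly string[] terms;
    private readonly int[] docFreqs;

    public DemoTerms(string[] terms, int[] docFreqs)
    {
        this.terms = terms;
        this.docFreqs = docFreqs;
    }

    public IEnumerator<string> GetEnumerator() => new DemoTermEnumerator(terms, docFreqs);
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class ForeachDemo
{
    public static int CountWithForeach(DemoTerms terms)
    {
        int count = 0;
        foreach (string text in terms)
        {
            // Only 'text' is in scope here; the enumerator is hidden,
            // so there is nothing to cast in order to read DocFreq.
            count++;
        }
        return count;
    }

    public static int SumDocFreqManually(DemoTerms terms)
    {
        int total = 0;
        // In this sketch, reaching DocFreq means driving the enumerator by hand, with a cast.
        using (var e = (DemoTermEnumerator)terms.GetEnumerator())
        {
            while (e.MoveNext())
            {
                total += e.DocFreq;
            }
        }
        return total;
    }
}
{code}

The second method works, but it gives up the foreach syntax, which is the convenience the conversion was after in the first place.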

I looked at {{TermsEnum}} in some depth and I think the best solution there would be to divide
it into "basic" and "advanced" APIs. Simply rename {{Terms.GetIterator(BytesRef)}} to {{Terms.GetEnumerator(BytesRef)}}
and add an overload that accepts no parameter and satisfies the {{IEnumerable<T>}} contract
that internally calls {{Terms.GetEnumerator(null)}}. The overload that accepts a parameter
can then return a {{TermsEnum}} type, which will expose the additional members that are hidden
in the .NETified overload.
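
A rough sketch of that split, under some assumptions: {{SketchTerms}}, {{SketchTermsEnum}} and {{SketchBytesRef}} are simplified stand-ins I made up for illustration, the {{Ord}} member only represents "something advanced", and the reuse behavior of the parameter is my guess at how it could work rather than what the real classes do:

{code:borderStyle=solid}
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for BytesRef; holds the current term's text.
public sealed class SketchBytesRef
{
    public string Text { get; set; }
}

// Hypothetical stand-in for TermsEnum: an IEnumerator<T> plus "advanced" members.
public sealed class SketchTermsEnum : IEnumerator<SketchBytesRef>
{
    private readonly string[] terms;
    private readonly SketchBytesRef reuse;
    private int index = -1;
    private SketchBytesRef current;

    public SketchTermsEnum(string[] terms, SketchBytesRef reuse)
    {
        this.terms = terms;
        this.reuse = reuse;
    }

    public SketchBytesRef Current => current;
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (index + 1 >= terms.Length)
        {
            current = null;
            return false;
        }
        index++;
        // Assumption: fill the caller's instance when one was supplied, else allocate.
        current = reuse ?? new SketchBytesRef();
        current.Text = terms[index];
        return true;
    }

    public void Reset()
    {
        index = -1;
        current = null;
    }

    public void Dispose() { }

    // "Advanced" member that the plain IEnumerator<T> view does not expose.
    public long Ord => index;
}

// Hypothetical stand-in for Terms with the "basic" and "advanced" overloads.
public sealed class SketchTerms : IEnumerable<SketchBytesRef>
{
    private readonly string[] terms;

    public SketchTerms(params string[] terms)
    {
        this.terms = terms;
    }

    // Advanced API: returns the concrete type, so the extra members stay reachable.
    public SketchTermsEnum GetEnumerator(SketchBytesRef reuse) => new SketchTermsEnum(terms, reuse);

    // Basic API: satisfies IEnumerable<T> by delegating to the advanced overload.
    public IEnumerator<SketchBytesRef> GetEnumerator() => GetEnumerator(null);

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
{code}

With this shape, {{foreach}} picks up the parameterless overload, while a caller who wants the extra members calls the overload that takes a parameter and drives {{MoveNext()}} directly.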

Since a common use case is to loop forward, this would simplify some code:

{code:borderStyle=solid}
Terms vector = <something>
TermsEnum termsEnum = vector.GetIterator(null);
BytesRef text;
while ((text = termsEnum.Next()) != null)
{
	// use text
}
{code}

becomes

{code:borderStyle=solid}
Terms vector = <something>
foreach (BytesRef text in vector)
{
	// use text
}
{code}

But you could also call {{GetEnumerator(BytesRef)}} in order to reuse a {{BytesRef}} or to
get to all of {{TermsEnum}}'s goodies that aren't part of the {{IEnumerator<T>}} contract.

I don't think this same approach will work for {{DocsEnum}} or {{DocsAndPositionsEnum}} -
it would be nice to find a solution that can be applied consistently. I am not so sure those
types fit the mold of {{IEnumerator<T>}} anyway, but perhaps there is some way to refactor
them to fit.

Then again, there must be some reason why the Lucene designers strayed away from using {{Iterable<T>}}
in this part of the design - they may have intentionally done it this way for some odd reason.

> Convert Java Iterator classes to implement IEnumerable<T>
> ---------------------------------------------------------
>
>                 Key: LUCENENET-469
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-469
>             Project: Lucene.Net
>          Issue Type: Sub-task
>          Components: Lucene.Net Contrib, Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3, Lucene.Net 4.8.0
>         Environment: all
>            Reporter: Christopher Currens
>             Fix For: Lucene.Net 4.8.0
>
>
> The Iterator pattern in Java is equivalent to IEnumerable in .NET.  Classes that were
> directly ported from Java using the Iterator pattern cannot be used with LINQ or foreach blocks
> in .NET.
> {{Next()}} would be equivalent to .NET's {{MoveNext()}}, and in the case below, {{Term()}}
> would correspond to .NET's {{Current}} property.  In cases like the one below, it will require {{TermEnum}}
> to become an abstract class with {{Term}} and {{DocFreq}} properties, which would be returned
> from another class or method that implements {{IEnumerable<TermEnum>}}.
> {noformat} 
> 	public abstract class TermEnum : IDisposable
> 	{
> 		public abstract bool Next();
> 		public abstract Term Term();
> 		public abstract int DocFreq();
> 		public abstract void  Close();
> 	        public abstract void Dispose();
> 	}
> {noformat} 
> would instead look something like:
> {noformat} 
> 	public abstract class TermFreq
> 	{
> 		public abstract Term Term { get; }
> 		public abstract int DocFreq { get; }
> 	}
>         public abstract class TermEnum : IEnumerable<TermFreq>, IDisposable
>         {
>                 // ...
>         }
> {noformat}
> Keep in mind that if the class being converted implements {{IDisposable}},
> the class that is enumerating the terms (in this case {{TermEnum}}) should implement both
> {{IEnumerable<T>}} *and* {{IDisposable}}.  This won't change anything for the user, as
> the compiler automatically calls {{Dispose()}} on the enumerator when it is used in a {{foreach}} loop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
