lucenenet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andy Pook (JIRA)" <>
Subject [jira] [Commented] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>
Date Wed, 09 Aug 2017 09:36:00 GMT


Andy Pook commented on LUCENENET-469:

So I have a little time to get into this then :)

Probably nothing new here to you. Mainly just me catching up...

To your first point about GetEnumerator being somewhat hidden/opaque, isn't that just something
that is the usual guidance in c#. ie don't mess with the collection from inside a for cos
you'll break the enumerator.

In the fairly simple case in the code above you could...
(NB just using TermEnum as it's a simple example)

TermEnum vector = <something>
foreach (var item in vector)
	var x = vector.Term();
	var y = item.Term;  // same thing as 'x'
	// but you should probably steer clear of using 'Next()'

There's no need to extend/cast or get access to the enumerator. The enumerators job is to
simply move a cursor forwards and support a language construct.
Of course this would make it still look weird to a c#/vb dev.
{{Current}} is supposed to return the current value of the cursor. Maybe it could return Current
as the collection ('vector' above). So it would look like...

TermEnum vector = <something>
foreach (TermEnum item in vector)
	var x = item.Term();

Again, weird. Doing the method -> property change would help. But, still feels odd to have
the item being the same thing as vector.

Here lies the basic problem. The Lucene "iterable" classes represent all of: 
* the collection
** stepping the cursor
** collection scope behaviors for the type
* the current item within the collection
** "properties"
** item scope behaviors
Not only that but there are also different implementations of the pattern with different semantics.
ie {{Next()}} returns success; {{NextDoc()}} returns an id or magic number; {{Advance(target)}}
spin forward to target

Also, the need to allow the Current item to be; created on each iteration; reused; fetch from
object pool; ...
I'm fairly sure that the EnumEnumerator (or versions of) can handle all that. I'd need to
get further into the use cases.

Lastly, there's the observation that the {{while}} vs {{foreach}} examples you gave are not
equivalent as the while version does not call {{Dispose()}}. 
I suspect that in current usage Dispose is never called. Or if it is it's because it's associated
with files rather than because we're leaving a loop. Does that match your experience?
Thinking about it, maybe the enumerator Dispose should be a noop? It's odd for a type to be
disposing it's "parent" (though I know some StreamWriter/Reader classes do, it's still odd/annoying).
Again there's a scoping concept here that's muddled by the type being many things.

All very Tricky...

I think there's some combination of
* isolating the "item" type
* using the collection object behaviors from inside the loop
* implementing the enumerator such that it is ok about the cursor being changed from outside

It's going to be very hard to get the semantics of each type in a sane shape such that it's
hard for a dev to get tripped up.
Which is why this have been open for 5+ years :)

I'll see if I can put together a proof of concept PR

> Convert Java Iterator classes to implement IEnumerable<T>
> ---------------------------------------------------------
>                 Key: LUCENENET-469
>                 URL:
>             Project: Lucene.Net
>          Issue Type: Sub-task
>          Components: Lucene.Net Contrib, Lucene.Net Core
>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g, Lucene.Net 3.0.3, Lucene.Net
>         Environment: all
>            Reporter: Christopher Currens
>             Fix For: Lucene.Net 4.8.0
> The Iterator pattern in Java is equivalent to IEnumerable in .NET.  Classes that were
directly ported in Java using the Iterator pattern, cannot be used with Linq or foreach blocks
in .NET.
> {{Next()}} would be equivalent to .NET's {{MoveNext()}}, and in the below case, {{Term()}}
would be as .NET's {{Current}} property.  In cases as below, it will require {{TermEnum}}
to become an abstract class with {{Term}} and {{DocFreq}} properties, which would be returned
from another class or method that implemented {{IEnumerable<TermEnum>}}.
> {noformat} 
> 	public abstract class TermEnum : IDisposable
> 	{
> 		public abstract bool Next();
> 		public abstract Term Term();
> 		public abstract int DocFreq();
> 		public abstract void  Close();
> 	        public abstract void Dispose();
> 	}
> {noformat} 
> would instead look something like:
> {noformat} 
> 	public class TermFreq
> 	{
> 		public abstract Term { get; }
> 		public abstract int { get; }
> 	}
>         public abstract class TermEnum : IEnumerable<TermFreq>, IDisposable
>         {
>                 // ...
>         }
> {noformat}
> Keep in mind that it is important that if the class being converted implements {{IDisposable}},
the class that is enumerating the terms (in this case {{TermEnum}}) should inherit from both
{{IEnumerable<T>}} *and* {{IDisposable}}.  This won't be any change to the user, as
the compiler automatically calls {{IDisposable}} when used in a {{foreach}} loop.

This message was sent by Atlassian JIRA

View raw message