lucenenet-dev mailing list archives

From Christopher Currens <currens.ch...@gmail.com>
Subject Re: [Lucene.Net] [jira] [Created] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>
Date Thu, 14 Jun 2012 19:30:45 GMT
I think it depends on how we decide to implement the enumerable.  Are we
going to implement it on the IndexReader or on a TermEnum (makes more sense
this way)?  In that case, I would imagine it would look something like this:

// Used to wrap the actual results of a TermEnum, which includes a Term and
// frequency
public struct TermFreq
{
    public Term Term { get; private set; }
    public int DocFreq { get; private set; }
}

// internal rather than private, since a top-level class can't be private
internal class TermEnumWrapper : IEnumerable<TermFreq>
{
     // the code
}

public static class TermEnumExtensions
{
    public static IEnumerable<TermFreq> AsEnumerable(this TermEnum instance)
    {
        return new TermEnumWrapper(instance);
    }
}

//....in other method
foreach(var term in indexReader.Terms().AsEnumerable())
{
    // ....yay.
}
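
To make the wrapper idea concrete, here is a minimal runnable sketch.  It
is only an illustration under some loud assumptions: the TermEnum below is
a simplified stand-in (terms are plain strings, and the enum starts out
positioned before the first term), and ListTermEnum and Demo are
hypothetical helpers that exist only so the example can run.  It also uses
an iterator block instead of a hand-written wrapper class, just to keep it
short.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for Lucene.Net's TermEnum (assumption: terms are plain
// strings here, and the enum starts positioned BEFORE the first term).
public abstract class TermEnum : IDisposable
{
    public abstract bool Next();
    public abstract string Term();
    public abstract int DocFreq();
    public abstract void Dispose();
}

// Hypothetical in-memory TermEnum, only here to make the sketch runnable.
public sealed class ListTermEnum : TermEnum
{
    private readonly string[] _terms;
    private readonly int[] _freqs;
    private int _index = -1;

    public ListTermEnum(string[] terms, int[] freqs) { _terms = terms; _freqs = freqs; }

    public override bool Next() { return ++_index < _terms.Length; }
    public override string Term() { return _terms[_index]; }
    public override int DocFreq() { return _freqs[_index]; }
    public override void Dispose() { }
}

public struct TermFreq
{
    public TermFreq(string term, int docFreq) : this()
    {
        Term = term;
        DocFreq = docFreq;
    }
    public string Term { get; private set; }
    public int DocFreq { get; private set; }
}

public static class TermEnumExtensions
{
    // Lazily walks the TermEnum. It consumes the underlying enum, so the
    // result can only be iterated once, and the caller still owns Dispose().
    public static IEnumerable<TermFreq> AsEnumerable(this TermEnum instance)
    {
        while (instance.Next())
            yield return new TermFreq(instance.Term(), instance.DocFreq());
    }
}

public static class Demo
{
    // Collects "term:docFreq" strings from a two-term stand-in enum.
    public static List<string> Run()
    {
        var seen = new List<string>();
        using (var termEnum = new ListTermEnum(new[] { "apple", "pear" }, new[] { 3, 1 }))
        {
            foreach (var tf in termEnum.AsEnumerable())
                seen.Add(tf.Term + ":" + tf.DocFreq);
        }
        return seen;
    }
}
```

Note that the caller still has to wrap the TermEnum in a using block, and
the enumerable can't be walked a second time, since the first pass has
already consumed the TermEnum.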

Now, for IndexReader.Terms(), I agree that we can just make it implement
IEnumerable as is.  The above pattern could be an easy way to implement
IEnumerable wrappers for other classes, like classes that implement
TermDocs, TermPositions or other iterator-like classes.  I would WANT to
change the current TermEnum interface if we did this and mark Next(),
Term() and DocFreq() as protected internal instead of public.

I'm not sure what we should do regarding dispose, though, as it, too,
largely depends on how we implement this.  TermEnum is more like an
enumerator than enumerable, so it makes sense that it would implement
Dispose.  However, if we were planning on just making TermEnum implement
IEnumerable, we wouldn't want to have TermEnum disposable anymore,
otherwise you couldn't iterate over the items more than once.  Some options:

1) Use a wrapper class and an extension method (AsEnumerable()).  In this
case, the user would need to dispose of the TermEnum themselves when
they're done with it (which is a little silly)
2) Make TermEnum no longer disposable and have it implement IEnumerable.
 Every class that implements TermEnum would instead have a nested private
class that would be the enumerator.  This has a more .NET feel, because I
would expect to be able to store Terms() in a variable and iterate over it
multiple times (if I wanted to).
3) Make IndexReader.Terms() return an IEnumerable<TermFreq> instead of a
TermEnum.  IMO, this is a better option than 1, because it shouldn't
require any changes to the other TermEnum classes, and still gives the
ability to iterate multiple times. We'd need to create two wrapper classes,
though.  Perhaps something like this (nested inside of IndexReader):

private class TermEnumWrapper : IEnumerable<TermFreq>
{
    Term _startingTerm;
    IndexReader _reader;

    public TermEnumWrapper(IndexReader reader)
         : this(reader, null)
    {  }

    public TermEnumWrapper(IndexReader reader, Term term)
    {
        _startingTerm = term;
        _reader = reader;
    }

    public IEnumerator<TermFreq> GetEnumerator()
    {
        // return a new TermEnumerator passing in
        // _reader.Terms() if _startingTerm is null
        // otherwise _reader.Terms(_startingTerm)
    }

    private class TermEnumerator : IEnumerator<TermFreq>
    {
        public TermEnumerator(TermEnum enumToWrap)
        {
            // store that thing
            // Probably put logic in MoveNext() to see if this TermEnum has
            // already been seeked to a Term, and to just return true in that
            // case.
        }
        // wrap the TermEnum behavior
        public void Dispose()
        {
            // dispose stored TermEnum
        }
    }
}
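
Filled in a bit further, option 3 could look roughly like the sketch
below.  Again, this is an illustration under assumptions, not the real
thing: TermEnum is a simplified stand-in with string terms, the
Func<TermEnum> takes the place of calling _reader.Terms() or
_reader.Terms(_startingTerm), ListTermEnum and Demo are hypothetical
helpers, and the "already seeked to a Term" case is left out.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Simplified stand-in for Lucene.Net's TermEnum (assumption: string terms,
// positioned before the first term when created).
public abstract class TermEnum : IDisposable
{
    public abstract bool Next();
    public abstract string Term();
    public abstract int DocFreq();
    public abstract void Dispose();
}

// Hypothetical in-memory TermEnum that records whether it was disposed.
public sealed class ListTermEnum : TermEnum
{
    private readonly string[] _terms;
    private int _index = -1;
    public bool Disposed { get; private set; }

    public ListTermEnum(string[] terms) { _terms = terms; }

    public override bool Next() { return ++_index < _terms.Length; }
    public override string Term() { return _terms[_index]; }
    public override int DocFreq() { return 1; } // placeholder frequency
    public override void Dispose() { Disposed = true; }
}

public struct TermFreq
{
    public TermFreq(string term, int docFreq) : this()
    {
        Term = term;
        DocFreq = docFreq;
    }
    public string Term { get; private set; }
    public int DocFreq { get; private set; }
}

public sealed class TermEnumerable : IEnumerable<TermFreq>
{
    // Stands in for _reader.Terms() / _reader.Terms(_startingTerm).
    private readonly Func<TermEnum> _open;

    public TermEnumerable(Func<TermEnum> open) { _open = open; }

    // Every call opens a fresh TermEnum, so the sequence can be re-iterated.
    public IEnumerator<TermFreq> GetEnumerator() { return new TermEnumerator(_open()); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    private sealed class TermEnumerator : IEnumerator<TermFreq>
    {
        private readonly TermEnum _inner;
        private TermFreq _current;

        public TermEnumerator(TermEnum inner) { _inner = inner; }

        public TermFreq Current { get { return _current; } }
        object IEnumerator.Current { get { return _current; } }

        public bool MoveNext()
        {
            if (!_inner.Next()) return false;
            _current = new TermFreq(_inner.Term(), _inner.DocFreq());
            return true;
        }

        public void Reset() { throw new NotSupportedException(); }

        // foreach calls this on the enumerator when the loop ends.
        public void Dispose() { _inner.Dispose(); }
    }
}

public static class Demo
{
    // Iterates the same enumerable twice and returns how many of the
    // underlying TermEnums ended up disposed.
    public static int Run()
    {
        var opened = new List<ListTermEnum>();
        var terms = new TermEnumerable(() =>
        {
            var e = new ListTermEnum(new[] { "apple", "banana" });
            opened.Add(e);
            return e;
        });
        foreach (var t in terms) { }
        foreach (var t in terms) { }
        return opened.FindAll(e => e.Disposed).Count;
    }
}
```

The point of the design is that the enumerable itself holds nothing
disposable; each foreach gets its own TermEnum through the enumerator, and
the enumerator's Dispose() cleans it up when the loop exits.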

Anyway, Terms() is pretty straightforward, because it practically is an
IEnumerator already.  Those other classes will be more tricky, because they
support methods like Seek which could mess up enumerations.


Thanks,
Christopher

On Wed, Jun 13, 2012 at 10:29 PM, Simon Svensson <sisve@devhost.se> wrote:

> I dislike the idea of using extension methods. It would work, but it would
> expose a second set of method calls which duplicate existing functionality.
> What would the actual method call look like? .TermEnum() for the old one,
> and TermEnumerable() for the new one?
>
> I'm not sure what's considered "changes in Lucene.Net kernel". I believe
> we could support further Java ports by declaring these enumerator-classes
> partial, and move the actual interface implementations to another file. We
> would then only need to redeclare the ported class as partial if it is ever
> regenerated from java code. (I'm not fully up to speed on how the actual
> porting is done, I'm just guessing.)
>
> I'm [obviously] interested in implementing IEnumerable/IEnumerator
> directly.
> 1) There's only one entry point to term enumerations:
> IndexReader.Terms(...)
> 2) We can implement it without changing current TermEnum interface (we
> will only add to it).
> 3) We can support Dispose (calling Close) by writing custom iterators.
> (i.e. avoid "yield return")
>
> // Simon
>
>
> On 2012-06-09 01:15, Christopher Currens wrote:
>
>> I've used extension methods as well.  I'm +1 as well, though a vote, I'm
>> sure, isn't necessary; we've all been wanting IEnumerable for a long time.
>>  I've used them on TermEnum, TermDocs, and TermPositions.  I think we just
>> need to create an issue in JIRA for it and get it done.  Should be fairly
>> easy to do, and I don't see any reason why we can't put the extension
>> methods in core, in the same namespace where they're used.  That way,
>> they're automatically seen by users.
>>
>>
>> Thanks,
>> Christopher
>>
>> On Fri, Jun 8, 2012 at 3:19 PM, Digy<digydigy@gmail.com>  wrote:
>>
>>  Hi Andy,
>>>
>>> I have used similar extension methods for a long time. What I like
>>> especially in extension methods is that they don't require changes in
>>> Lucene.Net kernel and make further ports of Lucene.java independent from
>>> .Net structures.
>>>
>>> +1 for a Lucene.Net extensions in contrib. Even a +1 for a
>>> "Lucene.Net.Extensions" for the core.
>>>
>>> DIGY
>>>
>>> -----Original Message-----
>>> From: Andy Pook [mailto:andy.pook@gmail.com]
>>> Sent: Friday, June 08, 2012 10:26 PM
>>> To: lucene-net-dev@lucene.apache.org
>>> Subject: Re: [Lucene.Net] [jira] [Created] (LUCENENET-469) Convert Java
>>> Iterator classes to implement IEnumerable<T>
>>>
>>> If we don't want to add IEnumerable (though it seems that IEnumerable could
>>> be added in parallel with the existing pattern) could we add a bunch of
>>> extension methods?
>>> Something like the following...
>>>
>>> {noformat}
>>> public static class LuceneExtensions
>>> {
>>>     public static IEnumerable<Term> GetEnumerable(this TermEnum termEnum)
>>>     {
>>>         yield return termEnum.Term();
>>>         while (termEnum.Next())
>>>             yield return termEnum.Term();
>>>     }
>>> }
>>> {noformat}
>>>
>>> Then you can...
>>> {noformat}
>>> foreach(var e in myTermEnum.GetEnumerable()) {
>>>    // do stuff with e
>>> }
>>> {noformat}
>>>
>>> Not as elegant as a direct implementation but gives easy enough access to
>>> foreach semantics.
>>>
>>>
>>> The second option is to realize that you don't need to explicitly
>>> implement IEnumerable. You just need a GetEnumerator method.
>>> So just add...
>>>
>>> {noformat}
>>> public IEnumerator<Term> GetEnumerator()
>>> {
>>>     yield return Term();
>>>     while (Next())
>>>         yield return Term();
>>> }
>>> {noformat}
>>>
>>> Now you get nice foreach semantics without even mentioning IEnumerable.
>>> Compiler magic is your friend :-)
>>>
>>> BTW: Dispose() is only called automatically when exiting a using block.
>>> Exiting a foreach will not.
>>>
>>> Cheers,
>>>  Andy
>>>
>>> On 24 January 2012 06:37, Christopher Currens (Created) (JIRA)<
>>> jira@apache.org>  wrote:
>>>
>>>  Convert Java Iterator classes to implement IEnumerable<T>
>>>> ---------------------------------------------------------
>>>>
>>>>                 Key: LUCENENET-469
>>>>                 URL: https://issues.apache.org/jira/browse/LUCENENET-469
>>>>             Project: Lucene.Net
>>>>          Issue Type: Sub-task
>>>>          Components: Lucene.Net Contrib, Lucene.Net Core
>>>>    Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3, Lucene.Net 2.9.4g
>>>>         Environment: all
>>>>            Reporter: Christopher Currens
>>>>             Fix For: Lucene.Net 3.0.3
>>>>
>>>>
>>>> The Iterator pattern in Java is equivalent to IEnumerable in .NET.
>>>>  Classes that were directly ported from Java using the Iterator pattern
>>>> cannot be used with Linq or foreach blocks in .NET.
>>>>
>>>> {{Next()}} would be equivalent to .NET's {{MoveNext()}}, and in the
>>>> case below, {{Term()}} would be equivalent to .NET's {{Current}}
>>>> property.  In cases like the one below, it will require {{TermEnum}} to
>>>> become an abstract class with {{Term}} and {{DocFreq}} properties, which
>>>> would be returned from another class or method that implemented
>>>> {{IEnumerable<TermEnum>}}.
>>>>
>>>> {noformat}
>>>>        public abstract class TermEnum : IDisposable
>>>>        {
>>>>                public abstract bool Next();
>>>>                public abstract Term Term();
>>>>                public abstract int DocFreq();
>>>>                public abstract void  Close();
>>>>                public abstract void Dispose();
>>>>        }
>>>> {noformat}
>>>>
>>>> would instead look something like:
>>>>
>>>> {noformat}
>>>>        public abstract class TermFreq
>>>>        {
>>>>                public abstract Term Term { get; }
>>>>                public abstract int DocFreq { get; }
>>>>        }
>>>>
>>>>        public abstract class TermEnum : IEnumerable<TermFreq>, IDisposable
>>>>        {
>>>>                // ...
>>>>        }
>>>> {noformat}
>>>>
>>>> Keep in mind that it is important that if the class being converted
>>>> implements {{IDisposable}}, the class that is enumerating the terms
>>>> (in this case {{TermEnum}}) should inherit from both
>>>> {{IEnumerable<T>}} *and* {{IDisposable}}.  This won't be any change to
>>>> the user, as the compiler automatically calls {{Dispose()}} when used
>>>> in a {{foreach}} loop.
>>>>
>>>> --
>>>> This message is automatically generated by JIRA.
>>>> If you think it was sent incorrectly, please contact your JIRA
>>>> administrators:
>>>> https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
>>>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>>>
>>>>
>>>>