lucenenet-dev mailing list archives

From: Shad Storhaug
Subject: RE: Lucene.NET 4.8 demo
Date: Wed, 09 Nov 2016 09:27:55 GMT

Thanks for putting this together.

The demo made me notice something about the design of Analyzer that I hadn't realized before.
The abstract Analyzer class was designed with Java's anonymous class functionality in mind.
This makes creating custom Analyzers more concise in Java than it is in .NET.

In .NET we don't have anonymous classes in the Java sense (C# anonymous types cannot derive
from a base class or override its methods). But we DO have anonymous methods that we could use
to simulate this behavior, provided there is a helper class to assist with it. To demonstrate
what I mean, I have updated the demo with a (very simple) AnonymousAnalyzer, which completely
eliminates the need for the 3 analyzer classes that you made.
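
To illustrate the idea of a delegate-backed analyzer, here is a minimal, self-contained sketch. It is not the AnonymousAnalyzer from the updated demo and does not reference Lucene.NET. The Analyzer class below is a simplified stand-in, a token stream is modelled as a plain IEnumerable<string>, and the DelegateAnalyzer name and constructor signature are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Simplified stand-in for an abstract Analyzer (NOT the real Lucene.NET class).
public abstract class Analyzer
{
    // Subclasses decide how the text behind the reader becomes tokens.
    protected abstract IEnumerable<string> CreateComponents(string fieldName, TextReader reader);

    public IList<string> Analyze(string fieldName, string text)
    {
        using (var reader = new StringReader(text))
        {
            // Materialize the tokens before the reader is disposed.
            return CreateComponents(fieldName, reader).ToList();
        }
    }
}

// Hypothetical helper: forwards the abstract method to a delegate, so a
// custom analyzer can be declared inline without writing a new class.
public sealed class DelegateAnalyzer : Analyzer
{
    private readonly Func<string, TextReader, IEnumerable<string>> createComponents;

    public DelegateAnalyzer(Func<string, TextReader, IEnumerable<string>> createComponents)
    {
        if (createComponents == null) throw new ArgumentNullException(nameof(createComponents));
        this.createComponents = createComponents;
    }

    protected override IEnumerable<string> CreateComponents(string fieldName, TextReader reader)
    {
        return createComponents(fieldName, reader);
    }
}

public static class DelegateAnalyzerDemo
{
    public static void Main()
    {
        // The analyzer is defined inline with a lambda: split on whitespace, then lowercase.
        Analyzer analyzer = new DelegateAnalyzer((fieldName, reader) =>
            reader.ReadToEnd()
                  .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
                  .Select(token => token.ToLowerInvariant()));

        Console.WriteLine(string.Join("|", analyzer.Analyze("body", "Hello Lucene.NET World")));
    }
}
```

The lambda plays the role that the overridden method body plays in a Java anonymous class, which is why a single helper class could replace several one-off analyzer subclasses.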

I am not suggesting we should update the demo like this, but I am suggesting that we should
add something like AnonymousAnalyzer (perhaps renamed to CustomAnalyzer, InlineAnalyzer, DelegateAnalyzer,
or something else more appropriate) in the box so that .NET developers can use their own
language features with Lucene the same way that Java developers do. In fact,
I think there are many things we can add (such as utility classes, utility methods, extension
methods, and builders) that would make developing with Lucene almost as seamless in .NET as
it is in Java. We just need to put our thinking caps on.

For example, maybe there could be a fluent TokenStreamComponentsBuilder that could be used
to put the components together in a fluent way...?
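
One possible shape for such a fluent builder is sketched below. Everything in it is hypothetical: the WithTokenizer, AddFilter, and Build method names are invented for illustration, and tokenizers and filters are modelled as plain delegates over IEnumerable<string> rather than real Lucene.NET Tokenizer and TokenFilter types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical fluent builder (not part of Lucene.NET).
public sealed class TokenStreamComponentsBuilder
{
    private Func<TextReader, IEnumerable<string>> tokenizer;
    private readonly List<Func<IEnumerable<string>, IEnumerable<string>>> filters =
        new List<Func<IEnumerable<string>, IEnumerable<string>>>();

    // Sets the source that turns raw text into tokens.
    public TokenStreamComponentsBuilder WithTokenizer(Func<TextReader, IEnumerable<string>> tokenizer)
    {
        this.tokenizer = tokenizer;
        return this;
    }

    // Appends a filter; filters run in the order they were added.
    public TokenStreamComponentsBuilder AddFilter(Func<IEnumerable<string>, IEnumerable<string>> filter)
    {
        filters.Add(filter);
        return this;
    }

    // Composes the tokenizer and all filters into a single function.
    public Func<TextReader, IEnumerable<string>> Build()
    {
        if (tokenizer == null) throw new InvalidOperationException("A tokenizer is required.");
        var source = tokenizer;
        var chain = filters.ToArray(); // snapshot, so later AddFilter calls do not affect this result
        return reader => chain.Aggregate(source(reader), (tokens, filter) => filter(tokens));
    }
}

public static class BuilderDemo
{
    public static void Main()
    {
        var analyze = new TokenStreamComponentsBuilder()
            .WithTokenizer(r => r.ReadToEnd().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .AddFilter(tokens => tokens.Select(t => t.ToLowerInvariant()))
            .AddFilter(tokens => tokens.Where(t => t != "the"))
            .Build();

        using (var reader = new StringReader("The quick Brown fox"))
        {
            Console.WriteLine(string.Join("|", analyze(reader)));
        }
    }
}
```

A builder like this would pair naturally with a delegate-based analyzer, since the result of Build could be passed directly as the delegate.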

Another thing I noticed is that we should probably move the TokenStreamComponents class so
that it is no longer nested inside Analyzer, which would bring the syntax closer to Lucene's.

A few thoughts on the demo:

1. Not everyone is familiar with GitHub organizations. Perhaps the demo should provide a
list to choose from? Currently, if you type a name that doesn't exist you get an exception.
I had to do a Google search to come up with something, since my own username didn't work.
One of the top results (before an actual list of organizations) was an API that can be utilized
to read all of the GitHub organizations:
2. Maybe there should be some kind of estimate given on how long it will take to index the
organization. When I ultimately chose "apache" it took several minutes to index the results,
which I was not expecting.
3. Perhaps the API key should be put into a separate (config) file rather than inline in the
code. And you could pre-define the name of this file and put it into a .gitignore file. This
would help prevent anyone from accidentally committing their API key to the Git repo.
4. The search results seemed a bit underwhelming. Maybe there should be some kind of indicator
of how many results Lucene.Net had to sift through to come up with the short list. Or at least
there should be some explanation of what is happening, to put things into perspective.
Think of a crime scene investigation. If the investigators enter the search criteria and it
comes up with 50,000 suspects it would ruin their day. If it comes up with 3, then their work
is much easier. But without some kind of indicator showing that 3 is better than 50,000, the
latter seems much more impressive in a demo.
5. Perhaps there should be some way to reset the index? I entered another organization to
test my updates to the code and it added that organization's results to the original index,
which I wasn't expecting.
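
For point 3 above, a minimal sketch of reading the key from a separate file could look like the following. The file name github-api-key.txt and the ApiKeyFile class are assumptions for illustration; the demo would list the same file name in its .gitignore so the key never gets committed.

```csharp
using System;
using System.IO;

public static class ApiKeyFile
{
    // Hypothetical file name; the same name would be added to .gitignore.
    public const string DefaultPath = "github-api-key.txt";

    // Reads the API key from a local file instead of hard-coding it in source.
    public static string Load(string path = DefaultPath)
    {
        if (!File.Exists(path))
        {
            throw new FileNotFoundException(
                "Create this file and put your GitHub API key in it. It is ignored by Git.", path);
        }

        // Trim surrounding whitespace and the trailing newline most editors add.
        string key = File.ReadAllText(path).Trim();
        if (key.Length == 0)
        {
            throw new InvalidOperationException("The API key file '" + path + "' is empty.");
        }
        return key;
    }
}
```

The demo would then call ApiKeyFile.Load() once at startup and pass the result wherever the inline key is used today.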

Shad Storhaug (NightOwl888)

-----Original Message-----
From: [] On Behalf Of Itamar
Sent: Wednesday, November 9, 2016 6:45 AM
Subject: Lucene.NET 4.8 demo

Hey folks,

I just pushed a working demo for Lucene.NET 4.8 using the latest bits to index and search
public repositories on github. Check it out:

I also recorded a Channel 9 video walking through the demo - I will post it here again as
soon as it's released on the nets.

This should clarify some mysteries around the new-ish API and hopefully drive confidence in
what we consider a stable beta release.



Itamar Syn-Hershko | @synhershko
Freelance Developer & Consultant
Lucene.NET committer and PMC member