lucenenet-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENENET-565) Port Lucene.Net.Replicator
Date Fri, 28 Jul 2017 22:22:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENENET-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105791#comment-16105791 ]

ASF GitHub Bot commented on LUCENENET-565:
------------------------------------------

Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/209
  
    > Another thing is that we don't really hide much complexity for the developer but
instead take away his freedom to decide when and where indexes etc. should be initialized
or at least provide some pre-stage knowledge of it as each index would have to have a replicator
configured, these should then be accessible from where he writes to the index as they have
to be notified of changes...
    
    I disagree. We are not taking anything away from the user by doing this - they are always
free to drop to the lower-level Lucene APIs to do any low-level configuration of such features.
    
    But the grand plan is (hopefully...someday...with enough contributions) to have integration
packages with all of the .NET UI frameworks so application-level configuration can be done
at application startup using DI and so all of the boilerplate code that everyone writes to
add search to the application can be reduced to a fluent API configuration (the same way that
most .NET libraries do it). We know nearly every web app will need an IndexWriter per index
registered as a singleton. Why not do that with one (or two) `.AddSearch()` methods where the
configuration operations are specified as simple variables in one place (the place where every
other global feature of the app is configured)? We could easily provide a way to do most of
the common options that apply to 80-90% of users (and would save 80-90% of users from having
to deal with a complex API to get the simple features they want).
    
    I don't see any reason at all to expose the HttpRequest/HttpResponse in this API - there
are only a few ways this can go that are sensible. Sure, this might make sense at a low level
(particularly if this is the piece that we plug into every HTTP listener-capable framework),
but these details are not necessary for it to be useful for the end user (after all, they
are interacting with it through a URL).
    
    And sure, other features of Lucene need to interact with the UI directly and will need
their own integration APIs to interoperate. But certainly there is no question
that an HTTP listener is a once-per-application thing, not something that would ever be
registered in a controller. The client, on the other hand, may be a different story.
    
    Also, we should make the .NET version of replicator support interop with the Java Lucene
replicator. For that to work, the URL scheme should be the same as Lucene by default. The
[replication service documentation](https://lucene.apache.org/core/4_8_0/replicator/org/apache/lucene/replicator/http/ReplicationService.html)
clearly specifies this as:
    
    ```
    /<context>/<shard>/<action>
    ```
    
    Of course, we should probably provide a way to override this - routing conflicts happen.
But this is the logical default setting. I haven't looked into whether this is even configurable
in Java or would require a custom compile in order to get it to interoperate.
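    
    As a rough illustration of what matching that default scheme could look like on the .NET
side, here is a minimal sketch. `ReplicationPath` and `TryParse` are hypothetical names, not
part of Lucene.NET, and the action names (`obtain`, `release`, `update`) are assumed from the
Java `ReplicationService` documentation linked above. The context is passed in so that it
could be overridden.
    
    ```csharp
    using System;

    // Hypothetical helper, not part of Lucene.NET: splits a request path of the
    // form /<context>/<shard>/<action> into its shard and action parts.
    public static class ReplicationPath
    {
        // Action names assumed from the Java ReplicationService documentation.
        private static readonly string[] KnownActions = { "obtain", "release", "update" };

        public static bool TryParse(string path, string context, out string shard, out string action)
        {
            shard = "";
            action = "";
            if (path == null || context == null) return false;

            // Normalize the context so "replicate" and "/replicate/" behave the same.
            string prefix = "/" + context.Trim('/') + "/";
            if (!path.StartsWith(prefix, StringComparison.Ordinal)) return false;

            // Whatever follows the context must be exactly <shard>/<action>.
            string[] parts = path.Substring(prefix.Length).Split('/');
            if (parts.Length != 2 || parts[0].Length == 0) return false;

            string candidate = parts[1].ToLowerInvariant();
            if (Array.IndexOf(KnownActions, candidate) < 0) return false;

            shard = parts[0];
            action = candidate;
            return true;
        }
    }
    ```
    
    With a hypothetical context of `replicate`, the path `/replicate/s1/update` yields shard
`s1` and action `update`, while `/replicate/s1` or an unknown action is rejected.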
    
    Most likely the common use case for the server will be a standalone application that serves
as a server only. So, I would expect routing conflicts in this situation to be rare.
    
    Even if you go the path of using a controller for this (which is an option), we should
stay away from attribute routing for the simple reason that it is impossible to change after
it is compiled into the library. A better argument for never using it is the fact that routing
is *order sensitive* and .NET Reflection (which is how Attributes are read) by definition
has *undefined order*. I have answered who knows how many questions on StackOverflow for people
who have hit that landmine. The solution is to add an Order parameter to the attributes, but
once again if the attribute is compiled into the DLL there is no way to fix this problem.
    
    On the other hand, convention-based routing lets the AddLuceneReplication() method add its
route relative to the other routes, based on the order in which the methods are called at
application startup (it is unclear to me whether that is possible with
middleware - it should be...).
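    
    To make the order sensitivity concrete, here is a minimal sketch of a first-match-wins
route table. `RouteTable` is a hypothetical stand-in written for this example, not the ASP.NET
routing API, and the route prefixes and handler names are made up.
    
    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical first-match-wins route table; not the ASP.NET routing API.
    public sealed class RouteTable
    {
        private readonly List<KeyValuePair<string, string>> routes =
            new List<KeyValuePair<string, string>>();

        // Routes are tried in the order they are added.
        public RouteTable Add(string prefix, string handler)
        {
            routes.Add(new KeyValuePair<string, string>(prefix, handler));
            return this;
        }

        // Returns the handler of the first route whose prefix matches, or null.
        public string Match(string path)
        {
            foreach (var route in routes)
            {
                if (path.StartsWith(route.Key, StringComparison.Ordinal))
                    return route.Value;
            }
            return null;
        }
    }
    ```
    
    Calling `Add("/", "app")` before `Add("/replicate/", "replicator")` sends
`/replicate/s1/update` to `app`; swapping the two calls sends it to `replicator`. An
extension method called at startup lets the user choose that order.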
    
    I made a similar implementation in MvcSiteMapProvider for the [`XmlSiteMapController`](https://github.com/maartenba/MvcSiteMapProvider/blob/master/src/MvcSiteMapProvider/MvcSiteMapProvider/Web/Mvc/XmlSiteMapController.cs).
 It registers its own routes using convention-based routing and uses WebActivator to load
the routing for the controller. However, this was before Microsoft introduced the `Startup.cs`
class, where everything can be configured using extension methods, so this is not exactly
how I would do it now. Instead, I would give the user the ability to configure it in `Startup.cs`,
where other 3rd parties do it. That is also where I would provide extension method overloads
to configure alternate route URLs and any other advanced options that may be needed.
    
    The user of course always has the option to *not* call the method at application startup,
build their own controller, and dig into any of the more advanced options (assuming there
are any left that are not in the extension method overloads). But why should everyone have
to do this?
    
    >  it's apparently not that simple, and currently I don't fully get the idea behind
the design from the java implementation... The things I do get though is that it is a master/slave
implementation and a polling implementation.
    
    There is [some documentation](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.0/lucene/replicator/src/java/org/apache/lucene/replicator/package.html)
in the repo about how replicator is configured in Java, but it seems to be missing from the
[4.8.0 API docs](https://lucene.apache.org/core/4_8_0/replicator/index.html). 
    
    A quick search also reveals [this mailing list thread](https://lists.gt.net/lucene/java-user/225677)
which links to [a blog post](http://shaierera.blogspot.com/2013/05/the-replicator.html) that
seems to describe it in more detail.
    
    Does this help? If not, I suggest contacting the Lucene team (and link to the above) via
the Lucene user list to see if they can provide better answers on the intended workflow. I
skimmed it, but it is still not very clear to me where the client (who calls the server every
30 seconds) would need to be. Maybe a Quartz.net task? Or a Windows service/other type of
persistent app that runs when the computer is started? Certainly, that is the part that has
the most complexity - it needs to keep track of the URLs to call to replicate everything and
provide the commands to the replication servers.
    
    > Move EnumrableExtensions to Lucene.Support? or Lucene.Util?
    
    Lucene.Support is the only place where we are putting code that is not a port of something
from Java, so that would be the place to put it. 


> Port Lucene.Net.Replicator
> --------------------------
>
>                 Key: LUCENENET-565
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-565
>             Project: Lucene.Net
>          Issue Type: Task
>          Components: Lucene.Net.Replicator
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Shad Storhaug
>            Priority: Minor
>              Labels: features
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
