lucenenet-dev mailing list archives

From NightOwl888 <...@git.apache.org>
Subject [GitHub] lucenenet issue #209: RFC: LUCENENET-565: Porting of Lucene.Net.Replicator
Date Sat, 05 Aug 2017 22:14:54 GMT
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/209
  
    @jeme 
    
    I got a chance to take a deeper look at this structure, and I found a comment on that blog
post that holds the answer to what I think is the part we are struggling with:
    
    > The server can use ReplicationService to embed in a servlet which responds to HTTP
requests sent by HttpReplicator [on the client side]. The server also uses LocalReplicator
to manage the revisions. The indexing code on the server will call localReplicator.publish()
and the servlet (through ReplicationService) will call localReplicator.checkForUpdate, obtain
etc.
    
    > The clients can use ReplicationClient, with an HttpReplicator, to replicate index
changes from the server. The HttpReplicator is given the host:port of the server which manages
the index revisions. The code examples above show how it can be done.
    
    So, there are not 2 pieces to this, but 3:
    
    1. A ReplicationService (on the server), which indeed does nothing more than listen for
incoming requests. The files are replicated from the server to the client when it receives
a command.
    2. A LocalReplicator (on the server), which does the publishing of revisions.
    3. An HttpReplicator (on the client), which calls the ReplicationService with the commands.
It receives the files from the server and updates files local to the client.
    
    Since the ReplicationService (the HTTP listener) accepts an `IDictionary<string, IReplicator>`
through its constructor, and we know that an HTTP listener must be registered at application
startup, I think it is safe to assume that these replicators need to be registered as singletons
(best to do it with DI, but I suppose a static would also work). If you look at the [HttpReplicatorTest](https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.8.1/lucene/replicator/src/test/org/apache/lucene/replicator/http/HttpReplicatorTest.java#L48),
it registers this dependency at the class level (which essentially makes it a singleton for
the test).
    
    So, first we need to devise a way to register multiple replicators as singletons:
    
    ```c#
    services.AddFileReplication(r => r.WithShard("shard1").WithShard("shard2", new MyCustomReplicator()));
    ```
    
    Using a fluent builder API, you can register multiple shards with "speak-able" syntax.
Internally, this would just register a custom service that encapsulates a dictionary to hold
the configuration data similar to the `IHttpContextAccessor`.
    
    ```c#
        public interface IReplicatorAccessor
        {
            IDictionary<string, IReplicator> Replicators { get; }
            IReplicator GetReplicator(string shard);
        }
    
        public class ReplicatorAccessor : IReplicatorAccessor
        {
            public ReplicatorAccessor(IDictionary<string, IReplicator> replicators)
            {
                this.Replicators = replicators ?? throw new ArgumentNullException(nameof(replicators));
            }
    
            public IDictionary<string, IReplicator> Replicators { get; private set; }
    
            public IReplicator GetReplicator(string shard)
            {
                IReplicator result;
                Replicators.TryGetValue(shard, out result);
                return result;
            }
        }
    ```
    
    This is wired up with an extension method:
    
    ```c#
        public static class ServiceCollectionExtensions
        {
            public static void AddFileReplication(this IServiceCollection services,
                Func<ReplicatorBuilder, ReplicatorBuilder> expression)
            {
                if (services == null)
                    throw new ArgumentNullException(nameof(services));
                if (expression == null)
                    throw new ArgumentNullException(nameof(expression));
    
                var starter = new ReplicatorBuilder();
                var builder = expression(starter);
                AddFileReplication(services, builder.Build());
            }
    
            public static void AddFileReplication(this IServiceCollection services,
                IDictionary<string, IReplicator> replicators)
            {
                if (services == null)
                    throw new ArgumentNullException(nameof(services));
                if (replicators == null)
                    throw new ArgumentNullException(nameof(replicators));
    
                services.AddSingleton<IReplicatorAccessor>(new ReplicatorAccessor(replicators));
            }
        }
    ```
    
    The dictionary is built using a fluent builder. This is the most interesting part:
    
    ```c#
        public class ReplicatorBuilder
        {
            private readonly IDictionary<string, IReplicator> replicators;
    
            public ReplicatorBuilder()
                : this(new Dictionary<string, IReplicator>())
            {
            }
    
            public ReplicatorBuilder(IDictionary<string, IReplicator> replicators)
            {
                this.replicators = replicators ?? throw new ArgumentNullException(nameof(replicators));
            }
    
            public ReplicatorBuilder WithShard(string shard)
            {
                replicators.Add(shard, new LocalReplicator());
                return new ReplicatorBuilder(replicators);
            }
    
            public ReplicatorBuilder WithShard(string shard, IReplicator replicator)
            {
                replicators.Add(shard, replicator);
                return new ReplicatorBuilder(replicators);
            }
    
            public IDictionary<string, IReplicator> Build()
            {
                return replicators;
            }
        }
    ```
    
    With that one line registered at application startup, you can ask for the `IReplicatorAccessor`
by adding a constructor argument for it.
    
    ```c#
        public class ValuesController : Controller
        {
            private readonly IReplicatorAccessor replicatorAccessor;
    
            public ValuesController(IReplicatorAccessor replicatorAccessor)
            {
                this.replicatorAccessor = replicatorAccessor ?? throw new ArgumentNullException(nameof(replicatorAccessor));
            }
    ```
    
    And then you can use it to get the replicator instances by shard name.
    
    ```c#
            [HttpGet]
            public IEnumerable<string> Get()
            {
                IReplicator replicator = replicatorAccessor.GetReplicator("shard1");
                replicator.Publish(new IndexRevision(writer));
    
                return new string[] { "value1", "value2" };
            }
    ```
    
    Now, if the server needs to be registered as an HTTP listener, we just need another extension
method to do that, similar to @AndyPook 's example. Basically, we register a route that runs
our ReplicationService logic, which uses the `IReplicatorAccessor` to get the dictionary where
the replicators live and passes it to the `ReplicationService`.
    
    ```c#
        public static class ApplicationBuilderExtensions
        {
            public static void UseFileReplication(this IApplicationBuilder app)
            {
                UseFileReplication(app, string.Empty);
            }
    
            public static void UseFileReplication(this IApplicationBuilder app, string urlPrefix)
            {
                var routeBuilder = new RouteBuilder(app);
                string template = (string.IsNullOrWhiteSpace(urlPrefix) ? "" : urlPrefix + "/") +
                    "{context}/{shard}/{action}";
    
                routeBuilder.MapGet(template, async context =>
                {
                    var replicatorAccessor = app.ApplicationServices.GetService<IReplicatorAccessor>();
                    var replicationService = new ReplicationService(replicatorAccessor.Replicators);
                    replicationService.Perform(new AspNetCoreReplicationRequest(context.Request),
                        new AspNetCoreReplicationResponse(context.Response));
    
                    await Task.FromResult(0);
                });
    
                var routes = routeBuilder.Build();
                app.UseRouter(routes);
            }
        }
    ```
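    
    As a rough sketch of how the two extension methods might be wired together at startup:
    the `"replicate"` URL prefix is only a placeholder, and the sketch assumes the
    `AddFileReplication` and `UseFileReplication` methods shown above are in scope. `RouteBuilder`
    needs the routing services to be registered; `AddMvc()` already does that, so the explicit
    `AddRouting()` call only matters for an app without MVC.
    
    ```c#
        using Microsoft.AspNetCore.Builder;
        using Microsoft.Extensions.DependencyInjection;
    
        public class Startup
        {
            public void ConfigureServices(IServiceCollection services)
            {
                // Required by RouteBuilder; harmless if MVC has already added it.
                services.AddRouting();
                services.AddMvc();
    
                // Registers the IReplicatorAccessor singleton holding both shards.
                services.AddFileReplication(r => r.WithShard("shard1").WithShard("shard2"));
            }
    
            public void Configure(IApplicationBuilder app)
            {
                // Registered before MVC, so the replication route is tried first.
                // "replicate" is a hypothetical prefix chosen for this sketch.
                app.UseFileReplication("replicate");
                app.UseMvc();
            }
        }
    ```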
    
    So, in a nutshell you were right that there is more to the server than a simple HTTP listener.
But I was right that the HTTP listener doesn't have any runtime behavior and thus needs to
be registered at startup. So, we are both right :).
    
    We just needed a way to register the shards in a way that they can be used both at startup
and at runtime. But it looks like you were exactly on the right path where we need to go.
    
    BTW - not sure about the `await Task.FromResult(0)` in `UseFileReplication`. Maybe we should
provide a way to `await` the `Perform()` method?
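    
    One possible answer, sketched under the assumption that `Perform()` stays synchronous:
    drop `async` from the handler and return an already-completed task instead of awaiting
    a dummy one. `PerformSynchronously` below is a hypothetical stand-in for the
    `replicationService.Perform(...)` call, not part of any API. This does not make the work
    asynchronous; it only removes the no-op `await`.
    
    ```c#
        using System;
        using System.Threading.Tasks;
    
        public static class SyncHandlerSketch
        {
            // Hypothetical stand-in for the synchronous ReplicationService.Perform() call.
            private static void PerformSynchronously()
            {
                Console.WriteLine("performed");
            }
    
            // Same shape as a RequestDelegate body: does its work synchronously,
            // then hands back a task that is already complete.
            public static Task Handle()
            {
                PerformSynchronously();
                return Task.CompletedTask;
            }
        }
    ```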
    
    The `UseFileReplication` route above is horrible in terms of how it blocks the default MVC route - we need at least
one route constraint. The simplest option (although not as thorough as it should be) would
just be to make a constraint that matches the shard name against the keys in the `IReplicatorAccessor.Replicators.Keys`
property. Suggestions on how to make it a bit more reliable than that are welcome. Also, I wasn't
able to work out how to use a constraint in conjunction with a handler...more time (or suggestions)
needed.
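    
    A minimal sketch of what such a shard-name constraint could look like. `ShardRouteConstraint`
    is a hypothetical name; the only framework piece it relies on is the `IRouteConstraint`
    interface from `Microsoft.AspNetCore.Routing`. It does not address how to attach the
    constraint to the handler-based route.
    
    ```c#
        using System;
        using System.Collections.Generic;
        using System.Globalization;
        using Microsoft.AspNetCore.Http;
        using Microsoft.AspNetCore.Routing;
    
        // Hypothetical constraint: matches only when the constrained route value
        // (e.g. {shard}) is one of the registered shard names.
        public class ShardRouteConstraint : IRouteConstraint
        {
            private readonly ICollection<string> shardNames;
    
            public ShardRouteConstraint(ICollection<string> shardNames)
            {
                this.shardNames = shardNames ?? throw new ArgumentNullException(nameof(shardNames));
            }
    
            public bool Match(HttpContext httpContext, IRouter route, string routeKey,
                RouteValueDictionary values, RouteDirection routeDirection)
            {
                object value;
                if (!values.TryGetValue(routeKey, out value) || value == null)
                    return false;
    
                // Comparison follows the collection's own rules; for dictionary keys
                // that means the dictionary's comparer (case-sensitive by default).
                return shardNames.Contains(Convert.ToString(value, CultureInfo.InvariantCulture));
            }
        }
    ```
    
    It would be constructed with `replicatorAccessor.Replicators.Keys`, so any URL whose shard
    segment is not a registered shard falls through to the next route.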
    
    The client can simply do its work by calling the replicator API methods directly; there is
no need to build anything special into AspNetCore for that.
    
    As for the IndexWriters and IndexReaders, we can also provide similar extension methods
to register them as singletons and access them in services (but not in `Lucene.Net.Replicator`
- we'll build a package named `Lucene.Net.AspNetCore` for that).
    
    So, if you don't mind putting this together like I have above and cleaning up the Java
comments, I think this will be good enough to merge. Or, if you don't have the time, let me
know and I will get it done.
    
    There are some unfinished documentation comments and some methods that should be virtual
(in Java they are virtual by default), but I think I can manage these. The important thing
is that all of the tests pass.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
