lucenenet-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [lucenenet] NightOwl888 commented on issue #229: Updates docs build
Date Tue, 13 Aug 2019 13:20:26 GMT
NightOwl888 commented on issue #229: Updates docs build
URL: https://github.com/apache/lucenenet/pull/229#issuecomment-520831453
 
 
   > ### versioning
   >
   > I know that in the JavaDocToMarkdownConverter there's a TODO for passing in a tag/version,
which is for the method RepoLinkReplacer. However, this method looks for links with the
src-html syntax in files like overview.md, and as far as I can see there are only two places
in all of the source that contain this type of link, both in the Lucene.Net/overview.md
file, which is supposed to link to some demos. For this method it would probably be easier
just to fix this file, unless there are more inline links I'm unsure of.
   >
   > Apart from that is there another area where we need to have the version number/tag
injected in places?
   >
   > update: just noticed a version link here https://lucenenetdocs.azurewebsites.net/index.html#reference-documents
... so we'd need to pass a version into the build for that one, do you know of others?
   
   I did a search using Notepad++'s "Find in Files" feature, and here is the entire list:
   
   ```
   Search "src-html" (7 hits in 2 files)
     \\Boggle\F\Projects\_Test\lucene-solr-4.8.1\lucene\core\src\java\overview.html (2 hits)
   	Line 147: &nbsp;<a href="../demo/src-html/org/apache/lucene/demo/IndexFiles.html">IndexFiles.java</a> creates an
   	Line 151: &nbsp;<a href="../demo/src-html/org/apache/lucene/demo/SearchFiles.html">SearchFiles.java</a> prompts for
     \\Boggle\F\Projects\_Test\lucene-solr-4.8.1\lucene\demo\src\java\overview.html (5 hits)
   	Line 101:      <li><a href="src-html/org/apache/lucene/demo/IndexFiles.html">IndexFiles.java</a>: code to create a Lucene index.
   	Line 102:      <li><a href="src-html/org/apache/lucene/demo/SearchFiles.html">SearchFiles.java</a>: code to search a Lucene index.
   	Line 110: "src-html/org/apache/lucene/demo/IndexFiles.html">IndexFiles</a> class creates
   	Line 181: "src-html/org/apache/lucene/demo/SearchFiles.html">SearchFiles</a> class is
   	Line 186: "src-html/org/apache/lucene/demo/IndexFiles.html">IndexFiles</a> class as well)
   ```
   
   My thought on the versioning was to pretty much copy what Lucene did. You can access any
version by simply changing the version number in the URL:
   
   * https://lucene.apache.org/core/4_8_0/index.html
   * https://lucene.apache.org/core/7_4_0/index.html
   
   This should include beta versions. We don't want to remove documentation that might be
relevant only to a specific version while someone still depends on that version, especially
if there have been breaking API changes between versions.
   
   * https://lucenenet.somewhere.com/4.8.0-beta00005/index.html
   * https://lucenenet.somewhere.com/4.8.0-beta00006/index.html
   
   Each version of the docs should point only to its own version of the source (using the
tag):
   
   * https://lucenenet.somewhere.com/4.8.0-beta00005/index.html > https://github.com/apache/lucenenet/blob/Lucene.Net_4_8_0_beta00005/src/Lucene.Net.Analysis.Common/Analysis/Ar/ArabicAnalyzer.cs
   * https://lucenenet.somewhere.com/4.8.0-beta00006/index.html > https://github.com/apache/lucenenet/blob/Lucene.Net_4_8_0_beta00006/src/Lucene.Net.Analysis.Common/Analysis/Ar/ArabicAnalyzer.cs
   
   This ensures the docs for a specific version stay static and point to the right code for
that version even though the code is changing and new versions of docs are being released
over time. We don't want to point to the head of the repository, because by the time the reader
clicks the link, the doc could be years behind the code.
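
As an illustration of the scheme above, here is a minimal Python sketch that derives the docs URL and the tagged source link from a version string. The tag pattern (`Lucene.Net_` followed by the version with `.` and `-` replaced by `_`) is inferred from the two example links and is an assumption, as are the helper names. `lucenenet.somewhere.com` is the placeholder host used above.

```python
import re

# Placeholder host taken from the examples above; the real host is undecided.
DOCS_ROOT = "https://lucenenet.somewhere.com"
REPO_BLOB = "https://github.com/apache/lucenenet/blob"


def git_tag(version: str) -> str:
    """Map '4.8.0-beta00005' to 'Lucene.Net_4_8_0_beta00005' (assumed pattern)."""
    return "Lucene.Net_" + re.sub(r"[.\-]", "_", version)


def docs_url(version: str, page: str = "index.html") -> str:
    """URL of a page inside the docs directory for one specific version."""
    return f"{DOCS_ROOT}/{version}/{page}"


def source_url(version: str, repo_path: str) -> str:
    """Link to a source file at the tag for this version, never at HEAD."""
    return f"{REPO_BLOB}/{git_tag(version)}/{repo_path}"
```

Because both links are computed from the same version string, a published set of docs cannot end up pointing at code from a different release.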
   
   Perhaps there should even be an index page/directory listing at the root that shows all
of the versions that are available (at https://lucenenet.somewhere.com/). Also, it might make
sense to publish a copy of (or a redirect to) the latest version at https://lucenenet.somewhere.com/latest/
so that links in some places can stay fixed and never need updating.
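
A root index and a `latest` alias both need an ordering over version strings. The sketch below is one hypothetical way to do that in Python. It assumes versions look like `4.8.0` or `4.8.0-beta00005` and that pre-release labels are zero-padded, so plain string comparison orders them correctly. The function names and the markdown link layout are illustrative only.

```python
import re


def version_key(version: str):
    """Sort key: numeric parts first, then pre-releases before the final release."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:-(.+))?", version)
    if m is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch, pre = m.groups()
    # No pre-release label means a final release, which sorts after its betas.
    return (int(major), int(minor), int(patch), pre is None, pre or "")


def latest_version(versions):
    """Version that a /latest/ copy or redirect would point to."""
    return max(versions, key=version_key)


def index_lines(versions):
    """Markdown bullet list for the root index page, newest version first."""
    ordered = sorted(versions, key=version_key, reverse=True)
    return [f"* [{v}](/{v}/index.html)" for v in ordered]
```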
   
   Hosting them in a temporary location is fine, but they shouldn't be at the top level
of the site. They should be in a directory named for the version number (or at least
a form of it that is escaped in a way that works in a URL).
   
   ### building
   
   I am having issues getting this working. 
   
   1. The first obstacle I ran into was that it prompted for the credentials for the NuGet
feeds I have referenced that aren't public. I *think* the appropriate way to fix this is to
add a `NuGet.config` file to the appropriate folder to temporarily override what is configured
on the machine. It looks like you are trying to restore both `Lucene.Net.sln` and `LuceneDocsPlugins.sln`?
I was able to work around this by disabling those feeds on my machine.
   2. After that, it seems that `vswhere` isn't correctly identifying the location of MSBuild
on my machine. I only have VS2019 Community and VS2017 Community installed. I tried downloading
and installing the VS2015 build tools as per the comments, opening a new instance of PowerShell,
and running again, but I get the same result.
   
   ```powershell
   Windows PowerShell
   Copyright (C) Microsoft Corporation. All rights reserved.
   
   PS C:\Users\shad> f:
   PS F:\> cd projects/lucenenet
   PS F:\projects\lucenenet> ./websites/apidocs/docs.ps1 0 1
   
   
       Directory: F:\projects\lucenenet\websites\apidocs
   
   
   Mode                LastWriteTime         Length Name
   ----                -------------         ------ ----
   d-----        8/13/2019   5:28 PM                tools
   Cleaning tools...
   
   
       Directory: F:\projects\lucenenet\websites\apidocs\tools
   
   
   Mode                LastWriteTime         Length Name
   ----                -------------         ------ ----
   d-----        8/13/2019   5:43 PM                tmp
   d-----        8/13/2019   5:43 PM                docfx
   Retrieving docfx...
   d-----        8/13/2019   5:44 PM                nuget
   Download NuGet...
   d-----        8/13/2019   5:44 PM                vswhere
   Download VsWhere...
   Feeds used:
     https://api.nuget.org/v3/index.json
     C:\Program Files (x86)\Microsoft SDKs\NuGetPackages\
   
   Installing package 'vswhere' to 'F:\projects\lucenenet\websites\apidocs\tools\tmp'.
     GET https://api.nuget.org/v3/registration3-gz-semver2/vswhere/index.json
     OK https://api.nuget.org/v3/registration3-gz-semver2/vswhere/index.json 867ms
   
   
   Attempting to gather dependency information for package 'vswhere.2.7.1' with respect to project 'F:\projects\lucenenet\websites\apidocs\tools\tmp', targeting 'Any,Version=v0.0'
   Gathering dependency information took 25.38 ms
   Attempting to resolve dependencies for package 'vswhere.2.7.1' with DependencyBehavior 'Lowest'
   Resolving dependency information took 0 ms
   Resolving actions to install package 'vswhere.2.7.1'
   Resolved actions to install package 'vswhere.2.7.1'
   Retrieving package 'vswhere 2.7.1' from 'nuget.org'.
   Adding package 'vswhere.2.7.1' to folder 'F:\projects\lucenenet\websites\apidocs\tools\tmp'
   Added package 'vswhere.2.7.1' to folder 'F:\projects\lucenenet\websites\apidocs\tools\tmp'
   Successfully installed 'vswhere 2.7.1' to F:\projects\lucenenet\websites\apidocs\tools\tmp
   Executing nuget actions took 200.65 ms
   Cleaning...
   MSBuild path = C:\Program Files (x86)\Microsoft Visual Studio\2019\Community
   MSBuild not found!
   At F:\projects\lucenenet\websites\apidocs\docs.ps1:112 char:2
   +     throw "MSBuild not found!"
   +     ~~~~~~~~~~~~~~~~~~~~~~~~~~
       + CategoryInfo          : OperationStopped: (MSBuild not found!:String) [], RuntimeException
       + FullyQualifiedErrorId : MSBuild not found!
   ```
   Not sure where to go from here.
   
   Do we need to use MSBuild? Can this be done using dotnet.exe?
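
One observation on the log above: the value printed as `MSBuild path` is a Visual Studio installation root, not the path to `MSBuild.exe`. VS2017 places the executable under `MSBuild\15.0\Bin`, while VS2019 uses `MSBuild\Current\Bin`, so a lookup that assumes only one layout would miss the other. Whether that is what happens in `docs.ps1` is an assumption. The Python sketch below (helper names are hypothetical) only illustrates probing both layouts under a given root.

```python
import os
from pathlib import PureWindowsPath

# Relative locations of MSBuild.exe under a Visual Studio installation root.
MSBUILD_SUBPATHS = (
    r"MSBuild\Current\Bin\MSBuild.exe",  # layout used by VS2019
    r"MSBuild\15.0\Bin\MSBuild.exe",     # layout used by VS2017
)


def candidate_msbuild_paths(install_root):
    """List every place MSBuild.exe could live under the given root."""
    root = PureWindowsPath(install_root)
    return [str(root / sub) for sub in MSBUILD_SUBPATHS]


def find_msbuild(install_root, exists=os.path.isfile):
    """Return the first candidate that exists, or None if none do."""
    for path in candidate_msbuild_paths(install_root):
        if exists(path):
            return path
    return None
```

The `exists` parameter is injectable only so the lookup can be exercised without a Visual Studio installation.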
   
   ### code samples
   
   I had a thought about how we might automate the code samples more easily and reliably than
with a code converter, since the lack of `using` blocks makes a code converter's output largely
unusable anyway. If the Java code sample can be isolated as a single block of
text, we could generate a hash for it and put that hash into a text (markdown?) file along
with the sample, for example:
   
   ```md
   <hash>C34406A7F4070BC61B9256F6239E2B251CE691F83C2F4A6DD1ADC846FC9847A2</hash>
   <code language="java">
       Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
   
       // Store the index in memory:
       Directory directory = new RAMDirectory();
       // To store an index on disk, use this instead:
       //Directory directory = FSDirectory.open("/tmp/testindex");
       IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_CURRENT, analyzer);
       IndexWriter iwriter = new IndexWriter(directory, config);
       Document doc = new Document();
       String text = "This is the text to be indexed.";
       doc.add(new Field("fieldname", text, TextField.TYPE_STORED));
       iwriter.addDocument(doc);
       iwriter.close();
       
       // Now search the index:
       DirectoryReader ireader = DirectoryReader.open(directory);
       IndexSearcher isearcher = new IndexSearcher(ireader);
       // Parse a simple query that searches for "text":
       QueryParser parser = new QueryParser(Version.LUCENE_CURRENT, "fieldname", analyzer);
       Query query = parser.parse("text");
       ScoreDoc[] hits = isearcher.search(query, null, 1000).scoreDocs;
       assertEquals(1, hits.length);
       // Iterate through the results:
       for (int i = 0; i < hits.length; i++) {
         Document hitDoc = isearcher.doc(hits[i].doc);
         assertEquals("This is the text to be indexed.", hitDoc.get("fieldname"));
       }
       ireader.close();
       directory.close();
   </code>
   ```
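
The 64 hex digits in the example are consistent with a SHA-256 digest, although the algorithm is not stated. Assuming SHA-256, a minimal Python sketch of the hashing step might look like this. The whitespace normalization is an added assumption, so that line-ending or trailing-space differences alone do not register as a change. `normalize` and `sample_hash` are hypothetical names.

```python
import hashlib


def normalize(code: str) -> str:
    """Unify line endings, strip trailing spaces, drop outer blank lines."""
    lines = code.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    lines = [line.rstrip() for line in lines]
    while lines and not lines[0]:
        lines.pop(0)
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines)


def sample_hash(code: str) -> str:
    """Upper-case SHA-256 hex digest of the normalized sample (assumed algorithm)."""
    digest = hashlib.sha256(normalize(code).encode("utf-8"))
    return digest.hexdigest().upper()
```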
   
   Then these text files can be manually converted to C# and VB (the latter using the [Roslyn
online code converter](https://codeconverter.icsharpcode.net/)) and committed to the
lucenenet repo. A converted file may look something like:
   
   ```md
   <hash>C34406A7F4070BC61B9256F6239E2B251CE691F83C2F4A6DD1ADC846FC9847A2</hash>
   <code language="c#">
       Analyzer analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_CURRENT);
   
       // Store the index in memory:
       using (Directory directory = new RAMDirectory())
       // To store an index on disk, use this instead:
       //using (Directory directory = FSDirectory.Open("/tmp/testindex"))
       {
           IndexWriterConfig config = new IndexWriterConfig(LuceneVersion.LUCENE_CURRENT, analyzer);
           using (IndexWriter iwriter = new IndexWriter(directory, config))
           {
               Document doc = new Document();
               string text = "This is the text to be indexed.";
               doc.Add(new Field("fieldname", text, TextField.TYPE_STORED));
               iwriter.AddDocument(doc);
           }
   
           // Now search the index:
           using (DirectoryReader ireader = DirectoryReader.Open(directory))
           {
               IndexSearcher isearcher = new IndexSearcher(ireader);
   
               // Parse a simple query that searches for "text":
               QueryParser parser = new QueryParser(LuceneVersion.LUCENE_CURRENT, "fieldname", analyzer);
               Query query = parser.Parse("text");
               ScoreDoc[] hits = isearcher.Search(query, null, 1000).ScoreDocs;
               Assert.AreEqual(1, hits.Length);
               // Iterate through the results:
               for (int i = 0; i < hits.Length; i++)
               {
                   Document hitDoc = isearcher.Doc(hits[i].Doc);
                   Assert.AreEqual("This is the text to be indexed.", hitDoc.Get("fieldname"));
               }
           }
       }
   </code>
   <code language="vb">
       Dim analyzer As Analyzer = New StandardAnalyzer(LuceneVersion.LUCENE_CURRENT)
   
       Using directory As Directory = New RAMDirectory()
           Dim config As IndexWriterConfig = New IndexWriterConfig(LuceneVersion.LUCENE_CURRENT, analyzer)
   
           Using iwriter As IndexWriter = New IndexWriter(directory, config)
               Dim doc As Document = New Document()
               Dim text As String = "This is the text to be indexed."
               doc.Add(New Field("fieldname", text, TextField.TYPE_STORED))
               iwriter.AddDocument(doc)
           End Using
   
           Using ireader As DirectoryReader = DirectoryReader.Open(directory)
               Dim isearcher As IndexSearcher = New IndexSearcher(ireader)
               Dim parser As QueryParser = New QueryParser(LuceneVersion.LUCENE_CURRENT, "fieldname", analyzer)
               Dim query As Query = parser.Parse("text")
               Dim hits As ScoreDoc() = isearcher.Search(query, Nothing, 1000).ScoreDocs
               Assert.AreEqual(1, hits.Length)
   
               For i As Integer = 0 To hits.Length - 1
                   Dim hitDoc As Document = isearcher.Doc(hits(i).Doc)
                   Assert.AreEqual("This is the text to be indexed.", hitDoc.[Get]("fieldname"))
               Next
           End Using
       End Using
   </code>
   ```
   
   During doc generation, the hash can be regenerated from the Java code and checked
against this file. If it has not changed, no change is made and the code in the converted
text file is used in the documentation. If it has changed, the Java code block can
be inserted/appended to the text file, the hash updated to the new value, and a warning/log
entry generated so the code changes can be manually propagated to the C# and VB code blocks. After
that, we can manually remove the Java code block.
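
The check described above could be sketched roughly as follows. This is a hypothetical illustration, not a design for the actual doc generator: it assumes the `<hash>`/`<code>` layout from the example, assumes SHA-256, and introduces the names `sample_hash` and `sync_sample` purely for demonstration.

```python
import hashlib
import re

HASH_RE = re.compile(r"<hash>([0-9A-Fa-f]{64})</hash>")


def sample_hash(java_code: str) -> str:
    """Upper-case SHA-256 hex digest of the Java sample (assumed algorithm)."""
    return hashlib.sha256(java_code.encode("utf-8")).hexdigest().upper()


def sync_sample(sample_text: str, java_code: str):
    """Return (text, warning); warning is None when the Java code is unchanged."""
    current = sample_hash(java_code)
    match = HASH_RE.search(sample_text)
    if match and match.group(1).upper() == current:
        return sample_text, None  # unchanged: use the converted code as-is

    # Changed (or no hash yet): record the new hash and append the new Java
    # block so a person can propagate the change to the C# and VB blocks.
    if match:
        updated = HASH_RE.sub(f"<hash>{current}</hash>", sample_text, count=1)
    else:
        updated = f"<hash>{current}</hash>\n{sample_text}"
    if not updated.endswith("\n"):
        updated += "\n"
    updated += f'<code language="java">\n{java_code}\n</code>\n'
    return updated, "Java sample changed; update the C# and VB blocks by hand"
```

Returning the warning instead of printing it leaves the decision about where to report it (stdout, a log file, or both) to the caller.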
   
   We will need to do two passes to generate the docs if the original Java code changes, but
since that will only happen when we upgrade to target a new version of Lucene, it won't be a
common case. We should have the build process write warning messages to stdout and also to
a log that gets uploaded as a build artifact, just to ensure we don't miss this during deployment.
We could do the second stage offline:
   
   1. Download the build artifacts from the automated generation.
   2. If we have any code changes, update the files manually and commit those files to the
lucenenet official repo.
   3. Regenerate the docs locally based on the changes.
   4. Deploy.
   
   And of course, if there were no code changes, then we can just skip steps 2 and 3 and deploy.
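
For the "stdout plus uploaded log" part, a small Python sketch using the standard `logging` module is shown below. The logger name, message text, and function names are assumptions made for illustration; the return value of `report_changed_samples` tells the build whether the manual second stage is needed.

```python
import logging
import sys


def build_logger(log_path: str) -> logging.Logger:
    """Logger that writes every message to stdout and to a log file."""
    logger = logging.getLogger("docs-build")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter("%(levelname)s: %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_path, mode="w", encoding="utf-8"),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


def report_changed_samples(logger, changed_files) -> bool:
    """Warn once per changed sample; True means steps 2 and 3 are required."""
    for name in changed_files:
        logger.warning("Java sample changed, manual update needed: %s", name)
    return bool(changed_files)
```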
   
   We don't necessarily have to use `<hash>` and `<code>` elements; I am just
giving them as an example. Whatever is easiest to integrate into the doc generator would be
fine.
   
   Of course, it would be best if the end user had some way to switch between the VB and C#
code sample in the generated documents, but for now we should focus on C# if VB is going to
be too difficult or time consuming to deal with.
   
   The original doc could then have some specially constructed token that the doc generator
knows how to use to grab the code sample from the text file and insert it into the right place
in the generated HTML. Some more thought will need to be put into the exact layout and number
of text files in relation to the number of code samples per generated document, but I am sure
you can work that out. Maybe using a GUID as both a filename and a placeholder in the documentation
is the appropriate way to go. We don't want to use the hash for that, because it changes whenever
the Java sample changes.
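
A placeholder substitution along these lines could be sketched in Python as follows. The token syntax `<!-- sample:GUID -->`, the `lang-csharp` class name, and the `insert_samples` helper are all hypothetical. The samples are passed in as a dictionary keyed by GUID, standing in for one text file per GUID.

```python
import html
import re

# Hypothetical placeholder syntax: <!-- sample:3f2504e0-4f89-11d3-9a0c-0305e82c3301 -->
TOKEN_RE = re.compile(
    r"<!--\s*sample:"
    r"([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
    r"\s*-->"
)


def insert_samples(doc_text: str, samples: dict) -> str:
    """Replace each placeholder with the HTML-escaped sample for its GUID."""

    def replace(match):
        guid = match.group(1).lower()
        if guid not in samples:
            # Fail loudly so a missing sample cannot ship unnoticed.
            raise KeyError(f"no sample found for placeholder {guid}")
        code = html.escape(samples[guid].rstrip("\n"))
        return f'<pre><code class="lang-csharp">{code}</code></pre>'

    return TOKEN_RE.sub(replace, doc_text)
```

Keying on a GUID keeps the placeholder stable even when the Java sample, and therefore its hash, changes.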

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
