xmlgraphics-fop-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 46705] [PATCH] Enhancement: PDF Accessibility
Date Wed, 18 Feb 2009 23:22:10 GMT
https://issues.apache.org/bugzilla/show_bug.cgi?id=46705





--- Comment #14 from Andreas L. Delmelle <adelmelle@apache.org>  2009-02-18 15:22:09
PST ---
(In reply to comment #13)
> (In reply to comment #12)
> 
> I would assume: no.

FWIW, I just checked the XML and XPath specifications, and they answer that one
question regarding 'id'. If one specifies an 'xml:id', that would be a
different matter entirely. Since, in XSL-FO, the attribute is a plain
unprefixed 'id' (an FO property, in no namespace) rather than 'xml:id' in the
XML namespace, they have nothing to do with each other.

I agree with your assumption. There seems to be absolutely no guarantee that
generate-id() returns unique values across the two passes. Even worse, if the
processor implements generate-id() as a deterministic function (for example,
if the id value is based solely on an index that is incremented with every
element), then in both passes generate-id() would always return something like
"N0" for the root node, and the risk of collisions could become significant...
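
To make the collision risk concrete, here is a toy model in Python. It is not a
real XSLT processor: the generate_id() helper, the "N" + index naming scheme
and the node names are all assumptions made purely for illustration.

```python
# Toy model only: assume generate-id() returns "N" + the node's index in
# document order. Real processors are free to use any scheme.
def generate_id(nodes, node):
    return "N" + str(nodes.index(node))

# Pass 1: the source document. Only "block" receives an explicit id.
pass1_nodes = ["root", "wrapper", "block", "inline"]
explicit_ids = {"block": generate_id(pass1_nodes, "block")}

# Pass 2: the intermediate document no longer contains "wrapper", so every
# following node shifts down by one index.
pass2_nodes = ["root", "block", "inline"]
new_id = generate_id(pass2_nodes, "inline")

# "inline" now gets the same value that "block" received in pass 1.
collision = new_id in explicit_ids.values()
print(new_id, collision)
```

In this model, any structural difference between the two passes is enough to
make a freshly generated id coincide with one assigned earlier.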

Placing the stress on the XSLT processor could offer a way out:

[in addPtr.xsl]
<!-- key for the explicit ids -->
<xsl:key name="idKey" match="fo:*" use="@id" />
...
<xsl:template name="addId">
  <xsl:param name="checkId" select="generate-id()" />
  <xsl:choose>
    <xsl:when test="key('idKey',$checkId)">
       <xsl:call-template name="addId">
         <xsl:with-param name="checkId"
                         select="generate-id(key('idKey',$checkId))" />
       </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$checkId" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
...
<xsl:attribute name="id">
  <xsl:call-template name="addId" />
</xsl:attribute>
...

Untested for performance. The mere addition of the key could be problematic, as
the first call to key('idKey',...) cannot be evaluated unless the entire source
document is scanned.
I keep looking at that recursive call, and wonder whether it could lead to
problems. Strictly speaking, if the id starting the recursion is always unique
(which generate-id() guarantees by definition within a single document), then
the recursion should always stop at some point, and the path should never
visit the same node twice.
The depth will, in the most extreme case, be equal to the maximum number of
collisions, which is the same as the number of nodes that already have an
explicit id, or the size of the xsl:key map.

Assuming that the explicit ids added in the first pass adhere to the uniqueness
constraint:
If, for any given node, the recursion went through all possible collisions,
that would mean that the collision values all refer to each other. That is,
every value returned by generate-id() for a node in the key map is already
specified as an id on another node in that same map. That can happen at most N
times for N nodes with explicit ids, and it has already happened once, for a
node not in that map, to trigger the recursion. Since generated ids are
distinct per node, those N+1 generated values cannot all be among the N
explicit ids, so for at least one of the N nodes the generated id is not yet
occupied. Moreover, this most extreme case can present itself for at most one
node in the document. The more collisions with nodes not in the map, the less
deep the recursions will go.
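
The termination argument can be checked against a small model of the addId
recursion. This is a sketch in Python, not the stylesheet itself: add_id(), the
two dictionaries and all node and id names are hypothetical. It assumes that
explicit ids are unique and that generated ids are distinct per node.

```python
def add_id(node, generated, explicit):
    """Model of the addId template.

    generated: node -> value generate-id() would return for it (assumed distinct)
    explicit:  explicit id -> node carrying it (models the xsl:key map)
    Returns the id finally chosen and the number of recursive steps taken.
    """
    check_id = generated[node]
    depth = 0
    # Each iteration corresponds to one recursive call-template in the XSLT.
    while check_id in explicit:
        check_id = generated[explicit[check_id]]
        depth += 1
    return check_id, depth

# Worst case for N = 4: the generated ids chain through every explicit id.
n = 4
explicit = {f"id{i}": f"a{i}" for i in range(1, n + 1)}
generated = {"x": "id1"}              # node without an id; collides with a1
for i in range(1, n):
    generated[f"a{i}"] = f"id{i + 1}" # a1 collides with a2, and so on
generated[f"a{n}"] = "free"           # the last node's generated id is unused

result = add_id("x", generated, explicit)
print(result)
```

Under these assumptions the loop visits each node in the key map at most once,
so the depth never exceeds the size of the map.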

> <snip/>
> 
> Given all these observations, I think it would make sense to take the XSLT
> approach first (add an XSLT-generated ID if none exists) and then gather
> experience with this approach. This is easy to implement and can be extended
> later if we see the need.

Agreed.


-- 
Configure bugmail: https://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
