sis-commits mailing list archives

From "Martin Desruisseaux (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SIS-152) Consider using XSLT for handling XML documents compliant to different version of the standards
Date Thu, 05 Dec 2013 20:45:37 GMT

     [ https://issues.apache.org/jira/browse/SIS-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Desruisseaux updated SIS-152:
------------------------------------

    Description: 
OGC standards are updated once every few years. When a standard gets significant changes, OGC
publishes a new XML schema with a new namespace URI. For example, when upgrading from GML 3.1
to 3.2, the namespace URI changed from http://www.opengis.net/gml to http://www.opengis.net/gml/3.2.

The most straightforward way to support different GML versions with JAXB is to have a different
set of classes for each GML version. This is made easy by running the {{xjc}} compiler on
the XML schemas provided by OGC. But OGC/ISO standards have thousands of elements, and duplicating
all of them for every version has several drawbacks:

* Massive code duplication (hundreds of classes, many of them strictly identical except for
the namespace).
* Handling this class duplication requires either a series of "{{if (x instanceof Y)}}"
statements in every corner of SIS (inconceivable), or editing the {{xjc}} output to give
the classes a common parent class or interface.
* The namespaces of all versions appear in the {{xmlns}} attributes of the root element (we
cannot always create separate JAXB contexts), which is confusing and prevents the use of the
usual prefixes for all versions except one.
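
The second point can be illustrated with a minimal sketch. The class and interface names
below ({{CommonParentSketch}}, {{TimePosition}}, {{TimePosition31}}, {{TimePosition32}}) are
hypothetical stand-ins for {{xjc}} output, not actual SIS or OGC classes, and the JAXB
annotations are omitted to keep the example self-contained.

```java
public class CommonParentSketch {
    /** Hypothetical hand-written interface shared by every version. */
    public interface TimePosition {
        String getValue();
    }

    /** Stand-in for a class generated by xjc from the GML 3.1 schema. */
    public static final class TimePosition31 implements TimePosition {
        private final String value;
        public TimePosition31(String value) { this.value = value; }
        @Override public String getValue() { return value; }
    }

    /** Stand-in for the near-identical class generated from the GML 3.2 schema. */
    public static final class TimePosition32 implements TimePosition {
        private final String value;
        public TimePosition32(String value) { this.value = value; }
        @Override public String getValue() { return value; }
    }

    /** Without a common parent, each caller needs one branch per version. */
    public static String valueByInstanceOf(Object x) {
        if (x instanceof TimePosition31) return ((TimePosition31) x).getValue();
        if (x instanceof TimePosition32) return ((TimePosition32) x).getValue();
        throw new IllegalArgumentException("Unsupported type: " + x.getClass());
    }

    /** With the shared interface, one code path serves all versions. */
    public static String valueByInterface(TimePosition x) {
        return x.getValue();
    }
}
```

Either way, every generated class has to be touched by hand or wrapped, and the work grows
with each new version of the standard.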

An alternative is to support "natively" (through JAXB annotations) only one version of each
standard, and to transform XML documents at (un)marshalling time if the document uses a different
version of the standard. This is often done by defining an XSLT stylesheet to be executed by the
{{javax.xml.transform}} package. The following XSLT is an example that changes the namespace.

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl   = "http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan = "http://xml.apache.org/xslt"
                xmlns:gml   = "http://www.opengis.net/gml/3.2"
                version     = "1.0">
  <xsl:output method="xml" indent="yes" xalan:indent-amount="2"/>

  <!-- Identity copy. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Change elements namespace. -->
  <xsl:template match="gml:*">
    <xsl:element name='gml2:{local-name()}' namespace='http://www.opengis.net/gml'>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
{code}
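
For reference, a stylesheet like this one could be run from Java roughly as follows. This is
only a sketch: the class name {{NamespaceDowngrade}} and the sample document are made up for
illustration, the Xalan-specific output options are dropped, and the exact serialization
(including any repeated {{xmlns}} attributes) depends on the XSLT processor in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NamespaceDowngrade {
    /** Same two templates as the stylesheet above, without the output options. */
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:gml='http://www.opengis.net/gml/3.2' version='1.0'>"
      + "<xsl:template match='node()|@*'>"
      + "<xsl:copy><xsl:apply-templates select='node()|@*'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='gml:*'>"
      + "<xsl:element name='gml2:{local-name()}' namespace='http://www.opengis.net/gml'>"
      + "<xsl:apply-templates select='node()|@*'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given document and returns the result as text. */
    public static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical GML 3.2 fragment, used only to exercise the stylesheet.
        String xml = "<gml:Point xmlns:gml='http://www.opengis.net/gml/3.2'>"
                   + "<gml:pos>1 2</gml:pos></gml:Point>";
        System.out.println(transform(xml));
    }
}
```

A real integration would more likely chain the transformer with the JAXB (un)marshaller
instead of going through strings.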

However, {{javax.xml.transform}} is heavy and performs more transformations than desired. For
example, applying the above XSLT causes additional {{xmlns}} attributes to appear in all elements
under the root.

In its {{org.apache.sis.xml}} package, Apache SIS 0.4 takes a lighter approach based on custom
{{javax.xml.stream.XMLStreamReader}} and {{XMLStreamWriter}} implementations used as
"micro-transformers". This works for simple changes and reduces undesirable transformations.
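
The general idea of such a "micro-transformer" can be sketched with the standard
{{javax.xml.stream.util.StreamReaderDelegate}} class. This is not the SIS implementation:
the class name {{NamespaceUpgradingReader}} and the helper method are hypothetical, and the
sketch only renames element namespaces (attribute names and the namespace context are left
untouched).

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;

public class NamespaceUpgradingReader extends StreamReaderDelegate {
    static final String OLD_NS = "http://www.opengis.net/gml";
    static final String NEW_NS = "http://www.opengis.net/gml/3.2";

    NamespaceUpgradingReader(XMLStreamReader in) {
        super(in);
    }

    /** Replaces the old namespace URI by the new one; other URIs pass through. */
    private static String upgrade(String uri) {
        return OLD_NS.equals(uri) ? NEW_NS : uri;
    }

    @Override public String getNamespaceURI() {
        return upgrade(super.getNamespaceURI());
    }

    @Override public String getNamespaceURI(int index) {
        return upgrade(super.getNamespaceURI(index));
    }

    @Override public String getNamespaceURI(String prefix) {
        return upgrade(super.getNamespaceURI(prefix));
    }

    @Override public QName getName() {
        QName name = super.getName();
        return new QName(upgrade(name.getNamespaceURI()), name.getLocalPart(), name.getPrefix());
    }

    /** Returns the name of the first element, as seen through the wrapper. */
    public static QName firstElementName(String xml) {
        try {
            XMLStreamReader in = new NamespaceUpgradingReader(
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml)));
            while (in.hasNext()) {
                if (in.next() == XMLStreamConstants.START_ELEMENT) {
                    return in.getName();
                }
            }
            throw new IllegalArgumentException("No element found");
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }
}
```

A consumer reading through the wrapper sees the new namespace for elements even when the
document declares the old one, and no intermediate document is built.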

However, if future SIS versions need to handle more complicated changes, we may revisit this
choice and use XSLT. The purpose of this JIRA task is to record that switching to XSLT
may be worth considering. This is not something to do now: we are waiting for more experience
before determining whether XSLT is appropriate, given its cost and the output it produces.


  was:
OGC standards are updated once every few years. When a standard gets significant changes, OGC
publishes a new XML schema with a new namespace URI. For example, when upgrading from GML 3.1
to 3.2, the namespace URI changed from http://www.opengis.net/gml to http://www.opengis.net/gml/3.2.

The most straightforward way to support different GML versions with JAXB is to have a different
set of classes for each GML version. This is made easy by running the {{xjc}} compiler on
the XML schemas provided by OGC. But OGC/ISO standards have thousands of elements, and duplicating
all of them for every version has several drawbacks:

* Massive code duplication (hundreds of classes, many of them strictly identical except for
the namespace).
* Handling this class duplication requires either a series of "{{if (x instanceof Y)}}"
statements in every corner of SIS (inconceivable), or editing the {{xjc}} output to give
the classes a common parent class or interface.
* The namespaces of all versions appear in the {{xmlns}} attributes of the root element (we
cannot always create separate JAXB contexts), which is confusing and prevents the use of the
usual prefixes for all versions except one.

An alternative is to support "natively" (through JAXB annotations) only one version of each
standard, and to transform XML documents at (un)marshalling time if the document uses a different
version of the standard. This is often done by defining an XSLT stylesheet to be executed by the
{{javax.xml.transform}} package. The following XSLT is an example that changes the namespace.

{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl   = "http://www.w3.org/1999/XSL/Transform"
                xmlns:xalan = "http://xml.apache.org/xslt"
                xmlns:gml   = "http://www.opengis.net/gml/3.2"
                version     = "1.0">
  <xsl:output method="xml" indent="yes" xalan:indent-amount="2"/>

  <!-- Identity copy. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Change elements namespace. -->
  <xsl:template match="gml:*">
    <xsl:element name='gml2:{local-name()}' namespace='http://www.opengis.net/gml'>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
{code}

However, {{javax.xml.transform}} is heavy and performs more transformations than desired. For
example, applying the above XSLT causes additional {{xmlns}} attributes to appear in all elements
under the root.

In its {{org.apache.sis.xml}} package, Apache SIS 0.4 takes a lighter approach based on custom
{{javax.xml.stream.XMLStreamReader}} and {{XMLStreamWriter}} implementations used as
"micro-transformers". This works for simple changes and reduces undesirable transformations.

However, if future SIS versions need to handle more complicated changes, we may revisit this
choice and use XSLT. The purpose of this JIRA task is to record that switching to XSLT
may be worth considering. This is not something to do now: we are waiting for more experience
before determining whether XSLT is appropriate.



> Consider using XSLT for handling XML documents compliant to different version of the standards
> ----------------------------------------------------------------------------------------------
>
>                 Key: SIS-152
>                 URL: https://issues.apache.org/jira/browse/SIS-152
>             Project: Spatial Information Systems
>          Issue Type: Task
>          Components: Utilities
>    Affects Versions: 0.3
>            Reporter: Martin Desruisseaux
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v6.1#6144)
