From Ate Douma <...@douma.nu>
Subject [J2] Proposal: Handling encoding requirements for the Portal URL
Date Thu, 26 Aug 2004 01:48:20 GMT
Currently, the url encoding done by J2 isn't exactly fail-proof.
For the Struts Portlet Framework (now Portals Struts Bridge) I created a 
workaround to be able to use embedded url parameters, but even that turned out 
not to work in all situations.

Furthermore, there are portlet specification requirements which currently are 
not met.

I have been trying to create a fail-proof solution and have already done part of 
what I propose below.
But getting it completely right turned out to be more complex than I first 
thought it would be.
Therefore I first want to present my ideas on how to solve this.

I'm not sure I covered all requirements and maybe it's not fail-proof either, so 
please shoot as many holes in it as possible, because the J2 project I'm 
involved in has run into many problems because of this and I need to fix them asap.

First I would like to specify the (current) definition of a Jetspeed PortalURL 
as far as I have been able to determine from the code.

A Jetspeed PortalURL is built up out of 5 parts:


The [baseURL/basePath] defines the access point to the portal like:

The [/GlobalNavigation] defines site navigation to folders and pages like:
      [/], or [/default-page] or [/default-page.psml] for the default page
   or [/Administrative/pam]
   or [/rootfolder/subfolder/page]

The [/ControlParameters] define stateful, portlet-specific parameters like those 
for window mode, portlet mode and portlet request parameters.
These parameters are encoded in two different ways.
   Non-portlet request parameters:
   Portlet request parameters 

The control parameters prefix and type keys are all externally definable (in 
spring assembly jetspeed-spring.xml)

The [?RequestParameters] define the "normal" url parameters like those of an 
ActionURL and are encoded as usual:
   name=value definitions with first parameter prefixed with an '?',
additional parameters prefixed with '&'.

Finally, the [#LocalNavigation] is only used by a browser to refer to an embedded 
anchor location.

From the above definitions it becomes clear that the values of certain elements 
might cause havoc on the url parsing as done
by both the servlet environment and jetspeed.
Of course, defining the control parameter prefix as '/' clearly will break 
things, but also using a '/', '_', '?' or '&' within a request parameter
name or value will do so.
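
To make the problem concrete, here is a minimal sketch (not Jetspeed code; the
class NaiveQueryParser is hypothetical) of a parser that splits the
[?RequestParameters] part on '&' and '=' without any prior encoding. A value
that itself contains '&' is then read as two parameters:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class NaiveQueryParser {
    // Splits a query string on '&' and '=' without any decoding.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        // Intended: a single parameter "title" with the value "a&b=c".
        String query = "title=" + "a&b=c";
        // The parser sees two parameters: title=a and b=c.
        System.out.println(parse(query));
    }
}
```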

In the portlet 1.0 specification these kinds of problems have been recognized, 
hence PLT.7.1 SPEC 30:
   The portlet-container must "x-www-form-urlencoded" encode parameter names and
   values added to a PortletURL object.
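
For illustration: in Java, java.net.URLEncoder produces this encoding. The
sketch below is not Jetspeed code; the FormEncoding class and the choice of
UTF-8 as character encoding are assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
final class FormEncoding {
    // Assumption: UTF-8 is used for the byte representation.
    private static final String CHARSET = "UTF-8";

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, CHARSET);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, CHARSET);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds one name=value pair with both sides encoded.
    static String parameter(String name, String value) {
        return encode(name) + "=" + encode(value);
    }

    public static void main(String[] args) {
        System.out.println(parameter("title", "a&b=c")); // title=a%26b%3Dc
        System.out.println(encode("x/y?z"));             // x%2Fy%3Fz
        System.out.println(encode("a_b"));               // a_b
    }
}
```

Note that URLEncoder leaves '_' (as well as '.', '-' and '*') untouched, so
this encoding alone does not protect a '_' inside a value.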

Currently, Jetspeed 2 doesn't do this!
A light encoding algorithm is implemented which replaces '_', '.', '/', '\r', 
'\n', '<', '>' and ' ' with hexadecimal representations (0x1, 0x2 etc).
But, this is only done for RequestURL parameters and also not as complete as 
required by the spec. For instance, '?' and '&' characters are currently not 

Besides the parameter names and values, other elements also can break the url 
If a folder or page name in the [/GlobalNavigation] part starts with the 
ControlParameter prefix key character Jetspeed won't be able to determine them 

To solve these problems I propose the following solution:

1) The control parameter prefix key may not be one of: '/', '?', ' ', or '#'

2) To be able to clearly separate the [/GlobalNavigation] part from the 
[/ControlParameters] part, all [/GlobalNavigation] (path) elements which start 
with the control parameter prefix are encoded by putting an additional prefix 
character in front of it. This character of course then also may not be used
as control parameter prefix. A [/GlobalNavigation] path element already starting 
with this character must also be prefixed with it, escaping it.
Which character should be used is a matter of taste. I personally opt for '!'.
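
A minimal sketch of rule 2, assuming '!' as the escape character and, in the
usage, '_' as the control parameter prefix (both are only example choices; the
NavigationEscaper class and its method names are hypothetical and not part of
Jetspeed):

```java
// Hypothetical helper, for illustration only.
final class NavigationEscaper {
    // Assumption: '!' is the additional prefix character.
    static final char ESCAPE = '!';

    // Escapes one [/GlobalNavigation] path element before it is put in the url.
    static String escape(String element, char controlPrefix) {
        if (controlPrefix == ESCAPE) {
            throw new IllegalArgumentException(
                "control parameter prefix may not be the escape character");
        }
        if (!element.isEmpty()) {
            char first = element.charAt(0);
            // Elements starting with the control prefix or with the escape
            // character itself both get one extra escape character in front.
            if (first == controlPrefix || first == ESCAPE) {
                return ESCAPE + element;
            }
        }
        return element;
    }

    // Reverses escape(): strips exactly one leading escape character.
    static String unescape(String element) {
        if (!element.isEmpty() && element.charAt(0) == ESCAPE) {
            return element.substring(1);
        }
        return element;
    }

    // A raw (still escaped) path element starting with the control prefix
    // can now only be a control parameter.
    static boolean isControlParameter(String rawElement, char controlPrefix) {
        return !rawElement.isEmpty() && rawElement.charAt(0) == controlPrefix;
    }
}
```

With '_' as prefix, a page named "_admin" would appear in the url as "!_admin"
and a page named "!news" as "!!news"; neither can be mistaken for a control
parameter, and unescape() restores the original names.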

3) All the [/ControlParameters] portlet request parameter values must have the 
'_' character encoded, preferably with only a single character instead of using 
a multi-character hexadecimal style. This is because values might themselves 
contain such an encoding pattern, which then needs to be escaped.
I would prefer using a '!' again (no conflict with [/GlobalNavigation] encoding 
because these encoded values are never at the start of a url path element).
If we want to have a clearer distinction another character like '$' would also 
be good.
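
One possible way to realize rule 3 is sketched below. The text above only fixes
the idea of a single escape character; the concrete scheme here is an
assumption: '!' is the escape character, a literal '!' is written as "!!" and a
'_' as "!u". The UnderscoreEscaper class is hypothetical and not part of
Jetspeed.

```java
// Hypothetical helper, for illustration only.
final class UnderscoreEscaper {
    private static final char ESCAPE = '!';            // assumed escape character
    private static final char UNDERSCORE_MARKER = 'u'; // assumed marker for '_'

    // Removes every '_' from the value; a literal '!' is doubled so that
    // the result can always be decoded unambiguously.
    static String encode(String value) {
        StringBuilder out = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == ESCAPE) {
                out.append(ESCAPE).append(ESCAPE);
            } else if (c == '_') {
                out.append(ESCAPE).append(UNDERSCORE_MARKER);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Reverses encode(); rejects input that encode() could not have produced.
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder(encoded.length());
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c != ESCAPE) {
                out.append(c);
                continue;
            }
            if (i + 1 >= encoded.length()) {
                throw new IllegalArgumentException("dangling escape character");
            }
            char next = encoded.charAt(++i);
            if (next == ESCAPE) {
                out.append(ESCAPE);
            } else if (next == UNDERSCORE_MARKER) {
                out.append('_');
            } else {
                throw new IllegalArgumentException("unknown escape: !" + next);
            }
        }
        return out.toString();
    }
}
```

Because the escape character escapes itself, a value that already looks like an
encoded sequence (for example "!u") survives a round trip unchanged.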

4) As per the portlet specification, all the [/ControlParameters] parameter names 
and values as well as those in the [?RequestParameters] part need to be
"x-www-form-urlencoded" encoded which can be done with java.net.URLEncoder.

5) The url parameter separators '?' and '&' are also allowed to be specified 
using html escape definitions like &#38; and &amp; for '&' and &#63; for '?'.
If those are encoded using the URLEncoder they won't be recognized anymore so 
they must be properly decoded (into '?' and '&') *before* encoding using the 

6) Within Jetspeed, HttpServletRequest.getPathInfo() may not be used anymore 
because it will first decode the path string which can again break the
url parsing. Luckily, this can be easily circumvented by using the following 
   String pathInfo =

As said above, part of this solution I already have in place (4, 5 and 6).
Once we have a solid solution I will also do the required additional changes.



