portals-jetspeed-dev mailing list archives

From Raphaël Luta <raph...@apache.org>
Subject Re: [J2] Proposal: Handling encoding requirements for the Portal URL
Date Thu, 26 Aug 2004 15:03:42 GMT
Ate Douma wrote:
 > <snip>
> A Jetspeed PortalURL is built up out of 5 parts:
> [<baseURL>/<basePath>][/GlobalNavigation][/ControlParameters][?RequestParameters][#LocalNavigation]

> The [baseURL/basePath] defines the access point to the portal like:
>   [<http://localhost>/<jetspeed/portal>]
> The [/GlobalNavigation] defines site navigation to folders and pages like:
>      [/], or [/default-page] or [/default-page.psml] for the default page
>   or [/Administrative/pam]
>   or [/rootfolder/subfolder/page]
>   etc.
> The [/ControlParameters] define stateful, portlet-specific parameters 
> such as window mode, portlet mode and portlet request parameters.
> These parameters are encoded in two different ways.
>   Non-portlet request parameters:
>     <prefix><type>_<portletWindowId>/<value>
>   Portlet request parameters 
> <prefix><type>_<portletWindowId>_<parameterName>/<parameterCount>_<parameterValue1>[[_<parameterValue<x>]...]

 > The control parameters prefix and type keys are all externally definable
 > (in spring assembly jetspeed-spring.xml)
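As an illustration of the two control parameter formats quoted above, here is a minimal Java sketch that builds one path segment of each kind. The class name, method names and the sample prefix/type keys ("_", "mode", "rp") are hypothetical; the real keys are whatever is configured in jetspeed-spring.xml. The sketch deliberately performs no escaping, which makes it easy to see how a '_' or '/' inside a value makes the segment ambiguous.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: formats control parameter segments in the two quoted shapes. */
final class ControlParameterFormat {

    /** <prefix><type>_<portletWindowId>/<value> */
    static String nonPortletParam(String prefix, String type,
                                  String windowId, String value) {
        return prefix + type + "_" + windowId + "/" + value;
    }

    /** <prefix><type>_<portletWindowId>_<parameterName>/<count>_<value1>[_<valueN>...] */
    static String portletParam(String prefix, String type, String windowId,
                               String name, List<String> values) {
        StringBuilder sb = new StringBuilder();
        sb.append(prefix).append(type).append('_').append(windowId)
          .append('_').append(name).append('/').append(values.size());
        for (String value : values) {
            sb.append('_').append(value); // no escaping here, on purpose
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical keys: prefix "_", types "mode" and "rp".
        System.out.println(nonPortletParam("_", "mode", "7", "edit"));
        System.out.println(portletParam("_", "rp", "7", "city",
                                        Arrays.asList("Paris", "Lyon")));
    }
}
```

With an unescaped value such as "New_York", the second format can no longer be split back into its values reliably, which is the parsing problem discussed further down.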

AFAIK, the JSR168 does not mandate that these parameters be included in the URL, 
you could keep them only in the session and never send them back to the browser.
The only disadvantage I see of doing this is that it becomes impossible for the 
user to bookmark a portal page in a non-default state.

I personally would rather have clean, simple URLs and not be able to
bookmark pages in non-default states than have extremely long and unnatural
URLs sent to the user.

> The [?RequestParameters] define the "normal" url parameters like those 
> of an ActionURL and are encoded as usual:
>   name=value definitions with first parameter prefixed with an '?',
> additional parameters prefixed with '&'.
> Finally, the [#LocalNavigation] is only used by a browser to refer to an 
> embedded anchor location.

Agreed, but you may also want to consider that some app servers are configured
to track sessions through URL rewriting rather than cookies, so you probably
want to avoid any scheme that can confuse the most popular servers
out there (like Apache, Tomcat, etc.).

>  From the above definitions it becomes clear that the values of certain 
> elements might cause havoc on the url parsing as is done
> by both the servlet environment as well as by jetspeed.
> Of course, defining the control parameter prefix as '/' clearly will 
> break things, but also using a '/', '_', '?' or '&' within a request 
> parameter
> name or value will do so.
> In the portlet 1.0 specification these kinds of problems have been 
> recognized and therefore PLT.7.1 SPEC 30:
>   The portlet-container must "x-www-form-urlencoded" encode parameter 
> names and
>   values added to a PortletURL object.
> Currently, Jetspeed 2 doesn't do this!
> A light encoding algorithm is implemented which replaces '_', '.', '/', 
> '\r', '\n', '<', '>' and ' ' with hexadecimal representations (0x1, 0x2 
> etc).
> But, this is only done for RequestURL parameters and also not as 
> complete as required by the spec. For instance, '?' and '&' characters 
> are currently not encoded.
> Besides the parameter names and values, other elements also can break 
> the url (parsing).
> If a folder or page name in the [/GlobalNavigation] part starts with the 
> ControlParameter prefix key character Jetspeed won't be able to 
> determine them correctly.
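To make the spec requirement quoted above concrete, here is a small Java sketch of what "x-www-form-urlencoded" encoding does to the problematic characters, using java.net.URLEncoder. The class and method names are hypothetical. Note that '_' is one of the characters URLEncoder leaves untouched, so form encoding alone does not remove the '_' separator ambiguity in the control parameters.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch only: shows which separator characters form encoding escapes. */
final class FormEncodingDemo {

    static String formEncode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("a/b?c&d e"));  // prints a%2Fb%3Fc%26d+e
        System.out.println(formEncode("first_name")); // prints first_name
    }
}
```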

Agreed on the symptoms. What we do now is an ugly hack that needs to be fixed.

> To solve these problems I propose the following solution:
> 1) The control parameter prefix key may not be one of: '/', '?', ' ', or 
> '#'
> 2) To be able to clearly separate the [/GlobalNavigation] part from the 
> [/ControlParameters] part, all [/GlobalNavigation] (path) elements which 
> start with the control parameter prefix are encoded by putting an 
> additional prefix character in front of it. This character of course 
> then also may not be used
> as control parameter prefix. A [/GlobalNavigation] path element already 
> starting with this character must also be prefixed with it, escaping it.
> Which character should be used is a matter of taste. I personally opt 
> for '!'.
> 3) All the [/ControlParameters] portlet request parameter values must 
> have the '_' character encoded, preferably with only a single character 
> instead of using a multi-character hexadecimal style. This is because 
> values might themselves contain such an encoding pattern, which would then 
> need to be escaped.
> I would prefer using a '!' again (no conflict with [/GlobalNavigation] 
> encoding because these encoded values are never at the start of a url 
> path element).
> If we want to have a clearer distinction another character like '$' 
> would also be good.
> 4) As per the portlet specification, all the [/ControlParameters] 
> parameter name and values as well as those in the [?RequestParameters] 
> part need to be
> "x-www-form-urlencoded" encoded which can be done with java.net.URLEncoder.
> 5) The URL parameter separators '?' and '&' are also allowed to be 
> specified using HTML escape definitions like &#38; and &amp; for '&' and 
> &#63; for '?'.
> If those are encoded using the URLEncoder they won't be recognized 
> anymore so they must be properly decoded (into '?' and '&') *before* 
> encoding using the URLEncoder.
> 6) Within Jetspeed, HttpServletRequest.getPathInfo() may not be used 
> anymore because it will first decode the path string which can again 
> break the
> url parsing. Luckily, this can be easily circumvented by using the 
> following statement:
>   String pathInfo =
>     request.getRequestURI().substring(request.getContextPath().length()+
>                                       request.getServletPath().length());
 > As said above, part of this solution I already have in place (4, 5 and 6).
 > Once we have a solid solution I will also do the required additional
 > changes.
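Steps 2 to 4 of the quoted proposal can be sketched in Java roughly as follows. This is an illustration under stated assumptions, not the Jetspeed implementation: the class and method names are hypothetical, '!' is used as the escape character as suggested, and the value encoding applies URLEncoder first and only then replaces '_' with '!'. That ordering is an assumption made here; it works because URLEncoder never emits a literal '!' (it becomes %21), so the replacement stays reversible.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

/** Sketch only: escaping rules from steps 2-4, with '!' as escape character. */
final class PortalUrlEscaping {

    static final char ESCAPE = '!';

    /** Step 2: escape a navigation element that starts with the control prefix or '!'. */
    static String escapeNavigationElement(String element, char controlPrefix) {
        if (!element.isEmpty()
                && (element.charAt(0) == controlPrefix || element.charAt(0) == ESCAPE)) {
            return ESCAPE + element;
        }
        return element;
    }

    /** Reverse of step 2: drop one leading '!' if present. */
    static String unescapeNavigationElement(String element) {
        if (!element.isEmpty() && element.charAt(0) == ESCAPE) {
            return element.substring(1);
        }
        return element;
    }

    /** Steps 4 then 3: form-encode, then hide the '_' separator behind '!'. */
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8").replace('_', ESCAPE);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    /** Reverse: restore '_' first, then form-decode. */
    static String decodeValue(String encoded) {
        try {
            return URLDecoder.decode(encoded.replace(ESCAPE, '_'), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

For example, with '_' as a hypothetical control prefix, a folder named "_admin" would be written as "!_admin" in the navigation path, and a parameter value "a_b!c d" would travel as "a!b%21c+d". The decoder only sees the escaped form if the path is read from getRequestURI() as in step 6, since getPathInfo() would already have decoded it.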

While the above will work perfectly, I think you operate within an implicit 
constraint of having all [/ControlParameters] readable and parseable by the
browser/user. This is not a requirement IMO: even if we send
[/ControlParameters] back to the browser rather than keeping them in the
server session, only the server needs to be able to parse these parameters
when they are sent back.

Hence you can for example have the following transformation:
Unencoded control parameter string -> GZipFilter -> Base64Encoder* -> [EncodeControlParameter]

(* I believe '/' is a valid Base64 character, so we need to slightly modify the
encoding table to use '!' or '_' in its place)

and simply append the [EncodeControlParameter] at the end of the 
[GlobalNavigation] separated by a /.
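The transformation described above could look roughly like this in Java. This is a sketch under assumptions, not existing Jetspeed code: the class name is hypothetical, and it relies on java.util.Base64 (Java 8 and later), whose URL-safe alphabet replaces '+' and '/' with '-' and '_', which covers the modified encoding table mentioned in the note. Padding is dropped so that no '=' ends up in the path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Sketch only: control parameter string <-> single opaque path segment. */
final class ControlParameterCodec {

    /** gzip the string, then encode it with the URL-safe Base64 alphabet. */
    static String encode(String controlParameters) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(controlParameters.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getUrlEncoder().withoutPadding()
                     .encodeToString(buffer.toByteArray());
    }

    /** Reverse: Base64-decode the segment, then gunzip it. */
    static String decode(String segment) {
        byte[] compressed = Base64.getUrlDecoder().decode(segment);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[256];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical control parameter string, for illustration only.
        String segment = encode("_mode_7/edit/_rp_7_city/2_Paris_Lyon");
        System.out.println(segment);
        System.out.println(decode(segment));
    }
}
```

One caveat worth measuring before relying on this: the gzip format adds a fixed header and trailer, so for very short control parameter strings the encoded segment can come out longer than the input. The size benefit only shows up once several parameters are present.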

Your request URL then becomes a regular one, without any specific additional
delimiters, except that the last segment of the request path info is an encoded
string representing the [ControlParameter] that can easily be decoded by the
server.

Using a gzip filter also ensures we keep control of the URL length, since the
absolute minimum size of a control parameter is 11 chars and it is more likely
to run to 20-25 chars.

I personally value control over the URLs we use in the app and thus would 
prefer to solve these issues in this order of preference:
1- not sending control parameters at all and keeping them within the server session
2- sending a completely encoded control parameters string and appending it as the
    last component of the request path info
3- defining additional reserved markers and sub-components of the URL

However, if we decide to go for 3 I think your proposal completely fits the
bill :)

Raphael Luta - raphael@apache.org
Apache Jetspeed - Enterprise Portal in Java

