portals-jetspeed-dev mailing list archives

From David Sean Taylor <da...@bluesunrise.com>
Subject Re: ???NEW WebPageService/WebPageServlet NEW ???
Date Sat, 01 Mar 2003 05:48:50 GMT

On Friday, February 28, 2003, at 04:43  PM, David G. Powers wrote:

> I've read others' comments, given it a little thought, and looked at
> some of the tools mentioned in prior emails.
> If you're interested, please review my comments and reply.
> -------------------------------------------------------------------------------
> 1. What should this thing be called?
> -------------------------------------------------------------------------------
> I don't think anyone has used HttpProxyPortlet yet.  With a new name, a
> new thread can be started....
Proxy to me means a Proxy Server, which can get quite complicated.

> -------------------------------------------------------------------------------
> 2. What is the "preferred" HTML parser API to base the rewriter on?
> -------------------------------------------------------------------------------
> My inclination is to use the Swing API with a 1.3 JDK for parsing.
> Within each tag event I might use regular expressions to, for example,
> prepend a portlet-specific id to JavaScript function names and the HTML
> objects they act upon.
Why not a pluggable parser?
If I want to use Neko, why force me to use Swing?
Have you looked at the org.apache.jetspeed.util.rewriter package?
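
To make the pluggable idea concrete, here is a rough sketch. It is not the API of org.apache.jetspeed.util.rewriter; the names MarkupRewriter, RegexFunctionPrefixer and RewriterDemo are made up for illustration, and it assumes a modern JDK with java.util.regex and generics rather than 1.3. The portlet codes against one small interface, and a Swing-, Neko- or Tidy-backed rewriter could each implement it. The sample implementation uses no parser at all: it only prepends a portlet-specific id to declared JavaScript function names and their call sites, and it does not handle the HTML objects those functions act upon.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical plug-in point: each parser-backed rewriter implements this. */
interface MarkupRewriter {
    String rewrite(String markup, String portletId);
}

/** Illustrative plug-in that relies on regular expressions only. */
class RegexFunctionPrefixer implements MarkupRewriter {
    // Matches "function name(" and captures the name.
    private static final Pattern FUNCTION_DECL =
        Pattern.compile("function\\s+([A-Za-z_][A-Za-z0-9_]*)\\s*\\(");

    public String rewrite(String markup, String portletId) {
        // Pass 1: collect the names of all declared functions.
        Set<String> names = new LinkedHashSet<>();
        Matcher decl = FUNCTION_DECL.matcher(markup);
        while (decl.find()) {
            names.add(decl.group(1));
        }
        // Pass 2: prefix every whole-word occurrence of each name,
        // which covers both the declaration and its call sites.
        String result = markup;
        for (String name : names) {
            result = result.replaceAll(
                "\\b" + Pattern.quote(name) + "\\b",
                Matcher.quoteReplacement(portletId + "_" + name));
        }
        return result;
    }
}

class RewriterDemo {
    public static void main(String[] args) {
        // Swap in any other MarkupRewriter implementation here.
        MarkupRewriter rewriter = new RegexFunctionPrefixer();
        String html = "<script>function show(){}</script>"
                    + "<a onclick=\"show()\">x</a>";
        System.out.println(rewriter.rewrite(html, "p1"));
    }
}
```

The whole-word replacement is deliberately naive: it would also rename a matching word in ordinary page text, which is one reason to do this kind of rewriting inside parser tag events instead.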

> It looks like Neko would work better with an XHTML document.  The docs
> seem to indicate that malformed HTML isn't handled yet.  It has been my
> experience that you cannot expect a well-formed HTML document.
I've heard a lot of good about Neko. Haven't actually used it.
The other choice is the Tidy parser, which now supports SAX events I  

> I looked at Noodle only briefly.  There is a RequestFilterInterface
> which provides the content as one big byte array with some extra
> attributes stripped out (HTTP Headers).  Am I wrong in assuming that the
> filters are regular expressions that Noodle will iterate through and
> apply to the complete HTML document?  That seems horribly inefficient.

Yes, it does seem inefficient.

I recently read about an HTTP proxy cache project; see below.
Perhaps we can have a look at their code and see if it fits in with
what we're doing.

Pasted from the Jakarta General List:

We are five Italian programmers, and a few days ago we finished
"Puff", an HTTP cache proxy written in Java that has some interesting
features, such as a spider that prefetches web links and an option to
convert all images to black and white to give the client a faster
connection. This software is freely licensed (we haven't yet chosen
which license, but it is free in any case!!) and we think to jakarta
If it is possible, tell us how we should send it, the documentation, and
whatever else you want.

Best regards

Paparoni Federico

David Sean Taylor
Bluesunrise Software
+01 707 773-4646

