Posted to dev@activemq.apache.org by Aleksi Kallio <al...@csc.fi> on 2007/03/14 16:46:52 UTC

Re: BlobMessage support for sending out-of-band BLOBs

>> I had a bit of a look at the Jetty code and it doesn't seem to have a
>> simple 'file servlet' for supporting GET/PUT/DELETE of files in a
>> directory; so I was thinking we could contribute one to the Jetty6
>> tree and reuse that. e.g. maybe use the DefaultServlet in Jetty (which
>> does GET fine) then add PUT/DELETE support

I have this working (upload, download, remove) and have tested it with
files on the order of a couple of gigabytes.

I have two questions; all input is appreciated...


1) I'm currently using the standard Java URL / HttpURLConnection classes to
connect to the web server. Unfortunately there is a bug or design issue
that makes HttpURLConnection load everything into memory:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5026745
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4212479

That is definitely not appropriate for our purposes. Currently you can
also use chunked transfer mode
(HttpURLConnection.setChunkedStreamingMode()), and it works well with
Jetty. However, not all HTTP servers support this mode; I don't know
whether that is an issue. The bigger issue is that the method only exists
as of Java 1.5. In other words, is it unusable in the ActiveMQ context?
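
For illustration, a streamed PUT using chunked transfer mode and only
standard JRE classes might look roughly like the sketch below. The class
name ChunkedUpload, the 8 KB chunk size and the method names are
assumptions made for this example; they are not taken from the actual
patch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of ActiveMQ or Jetty.
final class ChunkedUpload {
    // Arbitrary chunk size chosen for this sketch.
    static final int CHUNK_SIZE = 8192;

    // Configures a PUT connection for chunked streaming without connecting yet.
    static HttpURLConnection prepare(URL target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);
        // Must be called before the connection is opened; available since Java 1.5.
        conn.setChunkedStreamingMode(CHUNK_SIZE);
        return conn;
    }

    // Copies the stream in fixed-size pieces so the whole body is never held in memory.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[CHUNK_SIZE];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Streams the body to the target URL and returns the HTTP status code.
    static int upload(URL target, InputStream body) throws IOException {
        HttpURLConnection conn = prepare(target);
        try {
            OutputStream out = conn.getOutputStream();
            try {
                copy(body, out);
            } finally {
                out.close();
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A caller would pass the file's InputStream and the URL of the file
server; whether a given server accepts chunked request bodies still has
to be checked against that server.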

I think the only solution would be to use some other HTTP package,
probably commons-httpclient. Or do you see other workarounds? Personally
I would have preferred it if this required nothing but standard JRE
classes. Unfortunately that seems impossible unless one writes the HTTP
code from scratch, which does not sound like a clever thing to do.


2) When files are stored on an intermediary file server, they are named
after their message IDs. Is that wise? Message IDs look a bit unfit
for filenames. Do you think these kinds of names might cause problems
in some environments or file systems?


Re: BlobMessage support for sending out-of-band BLOBs

Posted by Hiram Chirino <hi...@hiramchirino.com>.
On 3/14/07, Hiram Chirino <hi...@hiramchirino.com> wrote:
> The character restrictions will vary depending on the final store,
> different file systems have different restrictions.  So encoding the
> message id to a safe file name should be left up to the servlet
> managing that file system.
>

I could imagine someone using S3 buckets to store this stuff :)


-- 
Regards,
Hiram

Blog: http://hiramchirino.com

Re: BlobMessage support for sending out-of-band BLOBs

Posted by Hiram Chirino <hi...@hiramchirino.com>.
On 3/15/07, Aleksi Kallio <al...@csc.fi> wrote:
> > The character restrictions will vary depending on the final store,
> > different file systems have different restrictions.  So encoding the
> > message id to a safe file name should be left up to the servlet
> > managing that file system.
>
> I would keep the RestServlet as a generic file storage tool. Filename
> rewriting could be done with a servlet filter or by extending RestServlet?

Extending the servlet sounds good.

Re: BlobMessage support for sending out-of-band BLOBs

Posted by Aleksi Kallio <al...@csc.fi>.
> The character restrictions will vary depending on the final store,
> different file systems have different restrictions.  So encoding the
> message id to a safe file name should be left up to the servlet
> managing that file system.

I would keep the RestServlet as a generic file storage tool. Filename
rewriting could be done with a servlet filter or by extending RestServlet?

Re: BlobMessage support for sending out-of-band BLOBs

Posted by Hiram Chirino <hi...@hiramchirino.com>.
On 3/14/07, James Strachan <ja...@gmail.com> wrote:
> > Message ID's look a bit unfit
> > for filenames. Do you think in some environments/FS's these kind of
> > names might cause problems?
>
> Agreed. We should definitely make the message IDs both URI and file
> name friendly. So how about we search & replace the message ID string
> and switch some of the dodgy characters and replace them with
> something.
>
> e.g.
>
> switch : to a / maybe?
>
> Then each producer will be in a directory which will be inside a
> directory named after the connection ID etc.
>
> Thinking about it - it might make more sense to change ActiveMQ itself
> to create MessageId strings which use / rather than : maybe...
>

The character restrictions will vary depending on the final store;
different file systems have different restrictions. So encoding the
message ID to a safe file name should be left up to the servlet
managing that file system.


Re: BlobMessage support for sending out-of-band BLOBs

Posted by James Strachan <ja...@gmail.com>.
On 3/14/07, Aleksi Kallio <al...@csc.fi> wrote:
> >> I had a bit of a look at the Jetty code and it doesn't seem to have a
> >> simple 'file servlet' for supporting GET/PUT/DELETE of files in a
> >> directory; so I was thinking we could contribute one to the Jetty6
> >> tree and reuse that. e.g. maybe use the DefaultServlet in Jetty (which
> >> does GET fine) then add PUT/DELETE support
>
> I have this working (upload, download, remove) and have tested it with
> files in the order of a couple of gigabytes.

Awesome!


> I'd have two questions, all input is appreciated...
>
>
> 1) I'm currently using standard Java URL / HttpURLConnection stuff to
> connect to the web server. Unfortunately there is a bug or design issue
> that makes HttpURLConnection load everything into memory:
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5026745
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4212479
>
> That is definitely not appropriate for our purposes. Currently you can
> also use chunked transfer mode
> (HttpURLConnection.setChunkedStreamingMode()) and it works well with
> Jetty. However not all HTTP servers support this mode, I don't know if
> it is an issue. The bigger issue is that the method only exists in Java
> 1.5. In other words, it is not usable in the ActiveMQ context?
>
> I think the only solution would be to use some other HTTP package,
> probably commons-http. Or do you see some other workarounds? Personally
> I would have preferred if the thing would have not required anything but
> standard JRE classes. Unfortunately it seems impossible, unless one
> writes the HTTP code from scratch, which does not sound like a clever
> thing to do.

Good point. Trunk of ActiveMQ is currently Java 5 specific anyway, so
using chunked streaming mode sounds like a fine workaround for now.

We could always switch to using either the Jetty client stuff or
commons-httpclient if it's on the classpath. It'd be nice to keep the
dependency list as small as possible, yet work better when
commons-httpclient or jetty-client is on the classpath.


> 2) While storing files to an intermediary file server they are named
> after their message ID's. Is that wise?

Well, the message IDs are guaranteed to be system-wide unique strings,
so they're quite good for the names of the files - but...

> Message ID's look a bit unfit
> for filenames. Do you think in some environments/FS's these kind of
> names might cause problems?

Agreed. We should definitely make the message IDs both URI and file
name friendly. So how about we search the message ID string and
replace the dodgy characters with something safer?

e.g.

switch : to a / maybe?

Then each producer will be in a directory which will be inside a
directory named after the connection ID etc.

Thinking about it - it might make more sense to change ActiveMQ itself
to create MessageId strings which use / rather than : maybe...
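
As a rough sketch of the ':' to '/' idea, the mapping could be as small
as the helper below. The class name MessageIdPaths is hypothetical, and
the sample ID in the usage note is made up to show the general shape of
an ID; it is not a guaranteed format.

```java
// Hypothetical helper, not an existing ActiveMQ class.
final class MessageIdPaths {
    // Replaces every ':' with '/' so each segment of the ID becomes a path element.
    static String toPath(String messageId) {
        return messageId.replace(':', '/');
    }
}
```

With this, MessageIdPaths.toPath("ID:host-1234-0:1:2:3") gives
"ID/host-1234-0/1/2/3", so files from one producer end up under a
common parent directory. Only ':' is handled here; any other awkward
characters in an ID are left untouched.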

-- 

James
-------
http://radio.weblogs.com/0112098/

Re: BlobMessage support for sending out-of-band BLOBs

Posted by Hiram Chirino <hi...@hiramchirino.com>.
On 3/14/07, Aleksi Kallio <al...@csc.fi> wrote:
> >> I had a bit of a look at the Jetty code and it doesn't seem to have a
> >> simple 'file servlet' for supporting GET/PUT/DELETE of files in a
> >> directory; so I was thinking we could contribute one to the Jetty6
> >> tree and reuse that. e.g. maybe use the DefaultServlet in Jetty (which
> >> does GET fine) then add PUT/DELETE support
>
> I have this working (upload, download, remove) and have tested it with
> files in the order of a couple of gigabytes.
>

wow great!

> I'd have two questions, all input is appreciated...
>
>
> 1) I'm currently using standard Java URL / HttpURLConnection stuff to
> connect to the web server. Unfortunately there is a bug or design issue
> that makes HttpURLConnection load everything into memory:
>
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5026745
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4212479
>

bugger!

> That is definitely not appropriate for our purposes. Currently you can
> also use chunked transfer mode
> (HttpURLConnection.setChunkedStreamingMode()) and it works well with
> Jetty. However not all HTTP servers support this mode, I don't know if
> it is an issue. The bigger issue is that the method only exists in Java
> 1.5. In other words, it is not usable in the ActiveMQ context?
>

ActiveMQ 5.x will depend on Java 5 anyway, so I think you are safe!

> I think the only solution would be to use some other HTTP package,
> probably commons-http. Or do you see some other workarounds? Personally
> I would have preferred if the thing would have not required anything but
> standard JRE classes. Unfortunately it seems impossible, unless one
> writes the HTTP code from scratch, which does not sound like a clever
> thing to do.
>

I agree, let's just use HttpURLConnection with chunking for now.  If we
need to backport to older JVM versions, then we could just detect that
chunking is not available, and the downside would just be that you have
to load everything into memory.
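
One way to do that detection is a reflective lookup of the method, as
in the sketch below. The class and method names (ChunkingSupport,
enableIfAvailable) are made up for this example, and the code is
written in current Java syntax; a real backport to a pre-1.5 JVM would
need the older reflection idioms.

```java
import java.lang.reflect.Method;
import java.net.HttpURLConnection;

// Hypothetical helper, not an existing ActiveMQ class.
final class ChunkingSupport {
    // Turns on chunked streaming if the running JRE has the method.
    // Returns false when it is missing, in which case the caller falls
    // back to the default behaviour of buffering the body in memory.
    static boolean enableIfAvailable(HttpURLConnection conn, int chunkLength) {
        try {
            Method m = HttpURLConnection.class.getMethod("setChunkedStreamingMode", int.class);
            m.invoke(conn, chunkLength);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("Could not enable chunked streaming", e);
        }
    }
}
```

The call has to happen before the connection is opened, the same as a
direct call to setChunkedStreamingMode().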

>
> 2) While storing files to an intermediary file server they are named
> after their message ID's. Is that wise? Message ID's look a bit unfit
> for filenames. Do you think in some environments/FS's these kind of
> names might cause problems?
>

It can/will.  I would suggest that the servlet that creates/loads the
files does some sort of name encoding so that the name is safe for the
file system.
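
A minimal sketch of such an encoding, assuming a reversible scheme is
wanted: keep ASCII letters, digits and '-' as they are, and write every
other byte as '_' followed by two hex digits. The class name
SafeFileNames and the choice of '_' as the escape character are
assumptions for this example, not an agreed design.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing ActiveMQ or Jetty class.
final class SafeFileNames {
    private static final String HEX = "0123456789ABCDEF";

    // Encodes a message ID so it contains only ASCII letters, digits, '-' and '_'.
    static String encode(String messageId) {
        StringBuilder sb = new StringBuilder();
        for (byte b : messageId.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || c == '-';
            if (safe) {
                sb.append((char) c);
            } else {
                // Escape everything else, including '_' itself, as _XX.
                sb.append('_').append(HEX.charAt(c >> 4)).append(HEX.charAt(c & 0x0F));
            }
        }
        return sb.toString();
    }

    // Reverses encode(); expects input that encode() produced.
    static String decode(String fileName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < fileName.length(); i++) {
            char ch = fileName.charAt(i);
            if (ch == '_') {
                bytes.write(Integer.parseInt(fileName.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(ch);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

For a made-up ID such as "ID:host-1234-0:1:2", encode() yields
"ID_3Ahost-1234-0_3A1_3A2". The scheme keeps upper and lower case
distinct, so a case-insensitive file system would need a stricter
variant.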
