Posted to users@activemq.apache.org by Aleksi Kallio <al...@csc.fi> on 2007/02/14 14:30:43 UTC
FileMessage: we would like to contribute
There has been discussion regarding FileMessage or something similar: a
message for transferring large amounts of data.
For background, see:
http://issues.apache.org/activemq/browse/AMQ-1075
We need something that offers better service than streaming, of course
by building on top of it. That is why we would be interested in this and
ready to contribute our effort into making it happen.
We have to be capable of moving BIG files over JMS, on the order of
a couple of gigabytes. Normally they are a lot smaller, but the big
ones must be handled too.
Features/fixes we are looking for are:
1) streaming requires dedicated destination, which makes things complicated
2) resuming transfers
3) easy monitoring of progress
From the core developers we would appreciate feedback on how to proceed
with developing such a feature. From the rest of the community we would
gladly get feedback on how you would like this to be done and what kind
of API would be the best.
In the spirit of making ActiveMQ even better,
--
Aleksi Kallio, Application Architect, Scientific Software Development
P.O. BOX 405, 02101 Espoo, Finland, Tel +358 9 457 2297
CSC is the Finnish IT center for science, www.csc.fi
e-mail: aleksi.kallio@csc.fi
Re: FileMessage: we would like to contribute
Posted by Hiram Chirino <hi...@hiramchirino.com>.
I personally like options 2 and 3.
The message just contains a URL to where the whole stream can be
downloaded. The nice part about this is that you then have many
options for where you store the message. You could make the URL a reference
to the producer, and a consumer would just connect back to the
producer to get the stream. The producer could first upload the
message to the broker, and then pass a URL to the broker, and the
consumer would connect back to the broker for the stream. Or you could
upload the file to an HTTP server or even something like S3 and have
the consumer connect to that for the stream. In all these cases the
only thing that varies is where the producer uploads the data to. The
message and the consumer behave the same way in all the cases.
I think that the simplest case to implement which will provide the
most robust operation is to have the producer upload the stream to the
broker and pass a reference to the broker's blob and have clients
connect back to it.
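To illustrate the point that the consumer behaves the same way in all of these cases, here is a minimal Java sketch. FileReference and copy are hypothetical names made up for illustration, not part of ActiveMQ or JMS; the only assumption is that the message carries a URL the consumer is able to open.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not an ActiveMQ class.
final class UrlReferenceSketch {

    /** Hypothetical message payload: a reference to the data, not the data itself. */
    static final class FileReference {
        final URL location;

        FileReference(URL location) {
            this.location = location;
        }
    }

    /**
     * Consumer side: streams the referenced data into 'out' and returns the byte count.
     * The code is the same whether the URL points at the producer, the broker,
     * an HTTP server or a local file.
     */
    static long copy(FileReference ref, OutputStream out) throws IOException {
        try (InputStream in = ref.location.openStream()) {
            return in.transferTo(out); // streams in chunks, never holds the whole file
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for "wherever the producer uploaded the data": a temp file.
        Path tmp = Files.createTempFile("payload", ".bin");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
        try {
            FileReference ref = new FileReference(tmp.toUri().toURL());
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            long n = copy(ref, sink);
            System.out.println(n + " bytes received");
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Only the code that produces the URL would differ between the storage options; copy stays untouched.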
On 2/22/07, Aleksi Kallio <al...@csc.fi> wrote:
>
> I'll move this discussion over to dev@activemq.apache.org as it belongs
> there...
>
> Below I go on sketching how FileMessage might work. As you can see it is
> still quite sketchy in some parts so all comments and good ideas are
> welcome.
>
> Btw. is FileMessage a good name? In some cases we are not transferring
> files, but data in memory. Basically we are transferring a finite byte
> sequence. Is ByteSequenceMessage too awkward? I think it is maybe...
>
> > There are a few different possible implementations...
> >
> > (i) FileMessage ends up being a wrapper on top of the existing JMS streams
>
> This would be the preferred method in our case.
>
> Current streaming API uses a designated destination for the stream. If
> that destination is used for other purposes (other streams or messages)
> there has to be a way of separating current stream from the rest of the
> traffic.
>
> Streams support message selectors, which looks like the best solution.
> The problem is that if there are non-FileMessages sent to that
> destination also and listeners for them, they have to also use selectors
> to weed out FileMessage streams.
>
> It would be great if FileMessage would behave just like other types of
> messages. In the streaming case I guess it is not possible to achieve?
>
> > (ii) FileMessage uses some out of band transfer mechanism.
> > (iii) direct connection. This option is similar to (ii) but rather
> > than putting the file on some remote file server, the file stays on
> > the producer until the consumer has received it;
> >
> > The nice thing is I think all of these approaches can be handled
> > nicely by the single FileMessage API; then it can be a
> > configuration/policy issue as to exactly which implementation is used.
>
> If we look at the three implementations and what they would require from
> the two endpoints:
>
> 1. producer must inform what selector is to be used to receive data
> through JMS stream, consumers must acknowledge when they are ready to
> receive data
> 2. producer must place file available to external server and place URL
> to the message
> 3. producer must open a port and place URL to the message
>
> .. we see that streaming is maybe the trickiest one. If producer just
> starts streaming there is no guarantee (in the general case) that
> consumer will receive the whole stream from the beginning. Is that correct?
>
> In case 2, there is the question about pruning the files. Are they left
> to external server for ever? Or should consumer be capable of confirming
> the transfer, and after confirmation producer would remove the file? I
> don't think we should assume the consumer has a write access to file server.
>
> Case 3 is actually pretty straightforward. Do we also want to allow a push
> option where consumer opens the port and producer delivers the file?
>
> > In terms of getting started, the simplest route is probably to add the
> > API in first (to start with assuming just a URL to the file which is a
> > no brainer) then we should be able to start adding different providers
> > to suit.
>
> Yes, I think that's the best way to go forward. I'll write something
> based on that JIRA issue and send it to dev@activemq.apache.org for
> comments. Does that sound good?
>
>
--
Regards,
Hiram
Blog: http://hiramchirino.com
Re: FileMessage: we would like to contribute
Posted by Aleksi Kallio <al...@csc.fi>.
I'll move this discussion over to dev@activemq.apache.org as it belongs
there...
Below I go on sketching how FileMessage might work. As you can see it is
still quite sketchy in some parts so all comments and good ideas are
welcome.
Btw. is FileMessage a good name? In some cases we are not transferring
files, but data in memory. Basically we are transferring a finite byte
sequence. Is ByteSequenceMessage too awkward? I think it is maybe...
> There are a few different possible implementations...
>
> (i) FileMessage ends up being a wrapper on top of the existing JMS streams
This would be the preferred method in our case.
Current streaming API uses a designated destination for the stream. If
that destination is used for other purposes (other streams or messages)
there has to be a way of separating current stream from the rest of the
traffic.
Streams support message selectors, which looks like the best solution.
The problem is that if there are non-FileMessages sent to that
destination also and listeners for them, they have to also use selectors
to weed out FileMessage streams.
It would be great if FileMessage would behave just like other types of
messages. In the streaming case I guess it is not possible to achieve?
> (ii) FileMessage uses some out of band transfer mechanism.
> (iii) direct connection. This option is similar to (ii) but rather
> than putting the file on some remote file server, the file stays on
> the producer until the consumer has received it;
>
> The nice thing is I think all of these approaches can be handled
> nicely by the single FileMessage API; then it can be a
> configuration/policy issue as to exactly which implementation is used.
If we look at the three implementations and what they would require from
the two endpoints:
1. producer must announce which selector is to be used to receive data
through the JMS stream; consumers must acknowledge when they are ready to
receive data
2. producer must make the file available on an external server and put the
URL in the message
3. producer must open a port and put the URL in the message
.. we see that streaming is maybe the trickiest one. If producer just
starts streaming there is no guarantee (in the general case) that
consumer will receive the whole stream from the beginning. Is that correct?
In case 2, there is the question about pruning the files. Are they left
on the external server forever? Or should the consumer be capable of confirming
the transfer, and after confirmation the producer would remove the file? I
don't think we should assume the consumer has write access to the file server.
Case 3 is actually pretty straightforward. Do we also want to allow a push
option where consumer opens the port and producer delivers the file?
> In terms of getting started, the simplest route is probably to add the
> API in first (to start with assuming just a URL to the file which is a
> no brainer) then we should be able to start adding different providers
> to suit.
Yes, I think that's the best way to go forward. I'll write something
based on that JIRA issue and send it to dev@activemq.apache.org for
comments. Does that sound good?
Re: FileMessage: we would like to contribute
Posted by James Strachan <ja...@gmail.com>.
On 2/14/07, Aleksi Kallio <al...@csc.fi> wrote:
>
> There has been discussion regarding FileMessage or something similar: a
> message for transferring large amounts of data.
>
> For background, see:
> http://issues.apache.org/activemq/browse/AMQ-1075
>
> We need something that offers better service than streaming, of course
> by building on top of it. That is why we would be interested in this and
> ready to contribute our effort into making it happen.
>
> We have to be capable of moving BIG files over JMS, in the order of
> couple of gigabytes. Normally they are a lot smaller, but also the big
> ones must be handled.
>
> Features/fixes we are looking for are:
>
> 1) streaming requires dedicated destination, which makes things complicated
> 2) resuming transfers
> 3) easy monitoring of progress
>
>
> From core developers we would appreciate feedback on how to proceed
> with developing such a feature. From rest of the community we would
> gladly get feedback on how you would like this to be done and what kind
> of API would be the best.
>
>
> In the spirit of making ActiveMQ even better,
Sounds great! Welcome!
The comments on the JIRA issue describe the kind of API I was thinking of...
http://issues.apache.org/activemq/browse/AMQ-1075
There are a few different possible implementations...
(i) FileMessage ends up being a wrapper on top of the existing JMS streams
http://activemq.apache.org/jms-streams.html
(ii) FileMessage uses some out-of-band transfer mechanism, e.g. the
FileMessage essentially just carries around a URL and the file itself is
transferred via some FTP/HTTP server: the producer uploads the file to the
remote file/web server, then sends the message with a reference to it.
The benefit of this is that existing file/web servers can be used for the
actual file transfer piece, while the message broker is used for reliable
message load balancing and failover, i.e. making sure exactly one
consumer processes the message properly, in a transactional way etc.
(iii) direct connection. This option is similar to (ii) but rather
than putting the file on some remote file server, the file stays on
the producer until the consumer has received it; also the producer
listens on a socket for the consumer. Then the FileMessage includes a
URL which when opened on the consumer will essentially connect
directly to the producer of the file. So the FileMessage is a normal
JMS message sent around the JMS network (over clusters & network store
& forward etc). Then when it's finally received by a consumer, the
consumer opens the URL which in effect causes the consumer to connect
directly to the producer, to then stream the file.
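A rough Java sketch of option (iii), under stated assumptions: the producer serves the file over plain HTTP using the JDK's built-in com.sun.net.httpserver.HttpServer, and the URL is what would travel inside the message. The class name, the "/file" path and the use of HTTP at all are illustrative choices made here, not something the JIRA issue prescribes.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of a "direct connection" provider, not ActiveMQ code.
final class DirectConnectionSketch {

    /** Producer side: listen on an ephemeral loopback port and serve one file. */
    static HttpServer serve(Path file) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/file", exchange -> {
            long size = Files.size(file);
            // A length of -1 means "no response body", which is needed for empty files.
            exchange.sendResponseHeaders(200, size == 0 ? -1 : size);
            try (OutputStream body = exchange.getResponseBody()) {
                if (size > 0) {
                    Files.copy(file, body);
                }
            }
        });
        server.start();
        return server;
    }

    /** The URL the producer would place in the message. */
    static URL urlFor(HttpServer server) throws IOException {
        return URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/file").toURL();
    }

    /** Consumer side: opening the URL connects straight back to the producer. */
    static long fetch(URL url, OutputStream out) throws IOException {
        try (InputStream in = url.openStream()) {
            return in.transferTo(out);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("payload", ".bin");
        Files.write(tmp, "direct".getBytes(StandardCharsets.UTF_8));
        HttpServer server = serve(tmp);
        try {
            long n = fetch(urlFor(server), new ByteArrayOutputStream());
            System.out.println(n + " bytes fetched from producer");
        } finally {
            server.stop(0); // the producer keeps the file until the consumer has it
            Files.delete(tmp);
        }
    }
}
```

In a real provider the producer would need an address reachable from the consumer rather than loopback, plus some way to learn that the transfer finished before releasing the file.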
All of these options have different strengths and weaknesses. I think
for many use cases (iii) is good but I could see folks liking (i) and
(ii) as well.
The nice thing is I think all of these approaches can be handled
nicely by the single FileMessage API; then it can be a
configuration/policy issue as to exactly which implementation is used.
In terms of getting started, the simplest route is probably to add the
API in first (to start with assuming just a URL to the file which is a
no brainer) then we should be able to start adding different providers
to suit.
BTW we might also want to support senders providing a File rather than
an InputStream if you want to be able to monitor progress (as with a
stream you've no idea how long it's gonna be)
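A small Java sketch of why a File helps with progress monitoring: the total length is known up front, so a counting wrapper can report a fraction, whereas a bare InputStream can only report bytes so far. ProgressInputStream is a hypothetical helper written for this example, not an existing ActiveMQ class.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.OptionalDouble;

// Hypothetical helper: counts bytes read and reports progress when the total is known.
final class ProgressInputStream extends FilterInputStream {
    private final long totalBytes; // negative means "unknown length"
    private long bytesRead;

    ProgressInputStream(InputStream in, long totalBytes) {
        super(in);
        this.totalBytes = totalBytes;
    }

    /** With a File the length is known, so a fraction can be reported. */
    static ProgressInputStream forFile(File file) throws IOException {
        return new ProgressInputStream(new FileInputStream(file), file.length());
    }

    /** With a bare stream the length is unknown. */
    static ProgressInputStream forStream(InputStream in) {
        return new ProgressInputStream(in, -1);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            bytesRead++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // read(byte[]) in FilterInputStream delegates here, so nothing is counted twice.
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead += n;
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        bytesRead += skipped;
        return skipped;
    }

    long bytesRead() {
        return bytesRead;
    }

    /** Fraction in [0, 1] when the total is known, empty otherwise. */
    OptionalDouble fraction() {
        if (totalBytes < 0) {
            return OptionalDouble.empty();
        }
        if (totalBytes == 0) {
            return OptionalDouble.of(1.0);
        }
        return OptionalDouble.of((double) bytesRead / totalBytes);
    }
}
```

A sender could wrap the file with ProgressInputStream.forFile(file) and poll fraction() from another thread or a listener; mark/reset is not accounted for in this sketch.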
--
James
-------
http://radio.weblogs.com/0112098/