You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@cocoon.apache.org by Gianugo Rabellino <gi...@apache.org> on 2003/07/31 16:35:16 UTC
Flow's processPipelineTo() and FileSource
I'm having a hell of a time using flow with processPipelineTo() and
OutputStreams coming out from FileSource(s).
The problem is that FileSource#getOutputStream() creates a temporary
file (... to be discussed later ...) and such file gets renamed to the
original one only upon OutputStream.close(). Now, AbstractInterpreter,
line 201, actually calls flush() but *never* close. As a result,
everything is kinda ... well... screwed up.
Patch is trivial, but I'm wondering if adding out.close() in
AbstractInterpreter.java might break something: any flow experts around?
Now for the FileSource: I do understand *some* of the reasoning behind
using a temporary file, but I have to disagree on the implementation:
naming it [filename].tmp is a bit of a bet, since someone might
legitimately have such a filename around. While I understand that there
might be memory issues with large files, I guess that either:
1. keeping a ByteArrayOutputStream;
2. forget about it and just write the file;
3. use a more "clever" name that doesn't risk conflicts this much
are all better options.
Is that OK to you if I work on it? I don't know if I have access to the
Excalibur CVS though...
Ciao,
--
Gianugo Rabellino
Pro-netics s.r.l. - http://www.pro-netics.com
Orixo, the XML business alliance - http://www.orixo.com
(Now blogging at: http://blogs.cocoondev.org/gianugo/)
Re: Flow's processPipelineTo() and FileSource
Posted by Gianugo Rabellino <gi...@apache.org>.
Sylvain Wallez wrote:
>> I'm having a hell of a time using flow with processPipelineTo() and
>> OutputStreams coming out from FileSource(s).
>>
>> The problem is that FileSource#getOutputStream() creates a temporary
>> file (... to be discussed later ...) and such file gets renamed to the
>> original one only upon OutputStream.close(). Now, AbstractInterpreter,
>> line 201, actually calls flush() but *never* close. As a result,
>> everything is kinda ... well... screwed up.
>>
>> Patch is trivial, but I'm wondering if adding out.close() in
>> AbstractInterpreter.java might break something: any flow experts around?
>
> I don't see why there should be some consequences on the flow itself...
> Just replace flush() by close() !
Just did it, but I didn't replace flush(), just added close()
afterwards: it's better to be sure that there are no leftovers...
>> Now for the FileSource: I do understand *some* of the reasoning behind
>> using a temporary file, but I have to disagree on the implementation:
>> naming it [filename].tmp is a bit of a bet, since someone might
>> legitimately have such a filename around. While I understand that
>> there might be memory issues with large files, I guess that either:
>>
>> 1. keeping a ByteArrayOutputStream;
>> 2. forget about it and just write the file;
>> 3. use a more "clever" name that doesn't risk conflicts this much
>
>
>
> I would avoid 2. The reason why I used a temporary file is because of
> the streamed nature of Cocoon pipelines. If an error occurs within the
> processing, the original content is not partially overwritten. My
> preference would go to 3.
I see and understand. Yet temporary files, besides being somehow
inconvenient, can be a major security hole in general. I'd rather go for
1, then, accumulating bytes as they come on a ByteArrayOutputStream and
writing them upon close() (and maybe flush() too?). True, this is in
turn a possible security hole since someone might DOS the machine by
processing gigabyte-sized files, but all in all I tend to think that
it's a better solution... and yes, doing transaction on a filesystem is
a PITA. :-)
Ciao,
>> are all better options.
>>
>> Is that OK to you if I work on it? I don't know if I have access to
>> the Excalibur CVS though...
>
> As a Cocoon committer, you should.
I understand that I am authorized in line of principle, just don't know
if I need to be explicitely enabled. Anyway, I'll check it out. :-)
Thanks for everything,
--
Gianugo Rabellino
Pro-netics s.r.l. - http://www.pro-netics.com
Orixo, the XML business alliance - http://www.orixo.com
(Now blogging at: http://blogs.cocoondev.org/gianugo/)
Re: Flow's processPipelineTo() and FileSource
Posted by Gianugo Rabellino <gi...@apache.org>.
Christopher Oliver wrote:
> Why can't you just call "close()" in your flowscript? I don't see
> anything in the contract of processPipelineTo() that indicates that it
> should close the stream. In my opinion, calling "flush()" but not
> "close()" as in the original implementation is correct.
On second thought, yes, you're right. The OutputStream might be reused
later, and if you call close() in the flow you miss that opportunity.
I'm reverting the patch right now: next on the list is finding out why
close() in flowscript was giving me problems, while putting it there
solved everything. Probably something wrong in FileSource.
Thanks,
--
Gianugo Rabellino
Pro-netics s.r.l. - http://www.pro-netics.com
Orixo, the XML business alliance - http://www.orixo.com
(Now blogging at: http://blogs.cocoondev.org/gianugo/)
Re: Flow's processPipelineTo() and FileSource
Posted by Christopher Oliver <re...@verizon.net>.
Why can't you just call "close()" in your flowscript? I don't see
anything in the contract of processPipelineTo() that indicates that it
should close the stream. In my opinion, calling "flush()" but not
"close()" as in the original implementation is correct.
My $0.02,
Chris
Sylvain Wallez wrote:
> Gianugo Rabellino wrote:
>
>> I'm having a hell of a time using flow with processPipelineTo() and
>> OutputStreams coming out from FileSource(s).
>>
>> The problem is that FileSource#getOutputStream() creates a temporary
>> file (... to be discussed later ...) and such file gets renamed to
>> the original one only upon OutputStream.close(). Now,
>> AbstractInterpreter, line 201, actually calls flush() but *never*
>> close. As a result, everything is kinda ... well... screwed up.
>>
>> Patch is trivial, but I'm wondering if adding out.close() in
>> AbstractInterpreter.java might break something: any flow experts around?
>
>
>
> I don't see why there should be some consequences on the flow
> itself... Just replace flush() by close() !
>
>> Now for the FileSource: I do understand *some* of the reasoning
>> behind using a temporary file, but I have to disagree on the
>> implementation: naming it [filename].tmp is a bit of a bet, since
>> someone might legitimately have such a filename around. While I
>> understand that there might be memory issues with large files, I
>> guess that either:
>>
>> 1. keeping a ByteArrayOutputStream;
>> 2. forget about it and just write the file;
>> 3. use a more "clever" name that doesn't risk conflicts this much
>
>
>
> I would avoid 2. The reason why I used a temporary file is because of
> the streamed nature of Cocoon pipelines. If an error occurs within the
> processing, the original content is not partially overwritten. My
> preference would go to 3.
>
>> are all better options.
>>
>> Is that OK to you if I work on it? I don't know if I have access to
>> the Excalibur CVS though...
>
>
>
> As a Cocoon committer, you should.
>
> Sylvain
>
Re: Flow's processPipelineTo() and FileSource
Posted by Sylvain Wallez <sy...@anyware-tech.com>.
Gianugo Rabellino wrote:
> I'm having a hell of a time using flow with processPipelineTo() and
> OutputStreams coming out from FileSource(s).
>
> The problem is that FileSource#getOutputStream() creates a temporary
> file (... to be discussed later ...) and such file gets renamed to the
> original one only upon OutputStream.close(). Now, AbstractInterpreter,
> line 201, actually calls flush() but *never* close. As a result,
> everything is kinda ... well... screwed up.
>
> Patch is trivial, but I'm wondering if adding out.close() in
> AbstractInterpreter.java might break something: any flow experts around?
I don't see why there should be some consequences on the flow itself...
Just replace flush() by close() !
> Now for the FileSource: I do understand *some* of the reasoning behind
> using a temporary file, but I have to disagree on the implementation:
> naming it [filename].tmp is a bit of a bet, since someone might
> legitimately have such a filename around. While I understand that
> there might be memory issues with large files, I guess that either:
>
> 1. keeping a ByteArrayOutputStream;
> 2. forget about it and just write the file;
> 3. use a more "clever" name that doesn't risk conflicts this much
I would avoid 2. The reason why I used a temporary file is because of
the streamed nature of Cocoon pipelines. If an error occurs within the
processing, the original content is not partially overwritten. My
preference would go to 3.
> are all better options.
>
> Is that OK to you if I work on it? I don't know if I have access to
> the Excalibur CVS though...
As a Cocoon committer, you should.
Sylvain
--
Sylvain Wallez Anyware Technologies
http://www.apache.org/~sylvain http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance - http://www.orixo.com