Posted to dev@cocoon.apache.org by Ovidiu Predescu <ov...@cup.hp.com> on 2001/08/14 08:36:03 UTC

[C2] [2.1-dev] proposed changes to the Source interface

Hi,

I was looking at how the current Source interface is defined, and I
believe we need to separate things a little bit more. I badly need
this separation in one of the extensions to Cocoon I'm working on
(which I hope to present sometime early next month).

There are four distinct things the Source interface deals with right
now:

a) the real input source, its last modified date, and content length

b) determining whether the source is a file, and obtaining the file

c) streaming the content of the source to a ContentHandler

d) the ability to refresh a Source

IMO the Source interface should deal only with a). Source should be an
abstraction for content, regardless of whether it is a file or whether
it contains XML data.

A quick grep of the sources shows that the only place that uses the
file characteristics of Source is DirectoryGenerator. However, a simple
workaround can be implemented by asking the Source for its system id
and determining the type of the Source from there.
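
A minimal sketch of that workaround, assuming the system id is a URI
string. The SystemIds class and its isFile() method are hypothetical
names made up for illustration, not part of Cocoon:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Cocoon: guesses whether a system id
// refers to a local file by inspecting its URI scheme.
final class SystemIds {
    private SystemIds() {}

    static boolean isFile(String systemId) {
        if (systemId == null) {
            return false;
        }
        try {
            // getScheme() returns null for relative ids such as "docs/a.xml".
            String scheme = new URI(systemId).getScheme();
            return "file".equalsIgnoreCase(scheme);
        } catch (URISyntaxException e) {
            // Not a well-formed URI; this sketch treats it as "not a file".
            return false;
        }
    }
}
```

A caller such as DirectoryGenerator could then test
SystemIds.isFile(source.getSystemId()) instead of asking the Source
directly.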

The functionality defined in c) is already provided by XMLFragment,
and I see no reason why we shouldn't use this instead. Also we should
make a separation between Sources that contain XML data, and those
that don't.

As for point d), I'm not sure it is good to assume that all Sources
are mutable. I have actually come up with Source objects which are
immutable, and for them the refresh operation has no meaning.

As a result, I propose that Source be the following interface:

public interface Source {
  /*** BTW, why use long and not Date? ***/
  long getLastModified();

  long getContentLength();

  public InputSource getInputSource() throws IOException;

  String getSystemId();

  /*** getInputStream() can be easily implemented as
  getInputSource().getByteStream(). ***/
}
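
To illustrate the comment about getInputStream(), here is a rough
sketch of how it could be layered on top of getInputSource(). The
Sources helper class is a hypothetical name, and the Source interface
is cut down to the single method the sketch needs:

```java
import java.io.IOException;
import java.io.InputStream;
import org.xml.sax.InputSource;

// The proposed interface, reduced to the one method this sketch needs.
interface Source {
    InputSource getInputSource() throws IOException;
}

// Hypothetical helper, not part of Cocoon.
final class Sources {
    private Sources() {}

    static InputStream getInputStream(Source source) throws IOException {
        InputStream in = source.getInputSource().getByteStream();
        if (in == null) {
            // An InputSource may carry only a character stream or a system id,
            // in which case getByteStream() returns null.
            throw new IOException("Source does not expose a byte stream");
        }
        return in;
    }
}
```

The null check matters: the shortcut only works for Sources whose
InputSource was actually built from a byte stream.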

Based on this, we can define XMLSource as:

public interface XMLSource extends Source, XMLFragment
{
}

and rename stream(ContentHandler) to toSAX(ContentHandler). The
stream(XMLConsumer) can be implemented based on the toSAX() method
easily.
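
A sketch of that delegation, assuming XMLConsumer is itself a
ContentHandler. The XMLFragment and XMLConsumer interfaces below are
local stand-ins so the example compiles on its own, and Streams and
CountingConsumer are hypothetical names:

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.ext.DefaultHandler2;
import org.xml.sax.ext.LexicalHandler;

// Local stand-ins for the Cocoon types, declared here only for the sketch.
interface XMLFragment {
    void toSAX(ContentHandler handler) throws SAXException;
}

interface XMLConsumer extends ContentHandler, LexicalHandler {}

// Hypothetical adapter: stream(XMLConsumer) expressed through toSAX().
final class Streams {
    private Streams() {}

    static void stream(XMLFragment fragment, XMLConsumer consumer) throws SAXException {
        // An XMLConsumer is a ContentHandler, so it can be passed straight through.
        fragment.toSAX(consumer);
    }
}

// Demo consumer that counts startElement events.
final class CountingConsumer extends DefaultHandler2 implements XMLConsumer {
    int elements;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elements++;
    }
}
```

Lexical events (comments, CDATA sections) are not forwarded by this
sketch, since toSAX() only sees the ContentHandler side of the consumer.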

The URLSource and SitemapSource can then become classes that implement
the XMLSource interface. The refresh method can be placed as a public
method in URLSource and SitemapSource, if it doesn't prove worth
creating a new interface for it. I still need to look into this a
little bit more.

I'm willing to do all the refactoring work if you guys like the
approach. Please let me know so I can finish the work and post a patch
this week; next two weeks I'll be on vacation ;-)

Greetings,
-- 
Ovidiu Predescu <ov...@cup.hp.com>
http://orion.nsr.hp.com/ (inside HP's firewall only)
http://sourceforge.net/users/ovidiu/ (my SourceForge page)
http://www.geocities.com/SiliconValley/Monitor/7464/ (GNU, Emacs, other stuff)

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org


AW: [C2] [2.1-dev] proposed changes to the Source interface

Posted by Carsten Ziegeler <cz...@sundn.de>.
> Ovidiu Predescu wrote:
> 
> Hi,
> 
> I was looking at how the current Source interface is defined, and I
> believe we need to separate things a little bit more. I badly need
> this separation in one of the extensions to Cocoon I'm working on
> (which I hope to present sometime early next month).
> 
Sounds interesting. Tell us more about your extensions!

> There are four distinct things the Source interface deals with right
> now:
> 
> a) the real input source, its last modified date, and content length
> 
> b) determining whether the source is a file, and obtaining the file
> 
> c) streaming the content of the source to a ContentHandler
> 
> d) the ability to refresh a Source
> 
> IMO the Source interface should deal only with a). Source should be an
> abstraction for content, regardless of whether it is a file or whether
> it contains XML data.
Yes, this is right. That was actually the intention of the Source
object. But by the time it was introduced, the Cocoon code used different
ways of getting information from sources, and it was very hard to unify
them into a single Source object without redesigning some major parts.
This is what led to the current implementation.

> 
> A quick grep of the sources shows that the only place that uses the
> file characteristics of Source is DirectoryGenerator. However, a simple
> workaround can be implemented by asking the Source for its system id
> and determining the type of the Source from there.
Again correct, but I think that an isFile() method on the Source object
is more convenient (and faster) than testing whether the system id
starts with the "file" protocol.

> 
> The functionality defined in c) is already provided by XMLFragment,
> and I see no reason why we shouldn't use this instead. Also we should
> make a separation between Sources that contain XML data, and those
> that don't.
Yes and no. It would be good to separate XML from non-XML, but the
reason for the stream() method in the Source object was the idea that
a source object is able to "generate" XML even if the data is not XML
but, e.g., HTML.
We could use the XMLFragment interface here as well, but we would also
have to add a toSAX(XMLConsumer consumer) method.

> 
> As for point d), I'm not sure it is good to assume that all Sources
> are mutable. I have actually come up with Source objects which are
> immutable, and for them the refresh operation has no meaning.
The refresh() method is currently very important for the reloading of
the sitemap and the cocoon.xconf to detect changes. On the other hand,
the refresh() method is also meant as a reset() method, which means
that you can call e.g. getInputStream() more than once. With a URL
connection, for example, this is only possible if you open a new
connection before getting an input stream for the second time, so
refresh() is useful here.
I agree that refresh() might not be the right name for it.
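
The reset behaviour described above could look roughly like this. It
is only a sketch: RefreshableUrlSource is a hypothetical class loosely
modelled on the URLSource under discussion, not the actual Cocoon code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical class, not the real URLSource.
final class RefreshableUrlSource {
    private final URL url;
    private URLConnection connection;

    RefreshableUrlSource(URL url) {
        this.url = url;
    }

    // Returns the stream of the current connection, opening one on first use.
    InputStream getInputStream() throws IOException {
        if (connection == null) {
            connection = url.openConnection();
        }
        return connection.getInputStream();
    }

    // Drops the current connection so the next getInputStream() starts over
    // with a fresh connection and a stream positioned at the beginning.
    void refresh() {
        connection = null;
    }
}
```

In this sketch refresh() acts purely as a reset; detecting whether the
underlying resource has changed would be a separate concern.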

> 
> As a result, I propose that Source be the following interface:
> 
> public interface Source {
>   /*** BTW, why use long and not Date? ***/
>   long getLastModified();
> 
>   long getContentLength();
> 
>   public InputSource getInputSource() throws IOException;
> 
>   String getSystemId();
> 
>   /*** getInputStream() can be easily implemented as
>   getInputSource().getByteStream(). ***/
> }
> 
> Based on this, we can define XMLSource as:
> 
> public interface XMLSource extends Source, XMLFragment
> {
> }
> 
> and rename stream(ContentHandler) to toSAX(ContentHandler). The
> stream(XMLConsumer) can be implemented based on the toSAX() method
> easily.
> 
> The URLSource and SitemapSource can then become classes that implement
> the XMLSource interface. The refresh method can be placed as a public
> method in URLSource and SitemapSource, if it doesn't prove worth
> creating a new interface for it. I still need to look into this a
> little bit more.
> 
The first problem I see here is: who decides whether an object is an
XMLSource or simply a Source? In my opinion, a more convenient approach
is that everything is first a Source object and you can ask it to give
you its XML representation. The Source object converts itself into XML
if required.
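
As a rough sketch of this alternative, every Source could expose
toSAX() and decide for itself how to produce SAX events. The
PlainTextSource class and the <text> wrapper element are made up for
this example, and the Source interface is cut down to two methods:

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical single interface: every source can be asked for SAX events.
interface Source {
    String getSystemId();

    void toSAX(ContentHandler handler) throws SAXException;
}

// Hypothetical non-XML source that converts itself on request by wrapping
// its text in a <text> element (the element name is invented for this sketch).
final class PlainTextSource implements Source {
    private final String systemId;
    private final String text;

    PlainTextSource(String systemId, String text) {
        this.systemId = systemId;
        this.text = text;
    }

    @Override
    public String getSystemId() {
        return systemId;
    }

    @Override
    public void toSAX(ContentHandler handler) throws SAXException {
        handler.startDocument();
        handler.startElement("", "text", "text", new AttributesImpl());
        handler.characters(text.toCharArray(), 0, text.length());
        handler.endElement("", "text", "text");
        handler.endDocument();
    }
}
```

With this shape the caller never has to know up front whether a
Source holds XML; the cost is that every Source needs some XML
conversion, even a trivial one.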

> I'm willing to do all the refactoring work if you guys like the
> approach. Please let me know so I can finish the work and post a patch
> this week; next two weeks I'll be on vacation ;-)
> 
Hm, I am not quite sure. These are indeed some good and fresh ideas
on dealing with sources. But as we have beta 2 of Cocoon 2.0 out,
I think we shouldn't make such radical changes any more.
We could discuss this for one of the next versions (2.1/3.0).

Any other thoughts or comments?


Carsten 

Open Source Group                        sunShine - b:Integrated
================================================================
Carsten Ziegeler, S&N AG, Klingenderstrasse 5, D-33100 Paderborn
www.sundn.de                          mailto: cziegeler@sundn.de 
================================================================
