Posted to dev@commons.apache.org by "Daniel F. Savarese" <df...@savarese.org> on 2004/01/10 22:28:15 UTC

Re: [net] Solved VMS duplicates problem and simplified system

In message <20...@javactivity.org>, steve cohen writes:
>2. In order to implement one, a removeDuplicates() method was added to 
>FTPFileEntryParser.  This is a no-op in the normal case but implemented in 
>VMSVersioningFTPEntryParser.

I don't like this idea because it is application-specific
(i.e., concerned only with the VMS case).  If we're going to put
hooks like that into FTPFileEntryParser, then a generic approach
would be to have pre-parse and post-parse methods.  That said,
my createFTPFileList addition doesn't belong in FTPFileEntryParser.
Still, I think FTPFileList is overloaded and the parsing driver
(FTPFileList.readStream()) needs to be extracted into a separate
interface/class in a way that allows the API user to make
customizations without us having to address application-specific
scenarios.
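
To make the pre-parse/post-parse idea concrete, here is a rough sketch.
It is not what is in CVS: EntryParserHooks, ListingDriver and
PassThroughParser are made-up names, and entries are plain Strings
rather than FTPFile objects to keep it short.  It only shows the shape
of a parsing driver that lives outside the list class and calls generic
hooks instead of a VMS-specific one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical interface; names are illustrative, not the Commons Net API.
interface EntryParserHooks {
    // Called once with all raw listing lines before any line is parsed.
    List<String> preParse(List<String> rawLines);

    // Parses one raw line; returns null if the line is not an entry.
    String parseEntry(String rawLine);

    // Called once with all parsed entries after parsing is complete.
    List<String> postParse(List<String> entries);
}

// Hypothetical driver, kept separate from the class that holds the results.
final class ListingDriver {
    static List<String> run(List<String> rawLines, EntryParserHooks parser) {
        List<String> prepared = parser.preParse(new ArrayList<>(rawLines));
        List<String> entries = new ArrayList<>();
        for (String line : prepared) {
            String entry = parser.parseEntry(line);
            if (entry != null) {
                entries.add(entry);
            }
        }
        return parser.postParse(entries);
    }
}

// The normal case: both hooks are no-ops.
class PassThroughParser implements EntryParserHooks {
    public List<String> preParse(List<String> rawLines) { return rawLines; }

    public String parseEntry(String rawLine) {
        String trimmed = rawLine.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    public List<String> postParse(List<String> entries) { return entries; }
}

class HookDemo {
    public static void main(String[] args) {
        List<String> raw = Arrays.asList(" a.txt ", "", "b.txt");
        System.out.println(ListingDriver.run(raw, new PassThroughParser()));
    }
}
```

A VMS-style parser would then override only preParse (or postParse)
and the driver itself would not need to know anything about VMS.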

>4.  I renamed FTPFileListParserImpl to FTPFileEntryParserImpl to reduce the 
>confusion level since FTPFileListParser is now deprecated and going away in 
>2.0.  For the time being, though, for backward compatibility it still 
>implements FTPFileListParser.

Good change, but we need to keep FTPFileListParserImpl for backward
compatibility according to Commons release/versioning rules.  We
can enhance the API, but we can't remove APIs until a major release
(i.e., 2.0).  So deprecating FTPFileListParserImpl (and changing it
to extend FTPFileEntryParserImpl) should suffice.
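
As an illustration of the pattern only: NewParserImpl and OldParserImpl
below are stand-ins for FTPFileEntryParserImpl and
FTPFileListParserImpl, and the parse() method is invented for the
sketch.  The point is that the old name survives as an empty,
deprecated subclass, so existing code keeps compiling until 2.0.

```java
// Stand-in for the renamed class (FTPFileEntryParserImpl in this thread).
class NewParserImpl {
    // Hypothetical method; the real class has a different API.
    public String parse(String line) {
        return line.trim();
    }
}

/**
 * Stand-in for the old class name (FTPFileListParserImpl in this thread).
 *
 * @deprecated use NewParserImpl; kept only until the next major release.
 */
@Deprecated
class OldParserImpl extends NewParserImpl {
    // Intentionally empty: all behavior is inherited from the new class.
}

class DeprecationDemo {
    public static void main(String[] args) {
        // Code written against the old name still works, with a compiler warning.
        NewParserImpl parser = new OldParserImpl();
        System.out.println(parser.parse("  a.txt  "));
    }
}
```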

>Probably FTPFileList and FTPFileListIterator are misnamed.  FTPFileList 
>IS-NOT-A List of FTPFiles.  It HAS a Vector of raw input lines from the 
>listing.  FTPFileListIterator IS-NOT-A simple iterator over FTPFileList.  It 
>DOES iterate over this Vector and DOES return FTPFile objects, so it is more 
>than a simple iterator, it is an iterator on steroids.  

Sure, but the API user doesn't know how they are implemented.  To the user,
they are a list and an iterator (albeit specifically for FTPFile instances).
Right now they are concrete classes, so should there be a case where a
listing format cannot be parsed a line at a time, then there's no way to
extend the implementation to meet the need.  That's where I argue for
making the classes abstract (ideally interfaces, but it may be too late
for that given API migration rules; however abstract doesn't break anything
because they cannot be instantiated directly by a user as it is).  It's
one path toward implementing Jeffrey's suggestion, which I favor, of
allowing the iterator behavior to be altered either through delegation
or subclassing.
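
A sketch of the subclassing route, under some loud assumptions: the
class names are invented, entries are Strings instead of FTPFile
instances, and the "continuation line" format (an entry whose details
wrap onto a following line that starts with whitespace) is just a
convenient example of a listing that cannot be handled one line at a
time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical abstract base: to the user it is just "a list that yields entries".
abstract class AbstractEntryList implements Iterable<String> {
    protected final List<String> rawLines;

    protected AbstractEntryList(List<String> rawLines) {
        this.rawLines = new ArrayList<>(rawLines);
    }
}

// The common case: one raw line per entry.
class LineEntryList extends AbstractEntryList {
    LineEntryList(List<String> rawLines) { super(rawLines); }

    @Override
    public Iterator<String> iterator() {
        return rawLines.iterator();
    }
}

// A format where a line starting with whitespace continues the previous entry.
class ContinuationEntryList extends AbstractEntryList {
    ContinuationEntryList(List<String> rawLines) { super(rawLines); }

    @Override
    public Iterator<String> iterator() {
        List<String> joined = new ArrayList<>();
        for (String line : rawLines) {
            boolean continuation =
                    !line.isEmpty() && Character.isWhitespace(line.charAt(0));
            if (continuation && !joined.isEmpty()) {
                int last = joined.size() - 1;
                joined.set(last, joined.get(last) + " " + line.trim());
            } else {
                joined.add(line);
            }
        }
        return joined.iterator();
    }
}

class EntryListDemo {
    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "LONGNAME.TXT;1", "   12 10-JAN-2004", "B.TXT;1 3 10-JAN-2004");
        for (String entry : new ContinuationEntryList(raw)) {
            System.out.println(entry);
        }
    }
}
```

The same effect could be had through delegation by handing the list a
strategy object; the abstract class is just the smaller change if the
existing classes cannot become interfaces.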

>Remember that there are two scenarios supported here:
>a) read the whole list at once as in FTPClient.listFiles() and 
>b) read the list in but defer creation of more expensive FTPFile objects until
>needed.  This scenario was broken in the VMS case.

Yes, and it may break again with a new unforeseen case.  How to handle
those unforeseen cases is what I think we need to solve (solving the
VMS case as a byproduct).  I'm not convinced of the best way.  My
changes were just a step in one possible direction.

>The one thing left to do is decide which VMS scenario, versioning or 
>non-versioning is the default, i.e. the one to be supported by autodetection. 
> 
>I asked for someone to chime in on this last week and no one has.  I still 
>don't know.

I don't have enough experience with VMS to know what the default should
be.  The default in the implementation was to filter out duplicates,
so I would suggest sticking with that until we hear otherwise.

daniel



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


Re: [net] Solved VMS duplicates problem and simplified system

Posted by steve cohen <sc...@javactivity.org>.
On Saturday 10 January 2004 03:28 pm, Daniel F. Savarese wrote:
> In message <20...@javactivity.org>, steve cohen writes:
> >2. In order to implement one, a removeDuplicates() method was added to
> >FTPFileEntryParser.  This is a no-op in the normal case but implemented in
> >VMSVersioningFTPEntryParser.
>
> I don't like this idea because it is application-specific
> (i.e., concerned only with the VMS case).  If we're going to put
> hooks like that into FTPFileEntryParser, then a generic approach
> would be to have pre-parse and post-parse methods.  That said,
> my createFTPFileList addition doesn't belong in FTPFileEntryParser.
> Still, I think FTPFileList is overloaded and the parsing driver
> (FTPFileList.readStream()) needs to be extracted into a separate
> interface/class in a way that allows the API user to make
> customizations without us having to address application-specific
> scenarios.

Totally in favor of renaming removeDuplicates() to preParse().  That way more 
generic preParse scenarios can be accommodated without changing the API.  I 
have made this change and also changed the comments.
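
For anyone following along, here is roughly what a duplicate-removing 
preParse step could look like.  This is only an illustration and not the 
committed code: the VmsPreParse class and its helpers are made up, and it 
assumes the file name is the first whitespace-delimited token of each raw 
line with the VMS version number after the last ';'.  It keeps only the 
highest version of each name and preserves listing order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Commons Net.
final class VmsPreParse {
    // First whitespace-delimited token of the line (assumed to be the file name).
    private static String firstToken(String line) {
        return line.trim().split("\\s+")[0];
    }

    // Name without the ";version" suffix.
    static String baseNameOf(String line) {
        String name = firstToken(line);
        int semi = name.lastIndexOf(';');
        return semi < 0 ? name : name.substring(0, semi);
    }

    // Version after the last ';', or -1 if it is absent or not a number.
    static int versionOf(String line) {
        String name = firstToken(line);
        int semi = name.lastIndexOf(';');
        if (semi < 0) {
            return -1;
        }
        try {
            return Integer.parseInt(name.substring(semi + 1));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Keeps only the lines carrying the highest version of each base name.
    static List<String> preParse(List<String> rawLines) {
        Map<String, Integer> highest = new HashMap<>();
        for (String line : rawLines) {
            highest.merge(baseNameOf(line), versionOf(line), Math::max);
        }
        List<String> kept = new ArrayList<>();
        for (String line : rawLines) {
            if (versionOf(line) == highest.get(baseNameOf(line))) {
                kept.add(line);
            }
        }
        return kept;
    }
}

class VmsPreParseDemo {
    public static void main(String[] args) {
        List<String> raw = Arrays.asList(
                "A.TXT;1 1 10-JAN-2004", "A.TXT;3 1 10-JAN-2004",
                "B.TXT;2 4 10-JAN-2004", "A.TXT;2 1 10-JAN-2004");
        for (String line : VmsPreParse.preParse(raw)) {
            System.out.println(line);
        }
    }
}
```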

>
> >4.  I renamed FTPFileListParserImpl to FTPFileEntryParserImpl to reduce
> > the confusion level since FTPFileListParser is now deprecated and going
> > away in 2.0.  For the time being, though, for backward compatibility it
> > still implements FTPFileListParser.
>
> Good change, but we need to keep FTPFileListParserImpl for backward
> compatibility according to Commons release/versioning rules.  We
> can enhance the API, but we can't remove APIs until a major release
> (i.e., 2.0).  So deprecating FTPFileListParserImpl (and changing it
> to extend FTPFileEntryParserImpl) should suffice.

Oops!  CVS doesn't have "undelete", does it?  :-( Sorry for not 
knowing/understanding this rule.  Do we add it back in (losing history) or is 
there some way to put that back as well?  What is the best way to proceed?

>
> >Probably FTPFileList and FTPFileListIterator are misnamed.  FTPFileList
> >IS-NOT-A List of FTPFiles.  It HAS a Vector of raw input lines from the
> >listing.  FTPFileListIterator IS-NOT-A simple iterator over FTPFileList. 
> > It DOES iterate over this Vector and DOES return FTPFile objects, so it
> > is more than a simple iterator, it is an iterator on steroids.
>
> Sure, but the API user doesn't know how they are implemented.  To the user,
> they are a list and an iterator (albeit specifically for FTPFile
> instances). Right now they are concrete classes, so should there be a case
> where a listing format cannot be parsed a line at a time, then there's no
> way to extend the implementation to meet the need.  That's where I argue
> for making the classes abstract (ideally interfaces, but it may be too late
> for that given API migration rules; however abstract doesn't break anything
> because they cannot be instantiated directly by a user as it is).  It's one
> path toward implementing Jeffrey's suggestion, which I favor, of allowing
> the iterator behavior to be altered either through delegation or
> subclassing.

But I think these names confused you.  At least that's the way it looked to 
me.  The iteration that was needed was over the Vector of raw input in 
FTPFileList.  When I saw you specializing FTPFileIterator, which walks raw 
input but returns "cooked" equivalents, I knew you had gone down a wrong 
path.  I'm not against abstracting some of these classes.  It's probably a 
good idea and once we have it, further beneficial refactorings will no doubt 
suggest themselves, but I think that is for 2.0.
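
To spell out the raw-versus-cooked distinction, a small sketch.  
CookingIterator is an invented name and T stands in for FTPFile; the only 
point is that the iterator walks raw lines but hands back parsed objects, 
and that nothing is parsed until next() is called.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical iterator: walks raw listing lines, returns "cooked" objects.
final class CookingIterator<T> implements Iterator<T> {
    private final Iterator<String> raw;
    private final Function<String, T> cook;
    private int parsed = 0; // how many raw lines have been parsed so far

    CookingIterator(List<String> rawLines, Function<String, T> cook) {
        this.raw = rawLines.iterator();
        this.cook = cook;
    }

    @Override
    public boolean hasNext() {
        return raw.hasNext();
    }

    @Override
    public T next() {
        String line = raw.next(); // throws NoSuchElementException when exhausted
        parsed++;
        return cook.apply(line);  // the expensive object is built only here
    }

    int parsedCount() {
        return parsed;
    }
}

class CookingIteratorDemo {
    public static void main(String[] args) {
        List<String> raw = Arrays.asList("a.txt", "bb.txt", "ccc.txt");
        // String::length stands in for a real line-to-FTPFile parser.
        CookingIterator<Integer> it = new CookingIterator<>(raw, String::length);
        System.out.println(it.next());
        System.out.println(it.parsedCount());
    }
}
```

Anything that must look at the whole listing at once (such as dropping 
older VMS versions) has to happen on the raw lines before an iterator like 
this starts handing out objects.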

>
> >Remember that there are two scenarios supported here:
> >a) read the whole list at once as in FTPClient.listFiles() and
> >b) read the list in but defer creation of more expensive FTPFile objects
> > until needed.  This scenario was broken in the VMS case.
>
> Yes, and it may break again with a new unforeseen case.  How to handle
> those unforeseen cases is what I think we need to solve (solving the
> VMS case as a byproduct).  I'm not convinced of the best way.  My
> changes were just a step in one possible direction.
>
> >The one thing left to do is decide which VMS scenario, versioning or
> >non-versioning is the default, i.e. the one to be supported by
> > autodetection.
> >
> >I asked for someone to chime in on this last week and no one has.  I still
> >don't know.
>
> I don't have enough experience with VMS to know what the default should
> be.  The default in the implementation was to filter out duplicates,
> so I would suggest sticking with that until we hear otherwise.

Status as of this moment was the opposite, so I changed that too.
>
> daniel

