Posted to dev@commons.apache.org by Steve Cohen <sc...@javactivity.org> on 2004/09/25 17:19:55 UTC
[NET] Designing a Date Format-aware FTP Entry Parser
Designing a Date Format-aware FTP Entry Parser
After having percolated on the back burner for several years as an unresolved
issue, there is finally some momentum toward solving the problem of parsing
FTP entries from servers which format the file timestamps in the directory
listings in a format other than the NetComponents “standard”.
In order to understand what must be done, it would be helpful to understand
what we now do. In brief, we are using a regular expression to achieve
basically the same results as attempting to parse the date portion of the
listing with one of two alternate java.text.SimpleDateFormats in the en_US
locale:
1. MMM dd HH:mm for dates within one year of the current time
2. MMM dd yyyy for dates older than one year.
Additionally, these formats presume some timezone, which is either the local
timezone of the server or GMT, I presume.
The alternative mechanism that I am proposing would remove the parsing of the
timestamp from the responsibilities of the regular expression and unload this
onto some other object.
But what object? The obvious candidate would be java.text.DateFormat. This
abstract class allows a formatter object to be created on the basis of some
formatting codes defined in DateFormat (“LONG, MEDIUM, SHORT”) and a Locale.
But this is problematic because what is meant by MEDIUM in en_US is a string
like “Sep 25, 2004” while in “de_DE”, you get a string like “25.09.2004”.
This just won't do. So we have to fall back on java.text.SimpleDateFormat,
passing in both a specific formatting string and a Locale, which provides the
month names, etc. (By the way, has anyone ever noticed that SimpleDateFormat
is actually less simple than DateFormat?) :-)
The regular expression would merely extract from the listing the entire
timestamp portion and delegate the task of parsing it to a pair of
SimpleDateFormat objects (one for less than 1 year old and the other for one
year old or older), each constructed on the basis of a format string and a
locale. Since the Locale should be the same for both formats, we would
require the user to provide the two format Strings, and the Locale (or
possibly the constituent elements of the locale, the country code and
language code). We want an object that encapsulates all of that, say,
org.apache.commons.net.ftp.parser.FTPDateFormat.
So each parser would have a settable member of this class. An FTPDateFormat
would be constructed from two format strings and a Locale, and possibly a
timezone as well. We would probably have to provide some default
FTPDateFormat objects for some of the common locales.
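A minimal sketch of what such an object might look like, assuming the two-pattern design described above. The class name FTPDateFormatSketch, its constructor arguments and the defaultEnUs() helper are hypothetical and not part of commons-net. The recent pattern is tried first because SimpleDateFormat.parse(String) ignores trailing text, so the year pattern would otherwise accept "Sep 25 17:19" by reading 17 as the year.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical holder for the two timestamp formats of one server type. */
final class FTPDateFormatSketch {
    private final SimpleDateFormat recentFormat;
    private final SimpleDateFormat olderFormat;

    FTPDateFormatSketch(String recentPattern, String olderPattern,
                        Locale locale, TimeZone zone) {
        // Both formats share the same Locale (month names) and time zone.
        recentFormat = new SimpleDateFormat(recentPattern, locale);
        olderFormat = new SimpleDateFormat(olderPattern, locale);
        recentFormat.setTimeZone(zone);
        olderFormat.setTimeZone(zone);
    }

    /** Assumed default: the en_US patterns discussed in this thread. */
    static FTPDateFormatSketch defaultEnUs() {
        return new FTPDateFormatSketch("MMM d HH:mm", "MMM d yyyy",
                Locale.US, TimeZone.getDefault());
    }

    /** Tries the recent pattern first, then falls back to the older one. */
    Date parse(String timestamp) throws ParseException {
        try {
            return recentFormat.parse(timestamp);
        } catch (ParseException notRecent) {
            return olderFormat.parse(timestamp);
        }
    }
}
```

A recent-style timestamp carries no year, so the parsed Date lands in 1970 and the caller would still have to supply the year. SimpleDateFormat is also not thread-safe, so an object like this should not be shared between threads without synchronization.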
One consequence of this is that we would start making heavier use of the
FTPFileEntryParserFactory objects. We might want to start thinking about
deemphasizing but not deprecating the use of FTPClient.listFiles() which is
simple but makes too many assumptions. There are already four or five
different overloads of this method, and adding several more parameters
into the mix will make this completely unworkable. Instead, going through
the factory would become the more common, more documented and recommended
approach. This would be the preferred method of accessing commons-net ftp
for clients such as Ant and VFS. Users who are happily using listFiles() in
its current form in their custom apps built directly from commons-net could
continue to do so.
Well, these are some preliminary thoughts. Let's hear from the other
developers of this project.
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>If you think that I meant for the user to pass in FTPDateFormat objects, you
>misunderstood me. The paradigm I want to use here is passing in strings.
>
>
No, I got it. But maybe we could do it the way it is done for the
FTPFileEntryParser: try to determine whether the user passed a FQCN, and
otherwise treat it as a date/locale string.
>For the Ant client
>task it is much easier to assemble the strings and construct the needed
>objects ourselves, than it is to make the user do it.
>
>
This is why I created in VFS an
o.a.c.v.u.DelegatingFileSystemOptionsBuilder (again a long name ;-).
I didn't want to maintain two methods for each possible configuration
setting - one with a string as parameter and another with the real class
type.
This class is responsible for taking a configuration key name and its value
as a string.
It tries to look up the targeted configuration method (by the key name),
parses its method parameters and tries to find a way to convert the
string to the desired type.
This is done by looking up a
*) constructor with only one String as parameter
*) static valueOf(String) method
on the targeted object.
I think this is a nice glue between configuration-by-string (as in Ant)
and configuration-by-code, where I would like to have compile-time checks.
But for sure, this might go too far just for the date/locale setting.
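To make the lookup order concrete, here is a rough sketch of that conversion step in isolation. The class and method names are made up for illustration and this is not the actual VFS code; it only assumes the two conventions listed above (a public constructor taking one String, then a public static valueOf(String)).

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical helper: converts a configuration string to a target type. */
final class StringConverter {
    private StringConverter() {
    }

    static <T> T convert(String value, Class<T> type)
            throws ReflectiveOperationException {
        if (type == String.class) {
            return type.cast(value);
        }
        try {
            // First choice: a public constructor with a single String parameter.
            Constructor<T> ctor = type.getConstructor(String.class);
            return ctor.newInstance(value);
        } catch (NoSuchMethodException noStringConstructor) {
            // Second choice: a public static valueOf(String) method.
            Method valueOf = type.getMethod("valueOf", String.class);
            if (!Modifier.isStatic(valueOf.getModifiers())) {
                throw new NoSuchMethodException(
                        type.getName() + ".valueOf(String) is not static");
            }
            return type.cast(valueOf.invoke(null, value));
        }
    }
}
```

With this sketch, StringConverter.convert("1.5", java.math.BigDecimal.class) goes through the constructor, while an enum such as java.util.concurrent.TimeUnit falls through to valueOf(String).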
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Tuesday 28 September 2004 7:34 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >All this setting would go on as setters on a factory class that the user
> > would not have to use. If they didn't setLocale, en_US would be the
> > default. If they setLocale but not either date recent or older date
> > format, then the standard US ordering would be used but the Locale month
> > names. If they specified Locale and older date format, we could infer
> > the newer date format as well. And if they specified everything, we
> > could handle that case too.
>
> At least, could you please implement this by passing in e.g. an
> FTPDateObject, as you stated in one of your previous mails?
>
> This should have a method like
> Date FTPDateObject.parse(String datepart)
> or something similar.
>
> That way one is able to pass in a completely different sort of date
> parser - like the one I have in mind - which is able to automatically
> determine the right month without having to set any locale (as long as the
> date parts are in the correct order)
>
> ---
> Mario
If you think that I meant for the user to pass in FTPDateFormat objects, you
misunderstood me. The paradigm I want to use here is passing in strings.
The FTPDateFormat object was just a way of organizing thoughts. It is an
object that the user would rarely if ever see. I have learned from earlier
ideas of "passing in a parser" that the more complex the object you are
trying to pass in, the more difficult it is for the user. For the Ant client
task it is much easier to assemble the strings and construct the needed
objects ourselves, than it is to make the user do it.
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>All this setting would go on as setters on a factory class that the user would
>not have to use. If they didn't setLocale, en_US would be the default. If
>they setLocale but not either date recent or older date format, then the
>standard US ordering would be used but the Locale month names. If they
>specified Locale and older date format, we could infer the newer date format
>as well. And if they specified everything, we could handle that case too.
>
>
At least, could you please implement this by passing in e.g. an
FTPDateObject, as you stated in one of your previous mails?
This should have a method like
Date FTPDateObject.parse(String datepart)
or something similar.
That way one is able to pass in a completely different sort of date
parser - like the one I have in mind - which is able to automatically
determine the right month without having to set any locale (as long as the
date parts are in the correct order)
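A small sketch of the kind of hook being asked for here, under the assumption that it is a one-method interface. The names FtpTimestampParser and SimpleFormatTimestampParser are hypothetical; nothing in the thread fixes them.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/** Hypothetical strategy interface: anything that can turn the date part into a Date. */
interface FtpTimestampParser {
    Date parse(String datePart) throws ParseException;
}

/** One possible implementation, backed by a single SimpleDateFormat. */
final class SimpleFormatTimestampParser implements FtpTimestampParser {
    private final SimpleDateFormat format;

    SimpleFormatTimestampParser(String pattern, Locale locale) {
        this.format = new SimpleDateFormat(pattern, locale);
    }

    @Override
    public Date parse(String datePart) throws ParseException {
        return format.parse(datePart);
    }
}
```

An entry parser holding a field of the interface type could then accept either a string-configured SimpleDateFormat wrapper or a completely different strategy, such as an auto-detecting one.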
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen schrieb:
>I guess I don't have a problem with making a composite parser, which you could
>make the default for VFS if it works, but I don't think it can be the default
>for NetComponents itself.
>
re composite parser: You can stick this work on me as soon as the
framework has materialized.
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Thursday 30 September 2004 7:09 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >This business of constantly churning does bother me.
>
> I hope I don't leave too negative an impression on you.
I didn't mean to leave such an impression.
>
> If we find an agreement it could be possible with the way you build the
> framework - it is good enough for me.
I guess I don't have a problem with making a composite parser, which you could
make the default for VFS if it works, but I don't think it can be the default
for NetComponents itself. It's too radical a step, and even if it works
exactly as planned, it will impose some performance penalty on non-VFS users.
I think you ought to wait until the basic framework is completed, and I think
it would be good to accommodate your use case with the proper hooks.
But again, the basic functionality should not change.
RE: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Rory Winston <rw...@eircom.net>.
I'm also in favour of the general behaviour staying exactly the same as it is now, except for the (few) edge cases that cause us problems. Using something like setShortMonthNames() as Steve mentioned earlier sounds reasonable enough to me to catch Locale-related language issues, and the date formatting we can allow the user to specify exactly, but *only* when they know they have a problem and need to use that functionality.
-----Original Message-----
From: Mario Ivankovits [mailto:mario@ops.co.at]
Sent: 30 September 2004 13:09
To: Jakarta Commons Developers List
Subject: Re: [NET] Designing a Date Format-aware FTP Entry Parser
Steve Cohen wrote:
>This business of constantly churning does bother me.
>
I hope I don't leave too negative an impression on you.
If we find an agreement it could be possible with the way you build the
framework - it is good enough for me.
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>This business of constantly churning does bother me.
>
I hope I don't leave too negative an impression on you.
If we find an agreement it could be possible with the way you build the
framework - it is good enough for me.
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Thursday 30 September 2004 5:39 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >First parser to successfully parse every
> >item in the listing I suppose. Or do you go for best score?
>
> line-by-line - the first parser which is able to parse should be cached
> (for performance) - that way it might only be slower on the first match.
> However, the parser should be prepared to redetect the language as soon
> as it fails at a later time - maybe there
> are minor differences between languages and the first detection wasn't
> correct -
> e.g. Mar (March) might not be unique if you talk to a German FTP server
> which does not use umlauts (März => Mar)
This business of constantly churning does bother me.
>
> >What if none of
> >the parsers in your composite works? Then what?
>
> Like now - a "null" entry in the list of returned entries.
> Or we change the paradigm NET uses today and throw an exception - but
> this is worth a thread on its own ;-)
>
> >2) Will we be opening ourselves to arguments as to which languages are "in"
> >the composite? Or in which order? If you're using Italian and it has to try
> >US English, British English, French and German first, your performance is
> >going to be lousy. Which brings me to
>
> Is there a difference between US and British?
The original complaint which got this started, about AIX comes from Britain.
http://issues.apache.org/bugzilla/show_bug.cgi?id=27437
I believe the month names will be the same, but not necessarily the order. In
Linux, I found that the month-day order was preserved regardless of locale.
(although my test was with French, not en_UK). In that defect there is an
example about AIX where there was a difference between en_US and en_UK.
>
> Performance: As i said - we could cache the last matching language -
> then only the first search might be slow.
>
> Such a composite might only fail if one has to use the Croatian and Polish
> languages at once. There the names "lis" and "lip" mean different
> months (at least from the "java short names" point of view).
So you are saying that between these two languages, that the same
abbreviations in one language refer to different months in another? That is
a real problem for autodetection. I guess you could say it's not affecting
the most common languages. But it doesn't make me happy.
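The clash can be checked mechanically. The sketch below is only an illustration: the class name is invented, and the two month tables are assumed three-letter abbreviations for Croatian and Polish, not values taken from any particular FTP server or JDK. Non-ASCII letters are written as Unicode escapes so the source compiles under any encoding.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical check: which abbreviations name different months in different tables? */
final class MonthAbbreviationConflicts {
    // Assumed abbreviations, January to December.
    static final String[] CROATIAN = {
        "sij", "velj", "o\u017eu", "tra", "svi", "lip",
        "srp", "kol", "ruj", "lis", "stu", "pro"
    };
    static final String[] POLISH = {
        "sty", "lut", "mar", "kwi", "maj", "cze",
        "lip", "sie", "wrz", "pa\u017a", "lis", "gru"
    };

    private MonthAbbreviationConflicts() {
    }

    /** Returns each abbreviation that maps to more than one month number (1-12). */
    static Map<String, Set<Integer>> conflicts(String[]... tables) {
        Map<String, Set<Integer>> seen = new TreeMap<>();
        for (String[] table : tables) {
            for (int i = 0; i < table.length; i++) {
                seen.computeIfAbsent(table[i], k -> new TreeSet<>()).add(i + 1);
            }
        }
        // Keep only abbreviations that are ambiguous across the tables.
        seen.values().removeIf(months -> months.size() < 2);
        return seen;
    }
}
```

With these assumed tables, "lip" is month 6 in one language and 7 in the other, and "lis" is 10 versus 11, so an auto-detecting parser cannot decide between them from the month name alone.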
> This is why I am not against your solution at all; the composite parser
> should only be one additional possibility - and IMHO the default parser.
I agree that the composite parser could be a possibility. I disagree
vehemently that it should be the default.
>
> I think this composite could be configurable by a static map (system
> wide). There I would like to configure it
> to detect "US", "DE", "FR" (in this order) and I am fine with 100% of
> all FTP servers I have to contact today.
> In the case of Ant it could be configured by e.g. lang="US,DE,FR"
> or by a system property, .... or .... we could discuss this if we find
> a consensus at all.
>
> And we should also discuss that you don't want to take SYST into account
> - or at least the possibility to do so, but this also depends on which
> file entry parsers you would like to implement the date stuff for. Currently
> I am only aware that the Unices do this language stuff.
>
> >3) This is too much run-time trial and error for my tastes. The average user
> >of our library is not writing the ultimate FTP client. He is writing a java
> >app or Ant script to connect repeatedly to an FTP server somewhere. Once he
> >gets the right parser, he never has need of trying others for that server.
>
> ... or using VFS. And VFS would like to be the super FTP, SSH, ....
> client. It works like a filesystem - the user doesn't want to be bothered with
> things like date styles.
OK, I understand you a little better, you are approaching this from the angle
of VFS. So, you could make your composite parser the default parser USED BY
VFS. In other usages, where our user is simply setting up a little system to
talk to some specific ftp server about which he knows all the details, the
composite parser is a needless performance drain.
>
> For sure, I am not fully against the solution you have in mind, I just
> would like to ensure it is possible to pass
> in a parser which uses a completely different strategy.
> And again: the user does not have to choose a file-entry-parser now - it
> is done automatically by SYST (I know you know ;-)) -
> but now we force him to select the correct date format - today if he
> changes the URL (and an appropriate parser
> is available) the file parsing works without any additional attention.
No, we force him to do nothing. My goal, expressed a few posts back, was that
the system work by default exactly as it does now. The additional
functionalities would only exist to help him out of the odd cases. Changing
the default parser that autodetection provides could provide some real
surprises.
>
> <vision>
> Maybe we would provide a parser with a TreeMap where all month names and
> their numbers are stored - the community could
> help to fill this map - or a properties file which could easily be changed.
> </vision>
>
> >4) On the other hand, your idea could be the basis of a pretty cool tool based
> >on NetComponents: point it at an FTP server somewhere, let it try all the
> >tricks it knows, and somehow it returns its best guess as to what parser and
> >parser date format to use for that server.
>
> That's the point - like the comfort we provide with the automatic
> detection of the needed file-entry-parser.
> Computers should work for humans and not humans for computers ;-)
>
> As I tried to say earlier: today the parsing works pretty well - we
> have problems only with the month
> name (and unknown servers). As long as the date parts are not in
> a different order (based on the language),
> why implement such a drastic change in the comfort NET provides today?
> A black box where the user passes
> in a URL and gets a file listing is what the user really wants.
I think you are proposing a swiss-army-knife. While this could be an
indispensable tool in a few situations, it's an inefficient answer for the
great majority of them. Yes, there should be a swiss army knife, but it
should not be the default.
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>First parser to successfully parse every
>item in the listing I suppose. Or do you go for best score?
line-by-line - the first parser which is able to parse should be cached
(for performance) - that way it might only be slower on the first match.
However, the parser should be prepared to redetect the language as soon
as it fails at a later time - maybe there
are minor differences between languages and the first detection wasn't
correct -
e.g. Mar (March) might not be unique if you talk to a German FTP server
which does not use umlauts (März => Mar)
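A minimal sketch of the caching idea, reduced to month-name lookup. Everything here is hypothetical: the class name, the addLanguage/resolve methods, and whatever month tables the caller registers. The table that matched last is tried first, and a miss triggers redetection across the remaining tables.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical composite: tries several month-name tables, remembering the last hit. */
final class CompositeMonthResolver {
    private final List<List<String>> tables = new ArrayList<>();
    private int cached = 0; // index of the table that matched most recently

    /** Registers one language as twelve short month names, January first. */
    void addLanguage(String... shortMonths) {
        if (shortMonths.length != 12) {
            throw new IllegalArgumentException("expected 12 month names");
        }
        List<String> lower = new ArrayList<>();
        for (String name : shortMonths) {
            lower.add(name.toLowerCase(Locale.ROOT));
        }
        tables.add(lower);
    }

    /** Returns the zero-based month index, or -1 if no registered table knows the name. */
    int resolve(String monthName) {
        String key = monthName.toLowerCase(Locale.ROOT);
        for (int i = 0; i < tables.size(); i++) {
            // Start with the cached table, then wrap around through the others.
            int t = (cached + i) % tables.size();
            int month = tables.get(t).indexOf(key);
            if (month >= 0) {
                cached = t;
                return month;
            }
        }
        return -1;
    }

    /** Index of the table that produced the most recent match. */
    int cachedTable() {
        return cached;
    }
}
```

Only the first lookup against a new server pays for the search. The sketch does nothing about names such as "Mar" that are valid in more than one table; the first table in the search order wins.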
>What if none of
>the parsers in your composite works? Then what?
Like now - a "null" entry in the list of returned entries.
Or we change the paradigm NET uses today and throw an exception - but
this is worth a thread on its own ;-)
>
>2) Will we be opening ourselves to arguments as to which languages are "in"
>the composite? Or in which order? If you're using Italian and it has to try
>US English, British English, French and German first, your performance is
>going to be lousy. Which brings me to
Is there a difference between US and British?
Performance: As I said - we could cache the last matching language -
then only the first search might be slow.
Such a composite might only fail if one has to use the Croatian and Polish
languages at once. There the names "lis" and "lip" mean different
months (at least from the "java short names" point of view).
This is why I am not against your solution at all; the composite parser
should only be one additional possibility - and IMHO the default parser.
I think this composite could be configurable by a static map (system
wide). There I would like to configure it
to detect "US", "DE", "FR" (in this order) and I am fine with 100% of
all FTP servers I have to contact today.
In the case of Ant it could be configured by e.g. lang="US,DE,FR"
or by a system property, .... or .... we could discuss this if we find
a consensus at all.
And we should also discuss that you don't want to take SYST into account
- or at least the possibility to do so, but this also depends on which
file entry parsers you would like to implement the date stuff for. Currently
I am only aware that the Unices do this language stuff.
>3) This is too much run-time trial and error for my tastes. The average user
>of our library is not writing the ultimate FTP client. He is writing a java
>app or Ant script to connect repeatedly to an FTP server somewhere. Once he
>gets the right parser, he never has need of trying others for that server.
... or using VFS. And VFS would like to be the super FTP, SSH, .... client.
It works like a filesystem - the user doesn't want to be bothered with things
like date styles.
For sure, I am not fully against the solution you have in mind, I just
would like to ensure it is possible to pass
in a parser which uses a completely different strategy.
And again: the user does not have to choose a file-entry-parser now - it
is done automatically by SYST (I know you know ;-)) -
but now we force him to select the correct date format - today if he
changes the URL (and an appropriate parser
is available) the file parsing works without any additional attention.
<vision>
Maybe we would provide a parser with a TreeMap where all month names and
their numbers are stored - the community could
help to fill this map - or a properties file which could easily be changed.
</vision>
>4) On the other hand, your idea could be the basis of a pretty cool tool based
>on NetComponents: point it at an FTP server somewhere, let it try all the
>tricks it knows, and somehow it returns its best guess as to what parser and
>parser date format to use for that server.
That's the point - like the comfort we provide with the automatic
detection of the needed file-entry-parser.
Computers should work for humans and not humans for computers ;-)
As I tried to say earlier: today the parsing works pretty well - we
have problems only with the month
name (and unknown servers). As long as the date parts are not in
a different order (based on the language),
why implement such a drastic change in the comfort NET provides today?
A black box where the user passes
in a URL and gets a file listing is what the user really wants.
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Wednesday 29 September 2004 2:27 pm, Mario Ivankovits wrote:
> Steve Cohen wrote:
>
> Maybe you are right, but at least I think it should be possible to
> implement a "CompositeDateFormat".
> This could be a composite of n languages and it tries every (configured
> - by default eg. US, FR, DE) language (and maybe by using
> SimpleDateFormat) until a match is found.
> Same thing we did for the NTFTPFileEntryParser to automatically
> distinguish between NT and UNIX format.
Whew! I'm trying to get my mind around that one! I see problems you would
need to address.
1) What is the meaning of "tries"? First parser to successfully parse every
item in the listing I suppose. Or do you go for best score? What if none of
the parsers in your composite works? Then what?
2) Will we be opening ourselves to arguments as to which languages are "in"
the composite? Or in which order? If you're using Italian and it has to try
US English, British English, French and German first, your performance is
going to be lousy. Which brings me to
3) This is too much run-time trial and error for my tastes. The average user
of our library is not writing the ultimate FTP client. He is writing a java
app or Ant script to connect repeatedly to an FTP server somewhere. Once he
gets the right parser, he never has need of trying others for that server.
4) On the other hand, your idea could be the basis of a pretty cool tool based
on NetComponents: point it at an FTP server somewhere, let it try all the
tricks it knows, and somehow it returns its best guess as to what parser and
parser date format to use for that server.
>
> Maybe you might not want an implementation of CompositeDateFormat in the
> main version of net, but it would be nice if this could be possible.
>
> ---
> Mario
>
>
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>The advantage of 2 is that you still get a Date object after all your pains,
>more easily than you do from rolling your own off a regex. And it's easier
>to use SimpleDateFormat format strings than regular expressions. Finally,
>there is more calendar logic in SimpleDateFormat than in our regular
>expressions. Please note that Feb 30 is a legitimate date
>in our regex system.
>
>
Maybe you are right, but at least I think it should be possible to
implement a "CompositeDateFormat".
This could be a composite of n languages and it tries every (configured
- by default eg. US, FR, DE) language (and maybe by using
SimpleDateFormat) until a match is found.
Same thing we did for the NTFTPFileEntryParser to automatically
distinguish between NT and UNIX format.
Maybe you might not want an implementation of CompositeDateFormat in the
main version of net, but it would be nice if this could be possible.
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Tuesday 28 September 2004 8:02 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >>>>form jui 7
> >>>>rather than
> >>>>7 jui
>
> Steven, we have a problem?
Yes indeed. Thanks! Ouch!
>
> I have tried to parse the date you showed in your ftp-locales test.
>
> SimpleDateFormat sdf = new SimpleDateFormat("MMM dd", new Locale("fr", "FR"));
> Date dt = sdf.parse("jui 7");
>
> "jui 7" is not parseable!!!!!!
> java.text.ParseException: Unparseable date: "jui 7"
> at java.text.DateFormat.parse(DateFormat.java:335)
>
> while "juil." (Java's short French form) is.
>
> SimpleDateFormat sdf = new SimpleDateFormat("MMM dd", new Locale("fr", "FR"));
> Date dt = sdf.parse("juil. 7");
This is what I get for making assumptions in this field (and a strong
cautionary note to those who might still want to try for an automated
auto-detect system).
Noting the strong similarities between "ls" listings and listings from ftp
servers, I assumed that what I see in a unix directory listing created with a
specific unix "LANG" would be the same as what we would see in java, created
with a specific "Locale". Because American unix directory and ftp listings
use the same month abbreviations as those returned by
SimpleDateFormat.getDateFormatSymbols().getShortMonths(), I erroneously
assumed that this must be the case for all LANGs and their equivalent
Locales.
As you so cogently point out, that's not the case! Doh!
Which means, back to the drawing board! Java and its SimpleDateFormats are
not going to help us as much. There is no simple path from a Locale to a
directory listing date format.
So I see two possibilities.
1) Parse the date with a regular expression but make the month names a
settable parameter.
2) Parse the date with a special SimpleDateFormat constructed on the fly:
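// needs java.text.SimpleDateFormat and java.text.DateFormatSymbols
private SimpleDateFormat createDateFormatter(
        String formatString, /* e.g. "MMM dd" */
        String monthNames)   /* e.g. "jan|fév|mar|avr|mai|jun|jui|aoû|sep|oct|nov|déc" */
{
    SimpleDateFormat sdf = new SimpleDateFormat(formatString);
    // getDateFormatSymbols() returns a copy, so change it and set it back.
    DateFormatSymbols symbols = sdf.getDateFormatSymbols();
    // split() takes a regular expression, so the '|' separator must be escaped.
    symbols.setShortMonths(monthNames.split("\\|"));
    sdf.setDateFormatSymbols(symbols);
    /*
    yes, I know that String.split() is java 1.4 specific, this is just for
    simplicity here. Any actual implementation could not use the split() method.
    */
    return sdf;
}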
The advantage of 2 is that you still get a Date object after all your pains,
more easily than you do from rolling your own off a regex. And it's easier
to use SimpleDateFormat format strings than regular expressions. Finally,
there is more calendar logic in SimpleDateFormat than in our regular
expressions. Please note that Feb 30 is a legitimate date
in our regex system.
In either case, for the sake of user convenience we might still want to tie
some preset month-name constants to locales (and possibly system types), even
though Java's implementation does not produce the same symbols natively as
unix directory listings do.
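A runnable sketch of option 2, under a few assumptions: the class name and constant are invented, the month list is the example string from this message, and accented letters are written as Unicode escapes so the source compiles under any encoding. One point worth knowing is that SimpleDateFormat is lenient by default and would roll "Feb 30" forward into March, so the sketch calls setLenient(false) to get the calendar check.

```java
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Locale;

/** Hypothetical factory for a SimpleDateFormat with caller-supplied short month names. */
final class MonthNameDateFormats {
    /** Month abbreviations as seen in the fr_FR listing discussed above (assumed). */
    static final String FRENCH_LISTING_MONTHS =
            "jan|f\u00e9v|mar|avr|mai|jun|jui|ao\u00fb|sep|oct|nov|d\u00e9c";

    private MonthNameDateFormats() {
    }

    static SimpleDateFormat create(String pattern, String monthNames) {
        // Locale.ROOT keeps the platform default locale out of the picture.
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, Locale.ROOT);
        // getDateFormatSymbols() returns a copy, so modify it and set it back.
        DateFormatSymbols symbols = sdf.getDateFormatSymbols();
        symbols.setShortMonths(monthNames.split("\\|"));
        sdf.setDateFormatSymbols(symbols);
        // Reject impossible dates instead of silently rolling them over.
        sdf.setLenient(false);
        return sdf;
    }
}
```

MonthNameDateFormats.create("MMM d", MonthNameDateFormats.FRENCH_LISTING_MONTHS) parses "jui 7" as 7 July, and with leniency switched off a 30th of February is reported as a ParseException.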
>
> ---
> Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>>>>form jui 7
>>>>rather than
>>>>7 jui
>>>>
>>>>
Steven, we have a problem?
I have tried to parse the date you showed in your ftp-locales test.
SimpleDateFormat sdf = new SimpleDateFormat("MMM dd", new Locale("fr", "FR"));
Date dt = sdf.parse("jui 7");
"jui 7" is not parseable!!!!!!
java.text.ParseException: Unparseable date: "jui 7"
at java.text.DateFormat.parse(DateFormat.java:335)
while "juil." (Java's short French form) is.
SimpleDateFormat sdf = new SimpleDateFormat("MMM dd", new Locale("fr", "FR"));
Date dt = sdf.parse("juil. 7");
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Tuesday 28 September 2004 7:11 am, Rory Winston wrote:
> Steve,
>
> This sounds like it could be the way forward. This way, the user doesn't
> have to specify anything extra unless they really need to. The only
> question is, do we generate regexes on the fly, or pull out the entire date
> string? I would be inclined to go for the latter option.
Me too.
>
> -----Original Message-----
> From: Steve Cohen [mailto:scohen@javactivity.org]
> Sent: 28 September 2004 12:24
> To: Jakarta Commons Developers List
> Subject: Re: [NET] Designing a Date Format-aware FTP Entry Parser
>
> On Monday 27 September 2004 7:50 am, Mario Ivankovits wrote:
> > Steve Cohen wrote:
> > >I created a hypothetical French user
> > >named Jacques on my system, gave him "LANG" of "fr_FR", logged in as
> > > him, and got French directory listings, although the dates were of the
> > > form jui 7
> > >rather than
> > >7 jui
> >
> > So it is as I thought - at least for the unix-like FTP servers. The date
> > format isn't truly locale-specific, only the month name is
> > converted. I am not sure if it is worth implementing the whole date stuff
> > just to handle the month name - we could achieve the same by simply
> > providing a static month-name list and a
> > static addMonth(String name, int number) which one can use to add some
> > month names we do not maintain in our default list.
>
> Locale + SimpleDateFormat provides an easier way to do this. A
> SimpleDateFormat is constructed with the Locale as a parameter and then
> SimpleDateFormat.getShortMonthNames() provides a list of month
> abbreviations for that locale.
>
> Another option, though, is NOT to use regular expressions for the date
> parsing at all. Let the regex pull out the entire date portion and then
> parse that with the SimpleDateFormat.
>
> > This is fairly easy to implement.
> > But i dont know what Rory found for NT and therefore i dont know if this
> > might work there too.
> >
> > >It seems to me that we might need no other identifier than Locale. I
> > > would caution once again that we not get this mixed up with SYST. I
> > > would proceed for now as though there is no way to automate this.
> > > Later if we find such a way we can build for it.
> >
> > But you found that the french date wasnt relly printed in its typical
> > manner, maybe another server will do.
> > So it is possible you might end up in two French locale definitions and
> > then the user has to encode this fact into the locale name e.g "fr_FR"
> > and "fr_FR_xyzserver".
> > For sure, in this case my proposed solution might not work too.
> >
> > The question is: Are there server where the date parts are really
> > printed in different order depending on the locale?
>
> That is not what I meant when I said we might need no other IDENTIFIER than
> a locale. That is, if the user supplied "fr_FR" we would construct a
> SimpleDateFormat("MMM dd", new Locale("fr", "FR"))
> and if he supplied "en_US" we would construct thus:
> SimpleDateFormat("MMM dd", new Locale("en", "US"))
>
> That is to say, we would NOT infer date-month-year ordering from the
> locale, at least for the unix-like parsers.
>
> But there would be a way for the user to supply the date format string as
> well as locale so as to get
> SimpleDateFormat("dd MMM", new Locale("fr", "FR"))
> if it is required.
>
> All this setting would go on as setters on a factory class that the user
> would not have to use. If they didn't setLocale, en_US would be the
> default. If they setLocale but not either date recent or older date format,
> then the standard US ordering would be used but the Locale month names. If
> they specified Locale and older date format, we could infer the newer date
> format as well. And if they specified everything, we could handle that
> case too.
>
> > What date problems do have users reported till today?
> > Acutally i only read the aix language problem (and seen your french
> > test).
>
> We seem to see one or two of these a year.
>
> > ---
> > Mario
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-dev-help@jakarta.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org
RE: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Rory Winston <rw...@eircom.net>.
Steve,
This sounds like it could be the way forward. This way, the user doesn't have to specify anything extra unless they really need to. The only question is, do we generate regexes on the fly, or pull out the entire date string? I would be inclined to go for the latter option.
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Monday 27 September 2004 7:50 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >I created a hypothetical French user
> >named Jacques on my system, gave him "LANG" of "fr_FR", logged in as him,
> > and got French directory listings, although the dates were of the form
> > jui 7
> >rather than
> >7 jui
>
> So it is as I thought - at least for the unix-like ftp server. The date
> format isn't truly locale-specific; only the month name is converted.
> I am not sure it is worth implementing the whole date stuff just to
> handle the month name - we could achieve the same by simply providing a
> static month-name list and a
> static addMonth(String name, int number) which one can use to add
> month names we do not maintain in our default list.
Locale + SimpleDateFormat provides an easier way to do this. A
SimpleDateFormat is constructed with the Locale as a parameter and then
SimpleDateFormat.getDateFormatSymbols().getShortMonths() provides a list of
month abbreviations for that locale.
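A small sketch of that lookup, assuming the JDK accessor DateFormatSymbols.getShortMonths(), reached through SimpleDateFormat.getDateFormatSymbols(). The wrapper class is hypothetical.

```java
import java.text.DateFormatSymbols;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Locale;

final class ShortMonthNames {

    /** Returns the month abbreviations that a SimpleDateFormat for this locale uses. */
    static String[] shortMonths(Locale locale) {
        SimpleDateFormat sdf = new SimpleDateFormat("MMM dd", locale);
        DateFormatSymbols symbols = sdf.getDateFormatSymbols();
        // The array is indexed by Calendar.JANUARY .. Calendar.DECEMBER; the JDK
        // may append an extra empty slot for a thirteenth month.
        return symbols.getShortMonths();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shortMonths(Locale.US)));
        System.out.println(Arrays.toString(shortMonths(Locale.FRANCE)));
    }
}
```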
Another option, though, is NOT to use regular expressions for the date parsing
at all. Let the regex pull out the entire date portion and then parse that
with the SimpleDateFormat.
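Roughly, that second option could look like the sketch below. The regular expression is a simplified stand-in for a unix-style listing line, not the expression the existing parsers use, and the class and method names are invented for the example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ListingDateExtractor {

    // Simplified assumption: permissions, link count, owner, group, size,
    // then three whitespace-separated date tokens, then the file name.
    private static final Pattern ENTRY = Pattern.compile(
            "^\\S+\\s+\\d+\\s+\\S+\\s+\\S+\\s+\\d+\\s+(\\S+\\s+\\S+\\s+\\S+)\\s+(.*)$");

    /** Pulls out the whole date portion and collapses runs of whitespace. */
    static String extractDateText(String line) {
        Matcher m = ENTRY.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized listing line: " + line);
        }
        return m.group(1).trim().replaceAll("\\s+", " ");
    }

    /** Hands the extracted date text to a SimpleDateFormat for the given locale. */
    static Date parseDate(String line, String pattern, Locale locale) {
        String dateText = extractDateText(line);
        try {
            return new SimpleDateFormat(pattern, locale).parse(dateText);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + dateText, e);
        }
    }

    public static void main(String[] args) {
        String line = "-rw-r--r--   1 jacques  users   1234 Sep 25  2004 notes.txt";
        System.out.println(extractDateText(line));
        System.out.println(parseDate(line, "MMM dd yyyy", Locale.US));
    }
}
```

The regex no longer needs to know anything about month names or field order inside the date; only the pattern string and Locale change per server.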
> This is fairly easy to implement.
> But I don't know what Rory found for NT, and therefore I don't know if this
> might work there too.
>
> >It seems to me that we might need no other identifier than Locale. I
> > would caution once again that we not get this mixed up with SYST. I
> > would proceed for now as though there is no way to automate this. Later
> > if we find such a way we can build for it.
>
> But you found that the French date wasn't really printed in its typical
> manner; maybe another server will do so.
> So it is possible you might end up with two French locale definitions, and
> then the user has to encode this fact into the locale name, e.g. "fr_FR"
> and "fr_FR_xyzserver".
> For sure, in this case my proposed solution might not work either.
>
> The question is: are there servers where the date parts are really
> printed in a different order depending on the locale?
That is not what I meant when I said we might need no other IDENTIFIER than a
locale. That is, if the user supplied "fr_FR" we would construct a
SimpleDateFormat("MMM dd", new Locale("fr", "FR"))
and if he supplied "en_US" we would construct thus:
SimpleDateFormat("MMM dd", new Locale("en", "US"))
That is to say, we would NOT infer date-month-year ordering from the locale,
at least for the unix-like parsers.
But there would be a way for the user to supply the date format string as well
as locale so as to get
SimpleDateFormat("dd MMM", new Locale("fr", "FR"))
if it is required.
All of these settings would be setters on a factory class that the user would
not have to use. If they didn't call setLocale, en_US would be the default. If
they called setLocale but set neither the recent nor the older date format,
then the standard US ordering would be used, with the Locale's month names. If
they specified the Locale and the older date format, we could infer the recent
date format as well. And if they specified everything, we could handle that
case too.
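A sketch of how those defaulting rules might be coded. Everything here is hypothetical: the class name, the setter names, and in particular the inference step, which simply assumes the recent format is the older format with the year replaced by a time of day.

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

final class DateFormatSettings {
    private Locale locale = Locale.US;      // default when setLocale is never called
    private String recentPattern;           // null means "not specified"
    private String olderPattern;            // null means "not specified"

    void setLocale(Locale locale) { this.locale = locale; }
    void setRecentDateFormat(String pattern) { this.recentPattern = pattern; }
    void setOlderDateFormat(String pattern) { this.olderPattern = pattern; }

    String effectiveOlderPattern() {
        return olderPattern != null ? olderPattern : "MMM dd yyyy";
    }

    String effectiveRecentPattern() {
        if (recentPattern != null) {
            return recentPattern;
        }
        if (olderPattern != null) {
            // Assumed inference rule: same field order, year swapped for HH:mm.
            return olderPattern.replace("yyyy", "HH:mm");
        }
        return "MMM dd HH:mm";
    }

    SimpleDateFormat recentFormat() {
        return new SimpleDateFormat(effectiveRecentPattern(), locale);
    }

    SimpleDateFormat olderFormat() {
        return new SimpleDateFormat(effectiveOlderPattern(), locale);
    }
}
```

A user with an ordinary server would never touch this class; one with a French server printing day-first dates would call setLocale and setOlderDateFormat("dd MMM yyyy") and get "dd MMM HH:mm" for recent entries.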
>
> What date problems have users reported so far?
> Actually I have only read about the AIX language problem (and seen your French test).
We seem to see one or two of these a year.
>
> ---
> Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>I created a hypothetical French user
>named Jacques on my system, gave him "LANG" of "fr_FR", logged in as him, and
>got French directory listings, although the dates were of the form
>jui 7
>rather than
>7 jui
>
>
So it is as I thought - at least for the unix-like ftp server. The date
format isn't truly locale-specific; only the month name is converted.
I am not sure it is worth implementing the whole date stuff just to
handle the month name - we could achieve the same by simply providing a
static month-name list and a
static addMonth(String name, int number) which one can use to add
month names we do not maintain in our default list.
This is fairly easy to implement.
But I don't know what Rory found for NT, and therefore I don't know if this
might work there too.
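A minimal sketch of such a month-name list. Only the addMonth(String name, int number) signature comes from the proposal above; the class name, the lookup method, and the English defaults are assumptions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class MonthNames {
    private static final Map<String, Integer> MONTHS = new HashMap<>();

    static {
        String[] english = {"jan", "feb", "mar", "apr", "may", "jun",
                            "jul", "aug", "sep", "oct", "nov", "dec"};
        for (int i = 0; i < english.length; i++) {
            MONTHS.put(english[i], i + 1);
        }
    }

    /** Registers an extra month name (case-insensitive) for month 1..12. */
    static synchronized void addMonth(String name, int number) {
        if (number < 1 || number > 12) {
            throw new IllegalArgumentException("Month must be 1..12: " + number);
        }
        MONTHS.put(name.toLowerCase(Locale.ROOT), number);
    }

    /** Returns the month number 1..12, or -1 if the name is unknown. */
    static synchronized int monthNumber(String name) {
        Integer number = MONTHS.get(name.toLowerCase(Locale.ROOT));
        return number == null ? -1 : number;
    }
}
```

One caveat this makes visible: a three-letter French abbreviation such as "jui" could stand for either juin or juillet, so whichever number is registered for it is a per-server guess.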
>It seems to me that we might need no other identifier than Locale. I would
>caution once again that we not get this mixed up with SYST. I would proceed
>for now as though there is no way to automate this. Later if we find such a
>way we can build for it.
>
>
But you found that the French date wasn't really printed in its typical
manner; maybe another server will do so.
So it is possible you might end up with two French locale definitions, and
then the user has to encode this fact into the locale name, e.g. "fr_FR"
and "fr_FR_xyzserver".
For sure, in this case my proposed solution might not work either.
The question is: are there servers where the date parts are really
printed in a different order depending on the locale?
What date problems have users reported so far?
Actually I have only read about the AIX language problem (and seen your French test).
---
Mario
RE: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Rory Winston <rw...@eircom.net>.
>But this brings up the possibility that non-anonymous FTP might produce
>different results than anonymous FTP to the same server!
>All of which argues for the user being able to specify all the relevant
>parameters, even though we go to some length to assure that he doesn't often
>have to.
Right - I agree. I also don't think there will be a universal way to automate this.
Re: the Locale issue, would this mean that we would need to provide a different, say, month-parsing regex per Locale?
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Monday 27 September 2004 1:51 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >Where I was sort of heading was a combination of these, since I'm still
> > not sure that server locale implies a particular date format.
>
> You might be right, though I think the combination of server-type/locale
> will be sufficient for a reproducible result.
> At least as long as the ftp-server does not allow the user to define a
> custom date format.
> But you are right if you mean the automatic detection could become a
> complicated task - I think it is possible that some ftp-servers changed
> the date format between versions.
> Maybe - if it comes to automatic detection - we might find we have to
> use the version part of the SYST command too.
>
> >I don't think so, and I'm not sure that anything
> >forces an ftp server to format its listings in the locale-specific way.
>
> I think an ftp-server might use a "posix" format or the server's locale
> format - other strategies would not make much sense.
Well, I was just looking at some of the man pages for ftp on my linux box. I
didn't find what I was looking for (some method of configuring the ftpd
daemon in the ways we are talking about, such as the examples that Rory found
the other day for NT). But I found other things that gave me pause - for
non-anonymous ftp, the daemon looks at the user's shell. Might it perhaps in
some cases look at the user's locale? I created a hypothetical French user
named Jacques on my system, gave him "LANG" of "fr_FR", logged in as him, and
got French directory listings, although the dates were of the form
jui 7
rather than
7 jui
But this brings up the possibility that non-anonymous FTP might produce
different results than anonymous FTP to the same server!
All of which argues for the user being able to specify all the relevant
parameters, even though we go to some length to assure that he doesn't often
have to.
>
> But again - it might not make much sense to spend much effort on this
> now. What if we simply prepare the locale/date structure to be ready for
> such a thing, so we do not have to refactor the API later?
> For now a simple "ident" might be enough - later we could use it for the
> server-type (and/or the server-version from SYST)
It seems to me that we might need no other identifier than Locale. I would
caution once again that we not get this mixed up with SYST. I would proceed
for now as though there is no way to automate this. Later if we find such a
way we can build for it.
<snip>
>
> What if we first try to collect some directory listings?
> Maybe we will see that only the name of the month changes - then we do not
> need to build all this, but can simply extend the parsing of the month name
> to a multi-language style.
Possibly. The trouble with collecting directory listings, though, is that
it's precisely the private ones which you or I don't have access to that will
be the most problematic, and it's precisely these that are probably the
main targets of users who want to code applications using our API. In
the listings you collect, you'll necessarily be limited to public sites
and those few private ones to which you have access.
> The positive effect could be that we do not need to bother the user with this.
>
> ---
> Mario
RE: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Rory Winston <rw...@eircom.net>.
I think that the SimpleDateFormat approach is the way to go. I guess really what we want to achieve is:
- No disruption to the current API semantics (i.e. a user may only have to choose an FTPDateFormat object if they really *have* to, otherwise, things work as normal);
- A mechanism that is pluggable across multiple parser implementations.
I don't think that there is any way around the fact that we may require a user to explicitly enter a date format if they are using a "problematic" system - as Steve has mentioned before, the FTP spec is kind of vague when it comes to these sorts of specifics, and we can't rely on implementation consistency. FTP is one protocol that is pretty "autodetect-unfriendly" :)
I guess what I would like to see is a connection-specific DateFormat, something like:
FTPClient client = new FTPClient();
client.connect(server, FTPDateFormat.getFormat("dd-MM-yyyy"));
or
client.setDateFormat("dd-MM-yyyy");
- something like that. Here's a question: if required, would we only need to ask the user for the "less-than-1-year" date format? i.e. given the less-than-1-year Date format, can we reliably identify the older-than-1-year format from that information?
Inside the **Parser class, we could have:
private FTPDateFormat format;
private static final String REGEX =
"((?:0[1-9])|(?:1[0-2]))-"
+ "((?:0[1-9])|(?:[1-2]\\d)|(?:3[0-1]))-"
+ getFTPDateFormat()
+ "(\\S.*)";
So the FTPDateFormat class maps regexes to date format strings. This would also mean that we could need to parameterize the following code:
String mo = group(1);
String da = group(2);
String yr = group(3);
String hr = group(4);
String min = group(5);
String ampm = group(6);
Perhaps we could hand this off to an FTPDateFormatParser class as well?
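One way such a hand-off class might look, as a sketch only. The class name and the year-inference rule are assumptions: entries without a year are placed in the current year unless that would put them in the future, in which case the previous year is used. Each pattern must consume the whole date string, because SimpleDateFormat.parse(String) would otherwise accept "Sep 25 17:19" under "MMM dd yyyy" by reading 17 as the year and ignoring the rest.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

final class TwoFormatDateParser {
    private final Locale locale;
    private final String recentPattern; // e.g. "MMM dd HH:mm"
    private final String olderPattern;  // e.g. "MMM dd yyyy"

    TwoFormatDateParser(Locale locale, String recentPattern, String olderPattern) {
        this.locale = locale;
        this.recentPattern = recentPattern;
        this.olderPattern = olderPattern;
    }

    /** Returns the parsed timestamp, or null if neither pattern matches. */
    Date parse(String dateText, Date now) {
        Date older = tryParse(olderPattern, dateText);
        if (older != null) {
            return older;
        }
        Calendar cal = Calendar.getInstance();
        cal.setTime(now);
        int year = cal.get(Calendar.YEAR);
        // Recent entries carry no year: try the current year, then the previous
        // one, and keep the first result that is not in the future.
        for (int candidate = year; candidate >= year - 1; candidate--) {
            Date d = tryParse(recentPattern + " yyyy", dateText + " " + candidate);
            if (d != null && !d.after(now)) {
                return d;
            }
        }
        return null;
    }

    /** Parses strictly and only succeeds if the whole text was consumed. */
    private Date tryParse(String pattern, String text) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, locale);
        sdf.setLenient(false);
        ParsePosition pos = new ParsePosition(0);
        Date d = sdf.parse(text, pos);
        return (d != null && pos.getIndex() == text.length()) ? d : null;
    }
}
```

With this shape the entry parser's regex only needs one group for the whole date, and the numbered month, day, year, hour, and minute groups disappear from the parser classes.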
Ideally, in the worst-case scenario, I would only have to pass an extra parameter to the Ant task:
<ftp ..... dateFormat="dd/MM/yyyy">
And it would process the listings accordingly.
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
On Monday 27 September 2004 1:51 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >Where I was sort of heading was a combination of these, since I'm still
> > not sure that server locale implies a particular date format.
>
> You might be right, though I think the combination of server-type/locale
> will be sufficient for a reproducible result.
> At least as long as the ftp-server does not allow the user to define a
> custom date format.
> But you are right if you mean the automatic detection could become a
> complicated task - I think it is possible that some ftp-servers changed
> the date format between versions.
> Maybe - if it comes to automatic detection - we might find we have to
> use the version part of the SYST command too.
>
> >I don't think so, and I'm not sure that anything
> >forces an ftp server to format its listings in the locale-specific way.
>
> I think an ftp-server might use a "posix" format or the server's locale
> format - other strategies would not make much sense.
>
> But again - it might not make much sense to spend much effort on this
> now. What if we simply prepare the locale/date structure to be ready for
> such a thing, so we do not have to refactor the API later?
> For now a simple "ident" might be enough - later we could use it for the
> server-type (and/or the server-version from SYST)
>
> >public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
> > Locale locale,
> > SimpleDateFormat newerThanOneYear,
> > SimpleDateFormat olderThanOneYear)
>
> public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
>
> *String ident,*
>
> Locale locale,
> SimpleDateFormat newerThanOneYear,
> SimpleDateFormat olderThanOneYear)
>
> >public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
> > Locale locale)
>
> public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
>
> *String ident,*
>
> Locale locale)
>
>
> What if we first try to collect some directory listings?
> Maybe we will see that only the name of the month changes - then we do not
> need to build all this, but can simply extend the parsing of the month name
> to a multi-language style.
> The positive effect could be that we do not need to bother the user with this.
>
> ---
> Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>Where I was sort of heading was a combination of these, since I'm still not
>sure that server locale implies a particular date format.
>
You might be right, though I think the combination of server-type/locale
will be sufficient for a reproducible result.
At least as long as the ftp-server does not allow the user to define a
custom date format.
But you are right if you mean the automatic detection could become a
complicated task - I think it is possible that some ftp-servers changed
the date format between versions.
Maybe - if it comes to automatic detection - we might find we have to
use the version part of the SYST command too.
>I don't think so, and I'm not sure that anything
>forces an ftp server to format its listings in the locale-specific way.
>
>
I think an ftp-server might use a "posix" format or the server's locale
format - other strategies would not make much sense.
But again - it might not make much sense to spend much effort on this
now. What if we simply prepare the locale/date structure to be ready for
such a thing, so we do not have to refactor the API later?
For now a simple "ident" might be enough - later we could use it for the
server-type (and/or the server-version from SYST)
>public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
> Locale locale,
> SimpleDateFormat newerThanOneYear,
> SimpleDateFormat olderThanOneYear)
>
>
public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
    *String ident,*
    Locale locale,
    SimpleDateFormat newerThanOneYear,
    SimpleDateFormat olderThanOneYear)
>public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
> Locale locale)
>
>
public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
    *String ident,*
    Locale locale)
What if we first try to collect some directory listings?
Maybe we will see that only the name of the month changes. Then we do not
need to build all this, but can simply extend the parsing of the month name
to handle multiple languages.
The positive effect would be that we do not need to bother the user with this.
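As a rough illustration of the multi-language month idea, the sketch below builds one lookup table from the short month names of several locales using java.text.DateFormatSymbols. The class name MultiLanguageMonths and the choice of locales are assumptions for illustration only. Note that the exact abbreviations depend on the JDK's locale data (some versions return "Dez", others "Dez."), which is why a trailing period is stripped before lookup.

```java
import java.text.DateFormatSymbols;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Commons Net: maps month abbreviations
// from several locales to a zero-based month index (0 = January).
class MultiLanguageMonths {
    private final Map<String, Integer> monthIndex = new HashMap<>();

    MultiLanguageMonths(List<Locale> locales) {
        for (Locale locale : locales) {
            String[] shortMonths = new DateFormatSymbols(locale).getShortMonths();
            for (int i = 0; i < 12; i++) {
                // Earlier locales win if two languages share an abbreviation.
                monthIndex.putIfAbsent(normalize(shortMonths[i]), i);
            }
        }
    }

    // Lower-case and drop a trailing period so "Sep", "sep" and "sep." agree.
    private static String normalize(String name) {
        String s = name.trim().toLowerCase(Locale.ROOT);
        return s.endsWith(".") ? s.substring(0, s.length() - 1) : s;
    }

    /** Returns the zero-based month, or -1 if the token is not recognized. */
    int monthOf(String token) {
        Integer month = monthIndex.get(normalize(token));
        return month == null ? -1 : month;
    }

    public static void main(String[] args) {
        MultiLanguageMonths months =
                new MultiLanguageMonths(Arrays.asList(Locale.US, Locale.GERMAN));
        System.out.println(months.monthOf("Sep"));
        System.out.println(months.monthOf("Dez"));
    }
}
```

This only covers the case where the field order stays the same and just the month name changes; a numeric-only date would still need a different pattern.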
---
Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Steve Cohen <sc...@javactivity.org>.
Where I was sort of heading was a combination of these, since I'm still not
sure that server locale implies a particular date format. It may define
month abbreviations and the ordering of day, month, and year within a date,
but does it define whether a numeric-only date format, as opposed to an
abbreviated month, is used? I don't think so, and I'm not sure that anything
forces an ftp server to format its listings in the locale-specific way.
This brings me back to the FTPDateFormat object, which I see as a place to
resolve all this uncertainty and ambiguity, and a factory to aid in its creation:
public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
    Locale locale,
    SimpleDateFormat newerThanOneYear,
    SimpleDateFormat olderThanOneYear)
I see default static final FTPDateFormat objects for each of the locales, so
that another, simpler factory method could return those:
public static FTPDateFormat FTPDateFormatFactory.createFTPDateFormat(
    Locale locale)
This would return the static object we created as the default for that locale.
This would probably be the preferred way to access it.
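For discussion, here is one way the two factory methods could look in compilable form. This is only a sketch under assumptions: the parse method, the "now" parameter, the choice to try the with-year format first, and the use of the current NetComponents-style patterns as the per-locale default are all mine and not settled design. It uses present-day Java syntax (generics, lambdas).

```java
import java.text.ParseException;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the proposed FTPDateFormat: a pair of formats, one with a
// time and no year (recent entries), one with a year (older entries).
class FTPDateFormat {
    private final Locale locale;
    private final SimpleDateFormat recent;
    private final SimpleDateFormat older;

    FTPDateFormat(Locale locale, SimpleDateFormat recent, SimpleDateFormat older) {
        this.locale = locale;
        // Clone so that later changes by the caller do not affect this object.
        this.recent = (SimpleDateFormat) recent.clone();
        this.older = (SimpleDateFormat) older.clone();
        this.recent.setLenient(false);
        this.older.setLenient(false);
    }

    Locale getLocale() {
        return locale;
    }

    // Accept a result only if the whole text was consumed; otherwise
    // "Sep 25 14:30" would match "MMM dd yyyy" as year 14.
    private static Date parseFully(SimpleDateFormat format, String text) {
        ParsePosition pos = new ParsePosition(0);
        Date date = format.parse(text, pos);
        return (date != null && pos.getIndex() == text.length()) ? date : null;
    }

    // synchronized because SimpleDateFormat is not thread-safe.
    synchronized Date parse(String text, Date now) throws ParseException {
        Date withYear = parseFully(older, text);
        if (withYear != null) {
            return withYear;
        }
        Date noYear = parseFully(recent, text);
        if (noYear == null) {
            throw new ParseException("Unparseable FTP timestamp: " + text, 0);
        }
        // The year defaults to 1970; assume the current year and step back
        // one year if that would put the entry in the future.
        // Known limitation: "Feb 29" fails here because 1970 is not a leap year.
        Calendar nowCal = new GregorianCalendar(recent.getTimeZone());
        nowCal.setTime(now);
        Calendar cal = new GregorianCalendar(recent.getTimeZone());
        cal.setTime(noYear);
        cal.set(Calendar.YEAR, nowCal.get(Calendar.YEAR));
        if (cal.after(nowCal)) {
            cal.add(Calendar.YEAR, -1);
        }
        return cal.getTime();
    }
}

class FTPDateFormatFactory {
    private static final Map<Locale, FTPDateFormat> DEFAULTS = new ConcurrentHashMap<>();

    static FTPDateFormat createFTPDateFormat(Locale locale,
            SimpleDateFormat newerThanOneYear, SimpleDateFormat olderThanOneYear) {
        return new FTPDateFormat(locale, newerThanOneYear, olderThanOneYear);
    }

    // Returns the shared default object for the locale.
    static FTPDateFormat createFTPDateFormat(Locale locale) {
        return DEFAULTS.computeIfAbsent(locale, l -> new FTPDateFormat(l,
                new SimpleDateFormat("MMM dd HH:mm", l),
                new SimpleDateFormat("MMM dd yyyy", l)));
    }

    public static void main(String[] args) throws ParseException {
        FTPDateFormat format = createFTPDateFormat(Locale.US);
        System.out.println(format.parse("Sep 25 2003", new Date()));
    }
}
```

Passing "now" explicitly keeps the year inference testable. Time zone handling is left at the JVM default here, which is one of the open questions raised earlier in the thread.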
I would be very conservative in trying to automate this - I would, in fact,
not do so in the first release. My goal in the first release would be to
implement all this stuff but have all current implementations work as before.
We simply have no idea of the relative prevalence of arrangements in the real
world at this time. But we should devote some effort to defining a way of
accessing this functionality that is as painless for the user as possible.
Then, when complaints come in, we would have something to recommend, which is
better than having nothing to recommend.
On Sunday 26 September 2004 1:51 am, Mario Ivankovits wrote:
> Steve Cohen wrote:
> >and delegate the task of parsing it to a pair of
> >SimpleDateFormat objects (one for less than 1 year old and the other for
> > one year old or older), each constructed on the basis of a format string
> > and a locale.
>
> Sounds good at all, just one additional question: How should the user
> pass in these date parsers?
>
> 1) explicitly set the date parser per connections
> But this might work against the idea behind the default file entry parser.
> The default file entry parser uses some "magic" to decide the real
> parser and hide the pain about the different styles from the user.
> Depending on the result the possible date formats could be known too
> (except for the locale for sure).
> If the user needs to set a real date parser implementation he always has
> to take the result of the DefaultFileEntryParser into account.
>
> This brings me to
> 2) only set a java.util.Locale per connection
> and pick the needed date parser - in combination with the result of SYST
> - out of a date parser pool.
>
> For sure - it should be possible to do 1) but this should not be the
> preferred way.
>
> ---
> Mario
Re: [NET] Designing a Date Format-aware FTP Entry Parser
Posted by Mario Ivankovits <ma...@ops.co.at>.
Steve Cohen wrote:
>and delegate the task of parsing it to a pair of
>SimpleDateFormat objects (one for less than 1 year old and the other for one
>year old or older), each constructed on the basis of a format string and a
>locale.
>
Sounds good overall, just one additional question: how should the user
pass in these date parsers?
1) Explicitly set the date parser per connection.
But this might work against the idea behind the default file entry parser.
The default file entry parser uses some "magic" to decide on the real
parser and hide the pain of the different styles from the user.
Depending on the result, the possible date formats could be known too
(except for the locale, of course).
If the user needs to set a real date parser implementation, he always has
to take the result of the DefaultFileEntryParser into account.
This brings me to
2) Only set a java.util.Locale per connection
and pick the needed date parser, in combination with the result of SYST,
out of a date parser pool.
Of course it should still be possible to do 1), but this should not be the
preferred way.
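A minimal sketch of option 2), assuming a pool that maps a system-type keyword found in the SYST reply to a pair of date patterns and combines them with the locale set on the connection. The class DateParserPool, its method names, and the substring matching are hypothetical; the registered patterns are just the two formats we parse today.

```java
import java.text.SimpleDateFormat;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical date parser pool: pattern pairs are registered per system
// type, and the locale is supplied per connection at lookup time.
class DateParserPool {
    // Insertion order is kept so that earlier registrations take precedence.
    private final Map<String, String[]> patternsBySystem = new LinkedHashMap<>();

    DateParserPool register(String systemKey, String recentPattern, String olderPattern) {
        patternsBySystem.put(systemKey.toUpperCase(Locale.ROOT),
                new String[] {recentPattern, olderPattern});
        return this;
    }

    /** Returns {recent, older} formats for the first key contained in the SYST reply. */
    SimpleDateFormat[] pick(String systReply, Locale locale) {
        String reply = systReply.toUpperCase(Locale.ROOT);
        for (Map.Entry<String, String[]> entry : patternsBySystem.entrySet()) {
            if (reply.contains(entry.getKey())) {
                String[] patterns = entry.getValue();
                // New instances per call, since SimpleDateFormat is not thread-safe.
                return new SimpleDateFormat[] {
                    new SimpleDateFormat(patterns[0], locale),
                    new SimpleDateFormat(patterns[1], locale)
                };
            }
        }
        throw new IllegalArgumentException("No date patterns registered for: " + systReply);
    }

    public static void main(String[] args) {
        DateParserPool pool = new DateParserPool()
                .register("UNIX", "MMM dd HH:mm", "MMM dd yyyy");
        SimpleDateFormat[] pair = pool.pick("UNIX Type: L8", Locale.GERMANY);
        System.out.println(pair[0].toPattern() + " / " + pair[1].toPattern());
    }
}
```

With this shape the user only supplies the Locale; option 1) would remain available by bypassing the pool and handing in explicit formats.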
---
Mario