You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ant.apache.org by Jeff Tulley <JT...@novell.com> on 2002/01/02 23:22:54 UTC

Refactor of PathTokenizer - am I missing something?

in org.apache.tools.ant.PathTokenizer, the following code occurs:

    public PathTokenizer(String path) {
       tokenizer = new StringTokenizer(path, ":;", false);
       dosStyleFilesystem = File.pathSeparatorChar == ';'; 
    }
.<snip>
.
.</snip>
    public String nextToken() throws NoSuchElementException {
        String token = null;
        if (lookahead != null) {
            token = lookahead;
            lookahead = null;
        }
        else {
            token = tokenizer.nextToken().trim();
        }            
            
        if (token.length() == 1 && Character.isLetter(token.charAt(0))
                                && dosStyleFilesystem
                                && tokenizer.hasMoreTokens()) {
            // we are on a dos style system so this path could be a
drive
            // spec. We look at the next token
            String nextToken = tokenizer.nextToken().trim();
            if (nextToken.startsWith("\\") ||
nextToken.startsWith("/")) {
                // we know we are on a DOS style platform and the next
path starts with a
                // slash or backslash, so we know this is a drive spec
                token += ":" + nextToken;
            }
            else {
                // store the token just read for next time
                lookahead = nextToken;
            }
        }
           
        return token;
    }

The "if(token.length() == 1" code snippet in nextToken() contains an
assumption that just does not hold up on NetWare, and causes me grief. 
A NetWare volume name (akin to a drive letter) could be multiple
characters long, and would not cause execution of the lookahead code,
whereas it should in some cases.  There is no way that I know of to
rewrite that if statement.

But, looking at the code, I was wondering if the whole thing could be
simplified down to the following:

public class PathTokenizer {
    /**
     * A tokenizer to break the string up based on the ':' or ';'
separators.
     */
    private StringTokenizer tokenizer;

    public PathTokenizer(String path) {
        // create a tokenizer based on the file separator character,
either ";" (Windows, 
        // NetWare) or ":" (UNIX-based)
        tokenizer = new StringTokenizer(path,
String.valueOf(File.pathSeparatorChar), 
                                        false);
    }

    public boolean hasMoreTokens() {
        return tokenizer.hasMoreTokens();
    }
    
    public String nextToken() throws NoSuchElementException {
        return tokenizer.nextToken().trim();
    }
}


Yes, that is the whole class.  So, I'm thinking - it is so simple that
I must be missing something.  Maybe CYGWIN?  (I know Cygwin would get
the paths correct, but would File.pathSeparatorChar be ":"(colon) or
";"(semi-colon)???)

If the code can be reduced down to these 21 lines what you basically
have is only a class that hides the fact that different path separators
are being used on different OS's.  The users of PathTokenizer could just
as easily have directly used StringTokenizer(path,
String.valueOf(File.pathSeparatorChar), false) themselves, and the code
would still be OS independent.


If I am missing something could somebody please point out what, and we
need to work on how to make the code function correctly when there both
":" and ";" occur in a path, with multiple character drive / volume
names.

If I am not missing something I'll go ahead and submit a patch for
review.

Thanks,

Jeff Tulley  (jtulley@novell.com)
(801)861-5322
Novell, Inc., the leading provider of Net services software.

--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>