Posted to issues@commons.apache.org by "David Mollitor (Jira)" <ji...@apache.org> on 2021/07/13 21:07:00 UTC

[jira] [Updated] (CSV-283) Remove Whitespace Check Determines Delimiter Twice

     [ https://issues.apache.org/jira/browse/CSV-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Mollitor updated CSV-283:
-------------------------------
    Summary: Remove Whitespace Check Determines Delimiter Twice  (was: Whitespace Check Determines Delimiter Twice)

> Remove Whitespace Check Determines Delimiter Twice
> --------------------------------------------------
>
>                 Key: CSV-283
>                 URL: https://issues.apache.org/jira/browse/CSV-283
>             Project: Commons CSV
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Priority: Minor
>
> {code:java|title=Lexer.java}
>     /**
>      * Tests if the given char is a whitespace character.
>      *
>      * @return true if the given char is a whitespace character.
>      * @throws IOException If an I/O error occurs.
>      */
>     boolean isWhitespace(final int ch) throws IOException {
>         return !isDelimiter(ch) && Character.isWhitespace((char) ch);
>     }
>
>     // ... later in Lexer.java, in the loop that follows an encapsulated token:
>                     while (true) {
>                         c = reader.read();
>                         if (isDelimiter(c)) {
>                             token.type = TOKEN;
>                             return token;
>                         }
>                         if (isEndOfFile(c)) {
>                             ...
>                         }
>                         if (readEndOfLine(c)) {
>                             ...
>                         }
>                         if (!isWhitespace(c)) {
>                             // error invalid char between token and next delimiter
>                             throw new IOException("(line " + getCurrentLineNumber() +
>                                     ") invalid char between encapsulated token and delimiter");
>                         }
>                     }
> {code}
> The loop first checks for the delimiter and returns immediately if it finds one. Past that point, the character is known NOT to be a delimiter, so there is no need to re-check it inside {{isWhitespace}}. The delimiter check can be somewhat expensive, since it may involve a look-ahead I/O read.
> Remove the redundant check.
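
To make the cost of the double check concrete, here is a minimal, self-contained sketch. It is not the Commons CSV source: the class {{WhitespaceCheckSketch}}, its counter, and the method names {{isWhitespaceWithDelimiterCheck}}, {{isWhitespaceOnly}}, and {{skipToDelimiter}} are hypothetical, and the delimiter is assumed to be a single character, so no look-ahead read is modelled. It only counts how often the delimiter test runs in each variant of the loop.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical stand-in for the lexer loop; NOT the Commons CSV implementation. */
final class WhitespaceCheckSketch {
    private final Reader reader;
    private final char delimiter;   // assumption: single-character delimiter
    private int delimiterChecks;    // number of isDelimiter calls so far

    public WhitespaceCheckSketch(final String input, final char delimiter) {
        this.reader = new StringReader(input);
        this.delimiter = delimiter;
    }

    private boolean isDelimiter(final int ch) {
        delimiterChecks++;
        return ch == delimiter;
    }

    /** Shape quoted in the issue: tests the delimiter again before the whitespace test. */
    private boolean isWhitespaceWithDelimiterCheck(final int ch) {
        return !isDelimiter(ch) && Character.isWhitespace((char) ch);
    }

    /** Proposed shape: the caller has already ruled out the delimiter. */
    private static boolean isWhitespaceOnly(final int ch) {
        return Character.isWhitespace((char) ch);
    }

    /**
     * Reads up to the next delimiter or end of input and returns how many
     * delimiter checks were performed along the way.
     */
    public int skipToDelimiter(final boolean recheckDelimiter) throws IOException {
        while (true) {
            final int c = reader.read();
            if (isDelimiter(c) || c == -1) {
                return delimiterChecks;
            }
            final boolean whitespace = recheckDelimiter
                    ? isWhitespaceWithDelimiterCheck(c)
                    : isWhitespaceOnly(c);
            if (!whitespace) {
                throw new IOException("invalid char between encapsulated token and delimiter");
            }
        }
    }

    public static void main(final String[] args) throws IOException {
        // Three spaces followed by the delimiter.
        System.out.println(new WhitespaceCheckSketch("   ,", ',').skipToDelimiter(true));  // prints 7
        System.out.println(new WhitespaceCheckSketch("   ,", ',').skipToDelimiter(false)); // prints 4
    }
}
```

For three whitespace characters followed by the delimiter, the re-checking variant performs seven delimiter tests and the other performs four; both accept and reject the same input. In the real lexer each extra test may also trigger a look-ahead read, which this sketch does not simulate.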



--
This message was sent by Atlassian Jira
(v8.3.4#803005)