Posted to issues@commons.apache.org by "Gary D. Gregory (Jira)" <ji...@apache.org> on 2021/07/14 00:19:00 UTC

[jira] [Resolved] (CSV-283) Remove Whitespace Check Determines Delimiter Twice

     [ https://issues.apache.org/jira/browse/CSV-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gary D. Gregory resolved CSV-283.
---------------------------------
    Fix Version/s: 1.9.0
       Resolution: Fixed

[~belugabehr]

Merged PR https://github.com/apache/commons-csv/pull/167, please verify and close.

 

> Remove Whitespace Check Determines Delimiter Twice
> --------------------------------------------------
>
>                 Key: CSV-283
>                 URL: https://issues.apache.org/jira/browse/CSV-283
>             Project: Commons CSV
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Priority: Minor
>             Fix For: 1.9.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java|title=Lexer.java}
>     /**
>      * Tests if the given char is a whitespace character.
>      *
>      * @return true if the given char is a whitespace character.
>      * @throws IOException If an I/O error occurs.
>      */
>     boolean isWhitespace(final int ch) throws IOException {
>         return !isDelimiter(ch) && Character.isWhitespace((char) ch);
>     }
>
>                     while (true) {
>                         c = reader.read();
>                         if (isDelimiter(c)) {
>                             token.type = TOKEN;
>                             return token;
>                         }
>                         if (isEndOfFile(c)) {
>                             ...
>                         }
>                         if (readEndOfLine(c)) {
>                             ...
>                         }
>                         if (!isWhitespace(c)) {
>                             // error invalid char between token and next delimiter
>                             throw new IOException("(line " + getCurrentLineNumber() +
>                                     ") invalid char between encapsulated token and delimiter");
>                         }
>                     }
> {code}
> So the first check is for the delimiter, and the loop returns quickly if it finds one. Past that point, the character is known NOT to be a delimiter, so there is no need to re-check it in {{isWhitespace}}. The delimiter check can be somewhat expensive, given that it may involve a look-ahead IO read.
> Remove the redundancy.
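
The redundancy described above can be sketched in a small standalone class. This is not the actual Lexer.java nor the change merged in the PR: the class name WhitespaceCheckSketch, the method isWhitespaceOnly, the call counter, and the String-based input are all hypothetical, and the delimiter is assumed to be a single character with no look-ahead. The sketch only counts how often the delimiter test runs with and without the repeated check.

```java
public class WhitespaceCheckSketch {

    private final char delimiter;

    /** Hypothetical counter: number of isDelimiter calls in the last scan. */
    private int delimiterChecks;

    public WhitespaceCheckSketch(final char delimiter) {
        this.delimiter = delimiter;
    }

    /** Stand-in for the (potentially expensive) delimiter test. */
    public boolean isDelimiter(final int ch) {
        delimiterChecks++;
        return ch == delimiter;
    }

    /** Shape of the method quoted in the issue: tests the delimiter again. */
    public boolean isWhitespace(final int ch) {
        return !isDelimiter(ch) && Character.isWhitespace((char) ch);
    }

    /** Hypothetical variant for callers that have already ruled out the delimiter. */
    public boolean isWhitespaceOnly(final int ch) {
        return Character.isWhitespace((char) ch);
    }

    /**
     * Scans the characters that follow an encapsulated token until the
     * delimiter or the end of input, and returns how many delimiter
     * tests were performed along the way.
     */
    public int scanToDelimiter(final String input, final boolean redundantCheck) {
        delimiterChecks = 0;
        int pos = 0;
        while (true) {
            // -1 marks the end of input, mirroring Reader.read()
            final int c = pos < input.length() ? input.charAt(pos++) : -1;
            if (isDelimiter(c)) {
                return delimiterChecks;
            }
            if (c == -1) {
                return delimiterChecks;
            }
            // c is known not to be the delimiter at this point
            final boolean ws = redundantCheck ? isWhitespace(c) : isWhitespaceOnly(c);
            if (!ws) {
                throw new IllegalStateException(
                        "invalid char between encapsulated token and delimiter");
            }
        }
    }

    public static void main(final String[] args) {
        final WhitespaceCheckSketch lexer = new WhitespaceCheckSketch(',');
        // Three spaces, then the delimiter.
        System.out.println(lexer.scanToDelimiter("   ,next", true));  // prints 7
        System.out.println(lexer.scanToDelimiter("   ,next", false)); // prints 4
    }
}
```

With the repeated check, every non-delimiter character costs two delimiter tests. Without it, each character costs one. The plain whitespace test is safe at that point in the loop only because the delimiter has already been ruled out, which matters when the delimiter is itself a whitespace character such as a tab.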



--
This message was sent by Atlassian Jira
(v8.3.4#803005)