Posted to issues@commons.apache.org by "Gary D. Gregory (Jira)" <ji...@apache.org> on 2021/07/14 00:19:00 UTC
[jira] [Resolved] (CSV-283) Remove Whitespace Check Determines Delimiter Twice
[ https://issues.apache.org/jira/browse/CSV-283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary D. Gregory resolved CSV-283.
---------------------------------
Fix Version/s: 1.9.0
Resolution: Fixed
[~belugabehr]
Merged PR https://github.com/apache/commons-csv/pull/167, please verify and close.
> Remove Whitespace Check Determines Delimiter Twice
> --------------------------------------------------
>
> Key: CSV-283
> URL: https://issues.apache.org/jira/browse/CSV-283
> Project: Commons CSV
> Issue Type: Improvement
> Reporter: David Mollitor
> Priority: Minor
> Fix For: 1.9.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> {code:java|title=Lexer.java}
> /**
>  * Tests if the given char is a whitespace character.
>  *
>  * @return true if the given char is a whitespace character.
>  * @throws IOException If an I/O error occurs.
>  */
> boolean isWhitespace(final int ch) throws IOException {
>     return !isDelimiter(ch) && Character.isWhitespace((char) ch);
> }
>
> while (true) {
>     c = reader.read();
>     if (isDelimiter(c)) {
>         token.type = TOKEN;
>         return token;
>     }
>     if (isEndOfFile(c)) {
>         ...
>     }
>     if (readEndOfLine(c)) {
>         ...
>     }
>     if (!isWhitespace(c)) {
>         // error invalid char between token and next delimiter
>         throw new IOException("(line " + getCurrentLineNumber() +
>                 ") invalid char between encapsulated token and delimiter");
>     }
> }
> {code}
> The loop first checks for the delimiter and returns immediately if it finds one. Past that point the character is known NOT to be a delimiter, so there is no need to re-check it in {{isWhitespace}}. The delimiter check can be somewhat expensive, since it may involve a look-ahead I/O read.
> Remove the redundancy.
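
For illustration only, a minimal standalone sketch of the redundancy described above. This is not the Commons CSV Lexer and not necessarily how PR 167 implemented the fix: the class WhitespaceCheckSketch, its helper names, the single-character ',' delimiter, and the use of a String instead of a Reader are all assumptions made for the example. It counts how many times the delimiter check runs while skipping trailing whitespace, with and without the re-check inside the whitespace test.

```java
// Hypothetical sketch; names and structure are not taken from Commons CSV.
final class WhitespaceCheckSketch {

    // Assumption: a single-character delimiter, so no look-ahead read is needed here.
    private static final int DELIMITER = ',';

    private int delimiterChecks;

    // Stand-in for the lexer's delimiter test; counts every invocation.
    private boolean isDelimiter(final int ch) {
        delimiterChecks++;
        return ch == DELIMITER;
    }

    // Shape of the check quoted in the issue: tests the delimiter again.
    private boolean isWhitespaceRechecking(final int ch) {
        return !isDelimiter(ch) && Character.isWhitespace((char) ch);
    }

    // Shape of the suggested check: the caller has already ruled out the delimiter.
    private static boolean isWhitespaceOnly(final int ch) {
        return Character.isWhitespace((char) ch);
    }

    // Scans input up to the first delimiter (or the end of the string) and
    // returns how many times isDelimiter ran. End-of-line handling is omitted.
    static int countDelimiterChecks(final String input, final boolean recheck) {
        final WhitespaceCheckSketch sketch = new WhitespaceCheckSketch();
        for (int i = 0; i < input.length(); i++) {
            final int c = input.charAt(i);
            if (sketch.isDelimiter(c)) {
                return sketch.delimiterChecks;
            }
            final boolean whitespace =
                    recheck ? sketch.isWhitespaceRechecking(c) : isWhitespaceOnly(c);
            if (!whitespace) {
                // The real lexer throws IOException; unchecked here to keep the sketch short.
                throw new IllegalArgumentException(
                        "invalid char between encapsulated token and delimiter");
            }
        }
        return sketch.delimiterChecks;
    }

    public static void main(final String[] args) {
        final String trailing = "  \t,"; // three whitespace chars, then the delimiter
        System.out.println(countDelimiterChecks(trailing, true));  // prints 7
        System.out.println(countDelimiterChecks(trailing, false)); // prints 4
    }
}
```

In this sketch the re-check adds one extra delimiter test per skipped whitespace character, which is the cost the issue proposes to drop from the loop.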