Posted to dev@commons.apache.org by "Markus Rogg (JIRA)" <ji...@apache.org> on 2006/07/13 11:05:29 UTC

[jira] Created: (SANDBOX-153) Delimiter should be never recognized as whitespace

Delimiter should be never recognized as whitespace
--------------------------------------------------

         Key: SANDBOX-153
         URL: http://issues.apache.org/jira/browse/SANDBOX-153
     Project: Commons Sandbox
        Type: Bug

  Components: CSV  
    Reporter: Markus Rogg


The CSV parser ignores whitespace at the beginning of a token. If the delimiter is a tab and the data has no encapsulator, the parser loses the empty tokens. The parser should never treat the delimiter as whitespace. A possible solution is to change the method isWhitespace(int) in the class CSVParser:

  private boolean isWhitespace(int c) {
    return Character.isWhitespace((char) c) && (c != strategy.getDelimiter());
  }
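
To illustrate the proposed guard in isolation, here is a minimal standalone sketch. The class name DelimiterGuardSketch and the explicit delimiter parameter are assumptions made for this example; in CSVParser the delimiter would come from strategy.getDelimiter() as in the snippet above.

```java
// Standalone sketch; DelimiterGuardSketch is a hypothetical name, not part of Commons CSV.
final class DelimiterGuardSketch {

    // Mirrors the proposed check: whitespace, unless it is the configured delimiter.
    // The delimiter is passed in here; the real parser would read it from its strategy.
    static boolean isWhitespace(int c, char delimiter) {
        return Character.isWhitespace((char) c) && c != delimiter;
    }

    public static void main(String[] args) {
        // Character.isWhitespace counts a tab as whitespace, which is the root of the problem.
        System.out.println(Character.isWhitespace('\t')); // true
        System.out.println(isWhitespace('\t', '\t'));     // false: tab is the delimiter
        System.out.println(isWhitespace(' ', '\t'));      // true: a space is still whitespace
        System.out.println(isWhitespace('\t', ','));      // true: tab is not the delimiter here
    }
}
```

With a comma delimiter the behaviour is unchanged; only a delimiter that is itself a whitespace character is affected by the extra condition.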

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-dev-help@jakarta.apache.org


[jira] Commented: (SANDBOX-153) Delimiter should be never recognized as whitespace

Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/SANDBOX-153?page=comments#action_12421430 ] 
            
Henri Yandell commented on SANDBOX-153:
---------------------------------------

Sounds valid - would you have time to knock up a JUnit test to show the bug (that the isWhitespace change subsequently fixes)?



[jira] Commented: (SANDBOX-153) Delimiter should be never recognized as whitespace

Posted by "Markus Rogg (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/SANDBOX-153?page=comments#action_12424003 ] 
            
Markus Rogg commented on SANDBOX-153:
-------------------------------------

"Delimiter is whitespace" JUnit test (CSVParserTest)

   public void testDelimiterIsWhitespace() throws IOException {
       String code = "one\ttwo\t\tfour \t five\t  six";
       TestCSVParser parser = new TestCSVParser(new StringReader(code));
       parser.setStrategy(CSVStrategy.TDF_STRATEGY);
       System.out.println("---------\n" + code + "\n-------------");
       assertEquals(CSVParser.TT_TOKEN + ";one;", parser.testNextToken());
       assertEquals(CSVParser.TT_TOKEN + ";two;", parser.testNextToken());
       assertEquals(CSVParser.TT_TOKEN + ";;", parser.testNextToken());
       assertEquals(CSVParser.TT_TOKEN + ";four;", parser.testNextToken());
       assertEquals(CSVParser.TT_TOKEN + ";five;", parser.testNextToken());
       assertEquals(CSVParser.TT_EOF + ";six;", parser.testNextToken());
    }
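
The effect this test checks for can also be reproduced without the Commons CSV classes. The following is a toy tokenizer, not the CSVParser implementation: ToyTokenizerSketch, tokenize and the IntPredicate parameter are hypothetical names chosen for this sketch. It skips leading whitespace before each token, as described in the report, and takes the whitespace check as a parameter so that both variants can be compared.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Toy model of the reported behaviour; not the Commons CSV parser.
final class ToyTokenizerSketch {

    // Splits one line on the delimiter, skipping leading whitespace before each
    // token (as decided by the supplied predicate) and trimming trailing whitespace.
    static List<String> tokenize(String line, char delimiter, IntPredicate isWhitespace) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        int n = line.length();
        while (true) {
            // Skip leading whitespace. If the delimiter passes this check,
            // an empty field is swallowed here.
            while (i < n && isWhitespace.test(line.charAt(i))) {
                i++;
            }
            // Collect characters up to the next delimiter or the end of the line.
            StringBuilder token = new StringBuilder();
            while (i < n && line.charAt(i) != delimiter) {
                token.append(line.charAt(i));
                i++;
            }
            tokens.add(token.toString().trim());
            if (i >= n) {
                break;
            }
            i++; // consume the delimiter
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "one\ttwo\t\tfour \t five\t  six";
        // Unguarded check: the tab delimiter is treated as whitespace.
        System.out.println(tokenize(input, '\t', ch -> Character.isWhitespace((char) ch)));
        // Guarded check: the delimiter is never treated as whitespace.
        System.out.println(tokenize(input, '\t',
                ch -> Character.isWhitespace((char) ch) && ch != '\t'));
    }
}
```

With the test input above and a tab delimiter, the plain Character.isWhitespace check yields five tokens, because the second of the two adjacent tabs is skipped as leading whitespace. With the guarded check the empty third token is preserved and six tokens come back.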



[jira] Resolved: (SANDBOX-153) Delimiter should be never recognized as whitespace

Posted by "Henri Yandell (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/SANDBOX-153?page=all ]

Henri Yandell resolved SANDBOX-153.
-----------------------------------

    Resolution: Fixed

Thanks Markus; the patch and unit test are both applied and will be in tonight's nightly build.

 svn ci -m "Fixing bug reported byu Markus Rogg in #SANDBOX-153. Whitespace was being treated specially when it was not the delimiter. Unit test and patch applied. "
Sending        src/java/org/apache/commons/csv/CSVParser.java
Sending        src/test/org/apache/commons/csv/CSVParserTest.java
Transmitting file data ..
Committed revision 427470.
