You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2022/09/06 14:31:32 UTC

[GitHub] [commons-csv] pedro-w opened a new pull request, #257: [CSV-304] Accessors for header/trailer comments

pedro-w opened a new pull request, #257:
URL: https://github.com/apache/commons-csv/pull/257

   Add accessors for header comments (before the header row)
   and trailer comments (after the last record)
   Also add javadoc and tests
   
   See https://issues.apache.org/jira/browse/CSV-304


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [commons-csv] garydgregory commented on a diff in pull request #257: [CSV-304] Accessors for header/trailer comments

Posted by GitBox <gi...@apache.org>.
garydgregory commented on code in PR #257:
URL: https://github.com/apache/commons-csv/pull/257#discussion_r964979116


##########
src/main/java/org/apache/commons/csv/CSVParser.java:
##########
@@ -480,10 +483,12 @@ private Headers createHeaders() throws IOException {
                 final CSVRecord nextRecord = this.nextRecord();
                 if (nextRecord != null) {
                     headerRecord = nextRecord.values();
+                    headerComment = nextRecord.getComment();
                 }
             } else {
                 if (this.format.getSkipHeaderRecord()) {
-                    this.nextRecord();
+                    final CSVRecord csvRecord = this.nextRecord();

Review Comment:
   NPE waiting to happen?



##########
src/main/java/org/apache/commons/csv/CSVParser.java:
##########
@@ -352,6 +352,9 @@ public static CSVParser parse(final URL url, final Charset charset, final CSVFor
 
         return new CSVParser(new InputStreamReader(url.openStream(), charset), format);
     }
+    private String headerComment;

Review Comment:
   Add a blank line after the close brace.



##########
src/test/java/org/apache/commons/csv/CSVParserTest.java:
##########
@@ -1375,4 +1375,147 @@ private void validateRecordPosition(final String lineSeparator) throws IOExcepti
 
         parser.close();
     }
+    // CSV with no header comments
+    static private final String CSV_INPUT_NO_COMMENT = "A,B"+CRLF+"1,2"+CRLF;
+    // CSV with a header comment
+    static private final String CSV_INPUT_HEADER_COMMENT = "# header comment" + CRLF + "A,B" + CRLF + "1,2" + CRLF;
+    // CSV with a single line header and trailer comment
+    static private final String CSV_INPUT_HEADER_TRAILER_COMMENT = "# header comment" + CRLF + "A,B" + CRLF + "1,2" + CRLF + "# comment";
+    // CSV with a multi-line header and trailer comment
+    static private final String CSV_INPUT_MULTILINE_HEADER_TRAILER_COMMENT = "# multi-line" + CRLF + "# header comment" + CRLF + "A,B" + CRLF + "1,2" + CRLF + "# multi-line" + CRLF + "# comment";
+    // Format with auto-detected header
+    static private final CSVFormat FORMAT_AUTO_HEADER = CSVFormat.Builder.create(CSVFormat.DEFAULT).setCommentMarker('#').setHeader().build();
+    // Format with explicit header
+    static private final CSVFormat FORMAT_EXPLICIT_HEADER = CSVFormat.Builder.create(CSVFormat.DEFAULT)
+            .setSkipHeaderRecord(true)
+            .setCommentMarker('#')
+            .setHeader("A", "B")
+            .build();
+    // Format with explicit header that does not skip the header line
+    CSVFormat FORMAT_EXPLICIT_HEADER_NOSKIP = CSVFormat.Builder.create(CSVFormat.DEFAULT)
+            .setCommentMarker('#')
+            .setHeader("A", "B")
+            .build();
+    @Test
+    public void testGetHeaderComment_NoComment1() throws IOException {
+
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_NO_COMMENT, FORMAT_AUTO_HEADER)) {
+            parser.getRecords();
+            // Expect no header comment
+            assertFalse(parser.hasHeaderComment());
+            assertNull(parser.getHeaderComment());
+        }
+    }
+    @Test
+    public void testGetHeaderComment_HeaderComment1() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_AUTO_HEADER)) {
+            parser.getRecords();
+            // Expect a header comment
+            assertTrue(parser.hasHeaderComment());
+            assertEquals("header comment", parser.getHeaderComment());
+        }
+    }
+    @Test
+    public void testGetHeaderComment_HeaderTrailerComment() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_MULTILINE_HEADER_TRAILER_COMMENT, FORMAT_AUTO_HEADER)) {
+            parser.getRecords();
+            // Expect a header comment
+            assertTrue(parser.hasHeaderComment());
+            assertEquals("multi-line"+LF+"header comment", parser.getHeaderComment());
+        }
+    }
+    @Test
+    public void testGetHeaderComment_NoComment2() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_NO_COMMENT, FORMAT_EXPLICIT_HEADER)) {
+            parser.getRecords();
+            // Expect no header comment
+            assertFalse(parser.hasHeaderComment());
+            assertNull(parser.getHeaderComment());
+        }
+    }
+    @Test
+    public void testGetHeaderComment_HeaderComment2() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER)) {
+            parser.getRecords();
+            // Expect a header comment
+            assertTrue(parser.hasHeaderComment());
+            assertEquals("header comment", parser.getHeaderComment());
+        }
+    }
+    @Test
+    public void testGetHeaderComment_NoComment3() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_NO_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) {
+            parser.getRecords();
+            // Expect no header comment
+            assertFalse(parser.hasHeaderComment());
+            assertNull(parser.getHeaderComment());
+        }
+    }
+    @Test
+    public void testGetHeaderComment_HeaderComment3() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) {
+            parser.getRecords();
+            // Expect no header comment - the text "comment" is attached to the first record
+            assertFalse(parser.hasHeaderComment());
+            assertNull(parser.getHeaderComment());
+        }
+    }
+
+    @Test
+    public void testGetTrailerComment_HeaderComment1() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_AUTO_HEADER)) {
+            parser.getRecords();
+            assertFalse(parser.hasTrailerComment());
+            assertNull(parser.getTrailerComment());
+        }
+    }
+    @Test
+    public void testGetTrailerComment_HeaderTrailerComment1() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_TRAILER_COMMENT, FORMAT_AUTO_HEADER)) {
+            parser.getRecords();
+            assertTrue(parser.hasTrailerComment());
+            assertEquals("comment", parser.getTrailerComment());
+        }
+    }
+    @Test
+    public void testGetTrailerComment_MultilineComment() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_MULTILINE_HEADER_TRAILER_COMMENT, FORMAT_AUTO_HEADER)) {
+            parser.getRecords();
+            assertTrue(parser.hasTrailerComment());
+            assertEquals("multi-line"+LF+"comment", parser.getTrailerComment());
+        }
+    }
+    @Test
+    public void testGetTrailerComment_HeaderComment2() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER)) {
+            parser.getRecords();
+            assertFalse(parser.hasTrailerComment());
+            assertNull(parser.getTrailerComment());
+        }
+    }
+    @Test
+    public void testGetTrailerComment_HeaderTrailerComment2() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_TRAILER_COMMENT, FORMAT_EXPLICIT_HEADER)) {
+            parser.getRecords();
+            assertTrue(parser.hasTrailerComment());
+            assertEquals("comment", parser.getTrailerComment());
+        }
+    }
+    @Test
+    public void testGetTrailerComment_HeaderComment3() throws IOException {
+        try (CSVParser parser = CSVParser.parse(CSV_INPUT_HEADER_COMMENT, FORMAT_EXPLICIT_HEADER_NOSKIP)) {
+            parser.getRecords();
+            assertFalse(parser.hasTrailerComment());
+            assertNull(parser.getTrailerComment());
+        }
+    }
+    @Test

Review Comment:
   Please separate methods with a blank line.



##########
src/main/java/org/apache/commons/csv/CSVParser.java:
##########
@@ -713,6 +764,10 @@ CSVRecord nextRecord() throws IOException {
             case EOF:
                 if (this.reusableToken.isReady) {
                     this.addRecordValue(true);
+                } else {

Review Comment:
   Extra block level is not needed, IOW use "else if" instead of "else { if ... }"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [commons-csv] pedro-w commented on pull request #257: [CSV-304] Accessors for header/trailer comments

Posted by GitBox <gi...@apache.org>.
pedro-w commented on PR #257:
URL: https://github.com/apache/commons-csv/pull/257#issuecomment-1240838646

   > 
   
   
   > Is the intent to allow comments at the start and end of a file or before and after each record?
   
   The former. I'll explain my thinking, *please correct me if I've got it wrong!*
   
       ; Comment 0
       A,B
       ; Comment 1
       1,1
       ; Comment 2
       2,2
       ; Comment 3
   
   The comments attach to the following line so Comments 1 & 2 can currently be extracted from the corresponding `CSVRecord`s and Comment 3 can never be accessed. Comment 0 is a special case. If we ask `CSVParser` to generate the column headings from the first record, it can't be accessed, but otherwise it's attached to "A,B" which is just a normal data record.
   
   My intention was to allow access to Comment 0 via `getHeaderComment` (only in the case of generated column names) and Comment 3 via `getTrailerComment` (always).
   
   
   I'll get onto the other issues too as they're mostly just formatting.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [commons-csv] codecov-commenter commented on pull request #257: [CSV-304] Accessors for header/trailer comments

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #257:
URL: https://github.com/apache/commons-csv/pull/257#issuecomment-1238516193

   # [Codecov](https://codecov.io/gh/apache/commons-csv/pull/257?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#257](https://codecov.io/gh/apache/commons-csv/pull/257?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0414d1e) into [master](https://codecov.io/gh/apache/commons-csv/commit/c51b595e345954a286d59ed4e37d5d6d319cc78a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c51b595) will **increase** coverage by `0.02%`.
   > The diff coverage is `100.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #257      +/-   ##
   ============================================
   + Coverage     96.99%   97.01%   +0.02%     
   - Complexity      529      536       +7     
   ============================================
     Files            11       11              
     Lines          1166     1174       +8     
     Branches        204      205       +1     
   ============================================
   + Hits           1131     1139       +8     
     Misses           23       23              
     Partials         12       12              
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/commons-csv/pull/257?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...rc/main/java/org/apache/commons/csv/CSVParser.java](https://codecov.io/gh/apache/commons-csv/pull/257/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2NvbW1vbnMvY3N2L0NTVlBhcnNlci5qYXZh) | `95.67% <100.00%> (+0.22%)` | :arrow_up: |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [commons-csv] garydgregory merged pull request #257: [CSV-304] Accessors for header/trailer comments

Posted by GitBox <gi...@apache.org>.
garydgregory merged PR #257:
URL: https://github.com/apache/commons-csv/pull/257


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [commons-csv] kinow commented on a diff in pull request #257: [CSV-304] Accessors for header/trailer comments

Posted by GitBox <gi...@apache.org>.
kinow commented on code in PR #257:
URL: https://github.com/apache/commons-csv/pull/257#discussion_r964097158


##########
src/main/java/org/apache/commons/csv/CSVParser.java:
##########
@@ -596,7 +601,49 @@ Map<String, Integer> getHeaderMapRaw() {
     public List<String> getHeaderNames() {
         return Collections.unmodifiableList(headers.headerNames);
     }
-
+    /**
+     * Checks whether this parser has a header comment, false otherwise.
+     * The header comment appears before the header record.
+     * Note that if the parser's format has been given an explicit header
+     * (with {@link CSVFormat.Builder#setHeader(String... )} or another overload)
+     * and the header record is not being skipped
+     * ({@link CSVFormat.Builder#setSkipHeaderRecord} is false) then any initial comments
+     * will be associated with the first record, not the header.
+     *
+     * @return true if this parser has seen a header comment, false otherwise
+     */
+    public boolean hasHeaderComment() {
+        return headerComment != null;
+    }
+    /**
+     * Returns the header comment for this parser, if any.
+     * The header comment appears before the header record.
+     *
+     * @return the header comment for this stream, or null if no comment is available.
+     */
+    public String getHeaderComment() {
+        return headerComment;
+    }
+    /**
+     * Checks whether this parser has seen a trailer comment, false otherwise.
+     * Trailer comments are located between the last record and EOF.
+     * The trailer comments will only be available after the parser has
+     * finished processing this stream.
+     *
+     * @return true if this parser has seen a trailer comment, false otherwise
+     */
+    public boolean hasTrailerComment() {
+        return trailerComment != null;
+    }
+    /**
+     * Returns the trailer comment for this record, if any.
+     * Trailer comments are located between the last record and EOF
+     *
+     * @return the trailer comment for this stream, or null if no comment is available.
+     */
+    public String getTrailerComment() {
+        return trailerComment;
+    }

Review Comment:
   New public methods need a `@since ` tag with the version. The version is the `version-SNAPSHOT` from the `pom.xml`, without the `-SNAPSHOT` suffix (i.e. next release version).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [commons-csv] pedro-w commented on pull request #257: [CSV-304] Accessors for header/trailer comments

Posted by GitBox <gi...@apache.org>.
pedro-w commented on PR #257:
URL: https://github.com/apache/commons-csv/pull/257#issuecomment-1239390837

   I noticed that github is parsing the Javadoc annotation `@since` as a username. I apologize to the owner of that account.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org