You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2020/10/02 20:45:06 UTC

[GitHub] [commons-csv] SethiPandi opened a new pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

SethiPandi opened a new pull request #114:
URL: https://github.com/apache/commons-csv/pull/114


   Instead of only having a boolean `true`/`false` for how duplicate header values are handled, this uses an enum instead.
   
   Previously the only possibilities were:
   * `true`: To allow duplicates.
   * `false`: To disallow duplicates, except empty cells which were allowed to be duplicates.
   
   This pull request makes an enum with three options:
   * `ALLOW_ALL`: To always allow duplicates. (Same as `true` previously.)
   * `ALLOW_EMPTY`: To allow duplicates only if they're empty cells. (Same as `false` previously.)
   * `DISALLOW`: To disallow all duplicates. 
   
   This provides a little more flexibility in the strictness for the parser, and makes what the options do clearer.
   
   Jira Issue: https://issues.apache.org/jira/browse/CSV-264


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/35664297/badge)](https://coveralls.io/builds/35664297)
   
   Coverage increased (+0.02%) to 98.473% when pulling **3268d4e831dd9f98e8aceacdad02d24d739a11e8 on SethiPandi:CSV-264** into **e5fbba3eb326bf6443167df90f32560e66fa102a on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/33933379/badge)](https://coveralls.io/builds/33933379)
   
   Coverage increased (+0.01%) to 98.521% when pulling **9590db2520a3a1e56df0a53acacf0d13e28e4b3a on SethiPandi:CSV-264** into **84aabf0009012652621aeae186e40941cf0a0b5d on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/41235098/badge)](https://coveralls.io/builds/41235098)
   
   Coverage increased (+0.02%) to 98.327% when pulling **1be64a61410300d4951524948d8be87771ec1928 on SethFalco:CSV-264** into **46bae4b99b3151e7bb358a5fcb896be44324282d on apache:master**.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] aherbert commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
aherbert commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702966394


   Thanks for the PR. You have removed some public API methods. This is not allowed and would fail binary compatibility tests. I have enabled these on the CI environments. If you rebase on master the errors should appear in the checks.
   
   You can run:
   ```
   mvn
   ```
   
   on the project to see the full checks on the code. 
   
   The `mvn clirr:check` goal will fail and indicate the public methods that have been removed. These should be supported in addition to the new methods with the new duplicate header names enum.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethiPandi commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethiPandi commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-703626038


   Sorry about that, I rebased master and applied the changes requested.
   I believe this should be good now, though I'll take a peek at the unsuccessful checks now.
   
   I usually use `git pull` and `git merge` to obtain changes, I'm a bit new to using `rebase` however.  
   I'm not 100% sure if I was supposed to use `git push --force-with-lease...`, but the usual `git push...` was throwing warnings, please let me know if that's a problem.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethiPandi commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethiPandi commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-736170787


   Not sure if you guys are interested in merging this PR, but figured I'd fix the merge conflicts that appeared from recent commits.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/36798850/badge)](https://coveralls.io/builds/36798850)
   
   Coverage increased (+0.02%) to 98.469% when pulling **b2e67292fd1042c7eb1196ba2133ba6bb7a1ee11 on SethFalco:CSV-264** into **bf2f8093a49a3432be62e9fdae073e82ac78bd04 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethFalco commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethFalco commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-876841043


   It's worth noting this change doesn't impact how data is resolved internally or how headers are referenced, but just how if duplicate empty headers are allowed. I can peek into the current behavior anyway and try to document it if it's not already.
   
   Meanwhile, I've rebased with master.
   I also squashed my commits to make rebasing a bit easier.
   
   I also updated an exception to no longer recommend using one of the deprecated methods.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethFalco commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethFalco commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-876858602


   I didn't see it documented, figured the easiest way to find out would be to just test it.
   It currently uses the last column with the given name.
   
   > What happens/should happen for each enum value when you call CSVRecord.get(String).
   
   | A | A
   |---|---
   | 1 | 2
   | 3 | 4
   
   ```java
   List<CSVRecord> records = parser.getRecords();
   
   for (CSVRecord record : records)
       System.out.println(record.get("A"));
   
   // 2
   // 4
   ```
   
   > What happens/should happen for each enum value when you call CSVRecord.toMap().
   
   | A | A | B | B
   |---|---|---|---
   | 1 | 2 | 5 | 6
   | 3 | 4 | 7 | 8
   
   ```java
   List<CSVRecord> records = parser.getRecords();
   
   for (CSVRecord record : records) {
       Map<String, String> map = record.toMap();
       System.out.println(map);
   }
   
   // {A=2, B=6}
   // {A=4, B=8}
   ```
   
   I think we could add a docstring on `CSVRecord#get` and `CSVRecord#toMap` to clarify this.
   We could also add something on `CSVFormat#setDuplicateHeaderMode`?
   
   It might be a good idea to add test cases to cover this scenario, if it'll become documented behavior.
   
   Mind if I do this in a separate PR?
   I'd say it counts as a separate change from this one, so it'll keep things tidier.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/33899801/badge)](https://coveralls.io/builds/33899801)
   
   Coverage increased (+0.007%) to 98.513% when pulling **4b62b308bf3b7a04fc20c788f580022099dca6e2 on SethiPandi:CSV-264** into **d467e41a8faa4e3d528d94fc41b7dca8462b1622 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/35672876/badge)](https://coveralls.io/builds/35672876)
   
   Coverage increased (+0.02%) to 98.471% when pulling **a22ace9ef658f7fb5fce66e2613dc88f0eb32488 on SethiPandi:CSV-264** into **7ebb8db97794986ef958a8b9a50d057940b58105 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-743782488


   @SethiPandi 
   Please rebase on master.
   TY!
   Gary
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-874321915


   @SethFalco 
   Please accept my apologieses for the delay in further reviewing this PR.
   
   I feel there is something important missing here:
   
   - What happens/should happen for each enum value when you call `CSVRecord.get(String)`.
   - What happens/should happen for each enum value when you call `CSVRecord.toMap()`.
   
   For example, if duplicate headers are allowed, does the first column win? Should it? Does the last column win? Certainly, it should not be random or undocumented.
   
   Also, would you mind rebasing on master please? I've recently deprecated the `with` methods in favor of a Builder. New settings should be added through the Builder, not `with` methods.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethFalco commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethFalco commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-877230492


   Sorry about the imports, I'm guessing that due to a setting in IntelliJ setting back when I was using that.
   I've rebased with master and done the requested changes. (I hope!)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethFalco edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethFalco edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-876841043


   It's worth noting this change doesn't impact how data is resolved internally or how headers are referenced, but just how if duplicate empty headers are allowed. I can peek into the current behavior anyway and try to document it if it's not already.
   
   Meanwhile, I've rebased with master.
   I also squashed my commits to make rebasing a bit easier.
   
   I also updated an exception to no longer recommend using one of the deprecated methods.
   
   Edit: Sorry, forgot to add @ Deprecated to the getter.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-704605916


   You can force push to your branch all you want; -) we usually do not rewrite history for long lived branches like master and release.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-874321915


   @SethFalco 
   Please accept my apologieses for the delay in further reviewing this PR.
   
   I feel there is something important missing here:
   
   - What happens/should happen for each enum value when you call `CSVRecord.get(String)`.
   - What happens/should happen for each enum value when you call `CSVRecord.toMap()`.
   
   For example, if duplicate headers are allowed, does the first column win? Should it? Does the last column win? Certainly, it should not be random or undocumented.
   
   Also, would you mind rebasing on master please? I've recently deprecated the `with` methods in favor of a Builder. New settings should be added through the Builder, not `with` methods.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/35343192/badge)](https://coveralls.io/builds/35343192)
   
   Coverage increased (+0.02%) to 98.473% when pulling **0cf7e4e04572325bc88a3f8ac6e8b51ac3393d37 on SethiPandi:CSV-264** into **2ac83980ea4630bf92c2bbcf15c6efaa9e71cb3e on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethiPandi commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethiPandi commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-747406922


   Just a heads up that this is rebased with `master` now!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory commented on a change in pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on a change in pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#discussion_r666978587



##########
File path: src/main/java/org/apache/commons/csv/CSVFormat.java
##########
@@ -1439,9 +1455,21 @@ public String format(final Object... values) {
      *
      * @return whether duplicate header names are allowed
      * @since 1.7
+     * @deprecated Use {@link #getDuplicateHeaderMode()}.
      */
+    @Deprecated
     public boolean getAllowDuplicateHeaderNames() {
-        return allowDuplicateHeaderNames;
+        return duplicateHeaderMode == DuplicateHeaderMode.ALLOW_ALL;
+    }
+
+    /**
+     * Returns how duplicate headers are handled.

Review comment:
       "Returns" -> "Gets"
   

##########
File path: src/main/java/org/apache/commons/csv/CSVFormat.java
##########
@@ -1439,9 +1455,21 @@ public String format(final Object... values) {
      *
      * @return whether duplicate header names are allowed
      * @since 1.7
+     * @deprecated Use {@link #getDuplicateHeaderMode()}.
      */
+    @Deprecated
     public boolean getAllowDuplicateHeaderNames() {
-        return allowDuplicateHeaderNames;
+        return duplicateHeaderMode == DuplicateHeaderMode.ALLOW_ALL;
+    }
+
+    /**
+     * Returns how duplicate headers are handled.
+     *
+     * @return if duplicate header values are allowed, allowed conditionally, or disallowed.
+     * @since 1.9

Review comment:
       "1.9" -> "1.9.0"

##########
File path: src/main/java/org/apache/commons/csv/DuplicateHeaderMode.java
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.csv;
+
+/**
+ * Determines how duplicate header fields should be handled
+ * if {@link CSVFormat#withHeader(String...)} is not null.
+ *
+ * @since 1.9
+ */
+public enum DuplicateHeaderMode {
+
+    /**
+     * Allows all duplicate headings.
+     */
+    ALLOW_ALL,
+
+    /**
+     * Allows duplicate headings only if they're empty

Review comment:
       "headings" -> "headers"

##########
File path: src/test/java/org/apache/commons/csv/issues/JiraCsv264Test.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.csv.issues;
+
+import org.apache.commons.csv.CSVFormat;
+import org.apache.commons.csv.DuplicateHeaderMode;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.io.StringReader;
+
+/**
+ * When {@link CSVFormat#withHeader(String...)} is not null;
+ * duplicate headers with empty strings should not be allowed.
+ *
+ * @see <a href="https://issues.apache.org/jira/projects/CSV/issues/CSV-264?filter=allopenissues">Jira Ticker</a>

Review comment:
       Simplify link, use https://issues.apache.org/jira/browse/CSV-264

##########
File path: src/test/java/org/apache/commons/csv/issues/JiraCsv264Test.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.csv.issues;
+
+import org.apache.commons.csv.CSVFormat;
+import org.apache.commons.csv.DuplicateHeaderMode;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.io.StringReader;
+
+/**
+ * When {@link CSVFormat#withHeader(String...)} is not null;
+ * duplicate headers with empty strings should not be allowed.
+ *
+ * @see <a href="https://issues.apache.org/jira/projects/CSV/issues/CSV-264?filter=allopenissues">Jira Ticker</a>
+ */
+public class JiraCsv264Test {
+  
+    private static final String CSV_STRING = "\"\",\"B\",\"\"\n" +
+                                             "\"1\",\"2\",\"3\"\n" +
+                                             "\"4\",\"5\",\"6\"";
+
+    /**
+     * A CSV file with a random gap in the middle.
+     */
+    private static final String CSV_STRING_GAP = "\"A\",\"B\",\"\",\"\",\"E\"\n" +
+                                                 "\"1\",\"2\",\"\",\"\",\"5\"\n" +
+                                                 "\"6\",\"7\",\"\",\"\",\"10\"";
+
+    @Test
+    public void testJiraCsv264() throws IOException {
+        final CSVFormat csvFormat = CSVFormat.DEFAULT
+            .builder()
+            .setHeader()
+            .setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW)
+            .setAllowMissingColumnNames(true)
+            .build();
+
+        try (StringReader reader = new StringReader(CSV_STRING)) {
+            try {
+                csvFormat.parse(reader);
+                Assertions.fail("This shouldn't parse the CSV string successfully.");
+            } catch (IllegalArgumentException ex) {
+                // Test is successful!
+            }
+        }
+    }
+
+    @Test
+    public void testJiraCsv264WithGapAllowEmpty() throws IOException {
+        final CSVFormat csvFormat = CSVFormat.DEFAULT
+            .builder()
+            .setHeader()
+            .setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY)
+            .setAllowMissingColumnNames(true)
+            .build();
+
+        try (StringReader reader = new StringReader(CSV_STRING_GAP)) {
+            csvFormat.parse(reader);
+        }
+    }
+
+    @Test
+    public void testJiraCsv264WithGapDisallow() throws IOException {
+        final CSVFormat csvFormat = CSVFormat.DEFAULT
+            .builder()
+            .setHeader()
+            .setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW)
+            .setAllowMissingColumnNames(true)
+            .build();
+
+        try (StringReader reader = new StringReader(CSV_STRING_GAP)) {
+            try {
+                csvFormat.parse(reader);
+                Assertions.fail("This shouldn't parse the CSV string successfully.");
+            } catch (IllegalArgumentException ex) {

Review comment:
       Please use the JUnit API `Assertions.assertThrows(...)`

##########
File path: src/main/java/org/apache/commons/csv/DuplicateHeaderMode.java
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.csv;
+
+/**
+ * Determines how duplicate header fields should be handled
+ * if {@link CSVFormat#withHeader(String...)} is not null.
+ *
+ * @since 1.9
+ */
+public enum DuplicateHeaderMode {
+
+    /**
+     * Allows all duplicate headings.
+     */
+    ALLOW_ALL,
+
+    /**
+     * Allows duplicate headings only if they're empty
+     * strings or null.
+     */
+    ALLOW_EMPTY,
+
+    /**
+     * Disallows duplicate headings entirely.

Review comment:
       "headings" -> "headers"

##########
File path: src/main/java/org/apache/commons/csv/DuplicateHeaderMode.java
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.csv;
+
+/**
+ * Determines how duplicate header fields should be handled
+ * if {@link CSVFormat#withHeader(String...)} is not null.
+ *
+ * @since 1.9
+ */
+public enum DuplicateHeaderMode {
+
+    /**
+     * Allows all duplicate headings.

Review comment:
       "headings" -> "headers"

##########
File path: src/test/java/org/apache/commons/csv/issues/JiraCsv264Test.java
##########
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.commons.csv.issues;
+
+import org.apache.commons.csv.CSVFormat;
+import org.apache.commons.csv.DuplicateHeaderMode;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.io.IOException;
+import java.io.StringReader;
+
+/**
+ * When {@link CSVFormat#withHeader(String...)} is not null;
+ * duplicate headers with empty strings should not be allowed.
+ *
+ * @see <a href="https://issues.apache.org/jira/projects/CSV/issues/CSV-264?filter=allopenissues">Jira Ticker</a>
+ */
+public class JiraCsv264Test {
+  
+    private static final String CSV_STRING = "\"\",\"B\",\"\"\n" +
+                                             "\"1\",\"2\",\"3\"\n" +
+                                             "\"4\",\"5\",\"6\"";
+
+    /**
+     * A CSV file with a random gap in the middle.
+     */
+    private static final String CSV_STRING_GAP = "\"A\",\"B\",\"\",\"\",\"E\"\n" +
+                                                 "\"1\",\"2\",\"\",\"\",\"5\"\n" +
+                                                 "\"6\",\"7\",\"\",\"\",\"10\"";
+
+    @Test
+    public void testJiraCsv264() throws IOException {
+        final CSVFormat csvFormat = CSVFormat.DEFAULT
+            .builder()
+            .setHeader()
+            .setDuplicateHeaderMode(DuplicateHeaderMode.DISALLOW)
+            .setAllowMissingColumnNames(true)
+            .build();
+
+        try (StringReader reader = new StringReader(CSV_STRING)) {
+            try {
+                csvFormat.parse(reader);
+                Assertions.fail("This shouldn't parse the CSV string successfully.");
+            } catch (IllegalArgumentException ex) {

Review comment:
       Please use the JUnit API `Assertions.assertThrows(...)`

##########
File path: src/main/java/org/apache/commons/csv/CSVParser.java
##########
@@ -17,8 +17,6 @@
 
 package org.apache.commons.csv;
 
-import static org.apache.commons.csv.Token.Type.TOKEN;

Review comment:
       Leave the order as is, please.
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethFalco edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethFalco edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-876841043


   It's worth noting this change doesn't impact how data is resolved internally or how headers are referenced, but just how if duplicate empty headers are allowed. I can peek into the current behavior anyway and try to document it if it's not already.
   
   Meanwhile, I've rebased with master.
   I also squashed my commits to make rebasing a bit easier.
   
   I also updated an exception to no longer recommend using one of the deprecated methods.
   
   Edit: Sorry, forgot to add \@Deprecated to the getter.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-877275446


   @SethFalco 
   TY, the PR looks good to me. I'd like feedback from the community as well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory merged pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory merged pull request #114:
URL: https://github.com/apache/commons-csv/pull/114


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] coveralls edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-702969045


   
   [![Coverage Status](https://coveralls.io/builds/33932397/badge)](https://coveralls.io/builds/33932397)
   
   Coverage decreased (-0.08%) to 98.422% when pulling **6d33ab6c0f293ae6f31e52dd3205ceb241944b15 on SethiPandi:CSV-264** into **84aabf0009012652621aeae186e40941cf0a0b5d on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] garydgregory edited a comment on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
garydgregory edited a comment on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-874321915


   @SethFalco 
   Please accept my apologies for the delay in further reviewing this PR.
   
   I feel there is something important missing here:
   
   - What happens/should happen for each enum value when you call `CSVRecord.get(String)`.
   - What happens/should happen for each enum value when you call `CSVRecord.toMap()`.
   
   For example, if duplicate headers are allowed, does the first column win? Should it? Does the last column win? Certainly, it should not be random or undocumented.
   
   Also, would you mind rebasing on master please? I've recently deprecated the `with` methods in favor of a Builder. New settings should be added through the Builder, not `with` methods.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-csv] SethFalco commented on pull request #114: CSV-264: Added DuplicateHeaderMode for flexibility with header strictness.

Posted by GitBox <gi...@apache.org>.
SethFalco commented on pull request #114:
URL: https://github.com/apache/commons-csv/pull/114#issuecomment-1001226987


   Just rebased this to resolve merge conflicts.
   
   > I'd like feedback from the community as well.
   
   Seems no one wants to comment on this. ^-^' 
   
   Any chance this could be merged soon, or still waiting on feedback? (Note, I'm in no rush.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@commons.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org