Posted to dev@carbondata.apache.org by lion-x <gi...@git.apache.org> on 2016/09/01 12:05:44 UTC

[GitHub] incubator-carbondata pull request #118: add comment option

GitHub user lion-x opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/118

    add comment option

    # Why raise this pr?
    
    Add a comment option to the CSV parser, and fix a bug when passing quotechar.
    
    # How to solve this?
    
    By default, # is the comment character; users can specify a different one with the COMMENTCHAR option.
    
    # How to test
    
    Pass the existing test cases.
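
For readers unfamiliar with the option, the following is a minimal sketch of what a configurable comment character means for CSV input. It is not the code in this pull request: the class `CommentLineFilter` and its methods are hypothetical, and it assumes that a line counts as a comment only when its first character is the comment character. The actual parser may apply different rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of CarbonData: drops CSV lines that begin with a comment character. */
public class CommentLineFilter {
    /** Default comment character, as described in the pull request. */
    public static final char DEFAULT_COMMENT_CHAR = '#';

    /** Assumption: a line is a comment only if its first character is the comment character. */
    public static boolean isComment(String line, char commentChar) {
        return !line.isEmpty() && line.charAt(0) == commentChar;
    }

    /** Returns the non-comment lines in their original order. */
    public static List<String> dropComments(List<String> lines, char commentChar) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!isComment(line, commentChar)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("imei,age", "#skipped by default", "?kept by default");
        // With the default character, the '#' line is dropped.
        System.out.println(dropComments(lines, DEFAULT_COMMENT_CHAR));
        // With '?' as the comment character, the '#' line is ordinary data.
        System.out.println(dropComments(lines, '?'));
    }
}
```

In this sketch, overriding the comment character makes lines starting with # regular data again, which is the kind of behavior a COMMENTCHAR override is meant to allow.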

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lion-x/incubator-carbondata commentcharOption

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/118.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #118
    
----
commit 4c91b30a512dc3ea9cf66ae1d4d6a744cfa71dd5
Author: lion-x <xl...@gmail.com>
Date:   2016-09-01T11:55:43Z

    add comment option

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #118: [CARBONDATA-201]add comment option

Posted by asfgit <gi...@git.apache.org>.

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/118



[GitHub] incubator-carbondata pull request #118: [CARBONDATA-201]add comment option

Posted by QiangCai <gi...@git.apache.org>.

Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/118#discussion_r77563339
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/csvload/DataGraphExecuter.java ---
    @@ -288,6 +288,8 @@ private void processHadoopFileInputMeta(List<String> measureColumns, List<StepMe
                 ((CsvInputMeta) step.getStepMetaInterface()).setDelimiter(model.getCsvDelimiter());
                 ((CsvInputMeta) step.getStepMetaInterface())
                   .setEscapeCharacter(model.getEscapeCharacter());
    +            ((CsvInputMeta) step.getStepMetaInterface())
    +              .setCommentCharacter(model.getEscapeCharacter());
    --- End diff --
    
    Please check: the comment character is being set from model.getEscapeCharacter().



[GitHub] incubator-carbondata pull request #118: [CARBONDATA-201]add comment option

Posted by lion-x <gi...@git.apache.org>.

Github user lion-x commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/118#discussion_r77618003
  
    --- Diff: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithHiveSyntax.scala ---
    @@ -590,6 +590,23 @@ class TestLoadDataWithHiveSyntax extends QueryTest with BeforeAndAfterAll {
         checkAnswer(sql("select * from carbontable1"), sql("select * from hivetable1"))
       }
     
    +  test("test data loading with comment option") {
    +    sql("drop table if exists comment_test")
    +    sql(
    +      "create table comment_test(imei string, age int, task bigint, num double, level decimal(10," +
    +        "3), productdate timestamp, mark int, name string) STORED BY 'org.apache.carbondata.format'"
    +    )
    +    sql(
    +      "LOAD DATA local inpath './src/test/resources/comment.csv' INTO TABLE comment_test " +
    +        "options('DELIMITER' = ',', 'QUOTECHAR' = '.', 'COMMENTCHAR' = '?','FILEHEADER'='imei,age,task,num,level,productdate,mark,name')"
    +    )
    +    checkAnswer(sql("select imei from comment_test"),Seq(Row("\"huawei"),Row("#huawei"), Row(""),
    +      Row("~huawei")))
    +    sql("drop table if exists comment_test")
    --- End diff --
    
    ok



[GitHub] incubator-carbondata pull request #118: [CARBONDATA-201]add comment option

Posted by QiangCai <gi...@git.apache.org>.

Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/118#discussion_r77563796
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/csvload/DataGraphExecuter.java ---
    @@ -288,6 +288,8 @@ private void processHadoopFileInputMeta(List<String> measureColumns, List<StepMe
                 ((CsvInputMeta) step.getStepMetaInterface()).setDelimiter(model.getCsvDelimiter());
                 ((CsvInputMeta) step.getStepMetaInterface())
                   .setEscapeCharacter(model.getEscapeCharacter());
    +            ((CsvInputMeta) step.getStepMetaInterface())
    +              .setCommentCharacter(model.getCommentCharacter());
    --- End diff --
    
    The method "processHadoopFileInputMeta" is useless.



[GitHub] incubator-carbondata pull request #118: [CARBONDATA-201]add comment option

Posted by QiangCai <gi...@git.apache.org>.

Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/118#discussion_r77563572
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/csvreaderstep/CsvInputMeta.java ---
    @@ -162,6 +168,8 @@ private void readData(Node stepnode) throws KettleXMLException {
           blocksID = XMLHandler.getTagValue(stepnode, "blocksID");
           partitionID = XMLHandler.getTagValue(stepnode, "partitionID");
           escapeCharacter = XMLHandler.getTagValue(stepnode, "escapeCharacter");
    +      quoteCharacter = XMLHandler.getTagValue(stepnode, "quoteCharacter");
    +      commentCharacter = XMLHandler.getTagValue(stepnode, "commentCharacter");
    --- End diff --
    
    Please keep the same sequence: escapeCharacter -> quoteCharacter -> commentCharacter




[GitHub] incubator-carbondata pull request #118: [CARBONDATA-201]add comment option

Posted by QiangCai <gi...@git.apache.org>.

Github user QiangCai commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/118#discussion_r77617020
  
    --- Diff: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithHiveSyntax.scala ---
    @@ -590,6 +590,23 @@ class TestLoadDataWithHiveSyntax extends QueryTest with BeforeAndAfterAll {
         checkAnswer(sql("select * from carbontable1"), sql("select * from hivetable1"))
       }
     
    +  test("test data loading with comment option") {
    +    sql("drop table if exists comment_test")
    +    sql(
    +      "create table comment_test(imei string, age int, task bigint, num double, level decimal(10," +
    +        "3), productdate timestamp, mark int, name string) STORED BY 'org.apache.carbondata.format'"
    +    )
    +    sql(
    +      "LOAD DATA local inpath './src/test/resources/comment.csv' INTO TABLE comment_test " +
    +        "options('DELIMITER' = ',', 'QUOTECHAR' = '.', 'COMMENTCHAR' = '?','FILEHEADER'='imei,age,task,num,level,productdate,mark,name')"
    +    )
    +    checkAnswer(sql("select imei from comment_test"),Seq(Row("\"huawei"),Row("#huawei"), Row(""),
    +      Row("~huawei")))
    +    sql("drop table if exists comment_test")
    --- End diff --
    
    Better to drop the table in the afterAll method.

