You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by kunal642 <gi...@git.apache.org> on 2017/04/15 08:26:29 UTC

[GitHub] incubator-carbondata pull request #802: [CARBONDATA-926] Changed max columns...

GitHub user kunal642 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/802

    [CARBONDATA-926] Changed max columns from static value to configurable value in Load DDL

    Changes done:-
    
    1. Earlier the value for max columns that can be read by the csv parser was hard coded.
    Now the value would be read from the query options and if no option is provided then the default value of 0 would be taken.
    
    2. Added query id in compaction flow for driver statistics.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kunal642/incubator-carbondata CARBONDATA-926

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/802.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #802
    
----
commit cf854888f2914bec2239b594deb9edefd72c123e
Author: kunal642 <ku...@knoldus.in>
Date:   2017-04-15T08:18:17Z

    changed max columns from static value to configurable

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #802: [CARBONDATA-926] Changed max columns...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/802


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #802: [CARBONDATA-926] Changed max columns...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/802#discussion_r111665089
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/csvload/CSVInputFormat.java ---
    @@ -220,8 +241,16 @@ private CsvParserSettings extractCsvParserSettings(Configuration job) {
           parserSettings.setIgnoreTrailingWhitespaces(false);
           parserSettings.setSkipEmptyLines(false);
           parserSettings.setMaxCharsPerColumn(100000);
    -      // TODO get from csv file.
    -      parserSettings.setMaxColumns(1000);
    +      String maxColumns = job.get(MAX_COLUMNS);
    +      int maxColumnsInt = 0;
    --- End diff --
    
    default value should be taken and same should be logged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1643/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #802: [CARBONDATA-926] Changed max columns...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/802#discussion_r111665083
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/csvload/CSVInputFormat.java ---
    @@ -231,6 +260,31 @@ private CsvParserSettings extractCsvParserSettings(Configuration job) {
           return parserSettings;
         }
     
    +    /**
    +     * This method will decide the number of columns to be parsed for a row by Csv parser
    +     *
    +     * @param columnCountInSchema total number of columns in schema
    +     * @return
    +     */
    +    private int getMaxColumnsForParsing(int columnCountInSchema, int maxColumns) {
    +      int maxNumberOfColumnsForParsing = DEFAULT_MAX_NUMBER_OF_COLUMNS_FOR_PARSING;
    +      if (maxColumns > 0) {
    +        if (columnCountInSchema >= maxColumns) {
    +          maxNumberOfColumnsForParsing = columnCountInSchema + 1;
    --- End diff --
    
    taking more columns than the configured max value, should not happen.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1657/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Handle all the validation in driver and remove from executor.
    parameters: csvheadercolumns, maxcolumns(default 2000), Threashold(20000).
    1) User configures both  csvheadercolumns, maxcolumns, 
    - if csvheadercolumns >= maxcolumns, give error " csvheadercolumns should be less than max columns <value>"
    - if maxcolumns > threashold, give error "max columns cannot be more than threashold(20,000)"
    2) User configures csvheadercolumns 
    - if csvheadercolumns >= maxcolumns(default)
        maxcolumns = csvheadercolumns+1
    - if csvheadercolumns >= threashold, give error  " csvheadercolumns should be less than max threashold(20,000)"
    3) User configures nothing
    - if csvheadercolumns >= maxcolumns(default)
        maxcolumns = csvheadercolumns+1
    - if csvheadercolumns >= threashold, give error  " csvheadercolumns should be less than max threashold(20,000)"



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1641/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1635/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1639/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1647/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1640/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #802: [CARBONDATA-926] Changed max columns...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/802#discussion_r111667378
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ---
    @@ -342,7 +345,23 @@ object CommonUtil {
               + "the same. Input file : " + csvFile)
           }
         }
    -
         csvColumns
       }
    +
    +  def validateMaxColumns(csvHeaders: Array[String], maxColumns: String): Unit = {
    +    if (maxColumns != null) {
    +      try {
    +        val maxColumnsInt = maxColumns.toInt
    +        if( csvHeaders.length > maxColumnsInt) {
    +          sys.error("csv headers should be less than the max columns")
    --- End diff --
    
    also need to print max Value which is considered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1663/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by kunal642 <gi...@git.apache.org>.
Github user kunal642 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    retest this please



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1648/



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #802: [CARBONDATA-926] Changed max columns from s...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/802
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---