Posted to issues@carbondata.apache.org by cenyuhai <gi...@git.apache.org> on 2017/08/06 05:52:00 UTC

[GitHub] carbondata pull request #1239: [CARBONDATA-1338] add tableInfo to CarbonHive...

GitHub user cenyuhai opened a pull request:

    https://github.com/apache/carbondata/pull/1239

    [CARBONDATA-1338] add tableInfo to CarbonHiveInputSplit and no need to get schema from file

    Add tableInfo to CarbonHiveInputSplit so that the schema no longer needs to be read from file in the map process.
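
The idea in the description above, carrying the table schema inside the split so that map tasks do not have to read a schema file, can be sketched roughly as follows. This is only an illustration under assumptions: `SchemaCarryingSplit`, its fields, and `roundTrip` are hypothetical names, the schema is treated as an opaque byte array, and plain `java.io` streams stand in for Hadoop's serialization. It is not the code from this pull request.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class SplitWithSchemaSketch {

  // Hypothetical stand-in for an input split that carries its table schema.
  public static class SchemaCarryingSplit {
    public String path = "";
    // Assumption: the table info has already been serialized to bytes.
    public byte[] serializedTableInfo = new byte[0];

    // Writes the path followed by the length-prefixed schema bytes.
    public void write(DataOutput out) throws IOException {
      out.writeUTF(path);
      out.writeInt(serializedTableInfo.length);
      out.write(serializedTableInfo);
    }

    // Reads the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
      path = in.readUTF();
      serializedTableInfo = new byte[in.readInt()];
      in.readFully(serializedTableInfo);
    }
  }

  // Serializes a split to memory and reads it back, imitating what happens
  // when a split travels from job submission to a map task.
  public static SchemaCarryingSplit roundTrip(SchemaCarryingSplit split) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      split.write(new DataOutputStream(buffer));
      SchemaCarryingSplit copy = new SchemaCarryingSplit();
      copy.readFields(
          new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    SchemaCarryingSplit split = new SchemaCarryingSplit();
    split.path = "/store/default/t1/part-0";
    split.serializedTableInfo = new byte[] {1, 2, 3};
    SchemaCarryingSplit copy = roundTrip(split);
    System.out.println(copy.path + " carries "
        + copy.serializedTableInfo.length + " schema bytes");
  }
}
```

With a layout like this, the receiving side could rebuild the table metadata from the bytes in the split and would not need to locate a schema file on the file system.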

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cenyuhai/incubator-carbondata CARBONDATA-1338

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1239.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1239
    
----
commit 29b46b64096829dc0260bb9d562ec7b8c8e329dc
Author: cenyuhai <26...@qq.com>
Date:   2017-08-06T05:47:56Z

    add tableInfo to CarbonHiveInputSplit and no need to get schema from file

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/802/




[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    retest please



[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    I made a mistake: I used an old branch to commit the code. I am closing this PR; please use PR #1291 instead.



[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/796/




[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3400/




[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    SDV Build Success with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/125/




[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    SDV Build Success with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/123/




[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    SDV Build Failed with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/245/




[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3394/




[GitHub] carbondata pull request #1239: [CARBONDATA-1343] Hive can't query data when ...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1239#discussion_r132695250
  
    --- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java ---
    @@ -84,47 +92,66 @@
        * @param configuration
        * @throws IOException
        */
    -  private static void populateCarbonTable(Configuration configuration, String paths)
    +  private CarbonTable populateCarbonTable(Configuration configuration)
           throws IOException {
    -    String dirs = configuration.get(INPUT_DIR, "");
    -    String[] inputPaths = StringUtils.split(dirs);
    -    String validInputPath = null;
    +    TableInfo tableInfo = getTableInfo(configuration);
    +    CarbonTable carbonTable = null;
    +    if (tableInfo != null) {
    +      carbonTable = CarbonTable.buildFromTableInfo(tableInfo);
    +      CarbonMetadata.getInstance().addCarbonTable(carbonTable);
    +      return carbonTable;
    +    }
    +    String inputDir = configuration.get(INPUT_DIR, "");
    +    String[] inputPaths = StringUtils.split(inputDir);
         if (inputPaths.length == 0) {
           throw new InvalidPathException("No input paths specified in job");
    -    } else {
    -      if (paths != null) {
    -        for (String inputPath : inputPaths) {
    -          if (paths.startsWith(inputPath)) {
    -            validInputPath = inputPath;
    -            break;
    -          }
    -        }
    -      }
         }
    +    Arrays.sort(inputPaths);
    +    String tablePath = inputPaths[0].replace("file:", "");
    --- End diff --
    
    Because of another issue, my PR conflicts with https://github.com/apache/carbondata/pull/1231



[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    @anubhav100 The Hive example works now because #1231 has been merged, but this exception still occurs.



[GitHub] carbondata pull request #1239: [CARBONDATA-1343] Hive can't query data when ...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai closed the pull request at:

    https://github.com/apache/carbondata/pull/1239



[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by anubhav100 <gi...@git.apache.org>.
Github user anubhav100 commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    @cenyuhai Can you try running the Hive example and see whether it works? The last time I checked, it was working fine on the latest master.



[GitHub] carbondata pull request #1239: [CARBONDATA-1343] Hive can't query data when ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1239#discussion_r132685712
  
    --- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java ---
    @@ -84,47 +92,66 @@
        * @param configuration
        * @throws IOException
        */
    -  private static void populateCarbonTable(Configuration configuration, String paths)
    +  private CarbonTable populateCarbonTable(Configuration configuration)
           throws IOException {
    -    String dirs = configuration.get(INPUT_DIR, "");
    -    String[] inputPaths = StringUtils.split(dirs);
    -    String validInputPath = null;
    +    TableInfo tableInfo = getTableInfo(configuration);
    +    CarbonTable carbonTable = null;
    +    if (tableInfo != null) {
    +      carbonTable = CarbonTable.buildFromTableInfo(tableInfo);
    +      CarbonMetadata.getInstance().addCarbonTable(carbonTable);
    +      return carbonTable;
    +    }
    +    String inputDir = configuration.get(INPUT_DIR, "");
    +    String[] inputPaths = StringUtils.split(inputDir);
         if (inputPaths.length == 0) {
           throw new InvalidPathException("No input paths specified in job");
    -    } else {
    -      if (paths != null) {
    -        for (String inputPath : inputPaths) {
    -          if (paths.startsWith(inputPath)) {
    -            validInputPath = inputPath;
    -            break;
    -          }
    -        }
    -      }
         }
    +    Arrays.sort(inputPaths);
    --- End diff --
    
    Why is a sort needed here?



[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by anubhav100 <gi...@git.apache.org>.
Github user anubhav100 commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    @cenyuhai This issue is resolved in the latest master.



[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    @anubhav100 It is a sad story



[GitHub] carbondata pull request #1239: [CARBONDATA-1343] Hive can't query data when ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1239#discussion_r132685377
  
    --- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java ---
    @@ -84,47 +92,66 @@
        * @param configuration
        * @throws IOException
        */
    -  private static void populateCarbonTable(Configuration configuration, String paths)
    +  private CarbonTable populateCarbonTable(Configuration configuration)
           throws IOException {
    -    String dirs = configuration.get(INPUT_DIR, "");
    -    String[] inputPaths = StringUtils.split(dirs);
    -    String validInputPath = null;
    +    TableInfo tableInfo = getTableInfo(configuration);
    +    CarbonTable carbonTable = null;
    +    if (tableInfo != null) {
    +      carbonTable = CarbonTable.buildFromTableInfo(tableInfo);
    +      CarbonMetadata.getInstance().addCarbonTable(carbonTable);
    +      return carbonTable;
    +    }
    +    String inputDir = configuration.get(INPUT_DIR, "");
    +    String[] inputPaths = StringUtils.split(inputDir);
         if (inputPaths.length == 0) {
           throw new InvalidPathException("No input paths specified in job");
    -    } else {
    -      if (paths != null) {
    -        for (String inputPath : inputPaths) {
    -          if (paths.startsWith(inputPath)) {
    -            validInputPath = inputPath;
    -            break;
    -          }
    -        }
    -      }
         }
    +    Arrays.sort(inputPaths);
    +    String tablePath = inputPaths[0].replace("file:", "");
    --- End diff --
    
    Is this a mistake?



[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Can one of the admins verify this patch?



[GitHub] carbondata pull request #1239: [CARBONDATA-1343] Hive can't query data when ...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1239#discussion_r132691664
  
    --- Diff: integration/hive/src/main/java/org/apache/carbondata/hive/MapredCarbonInputFormat.java ---
    @@ -84,47 +92,66 @@
        * @param configuration
        * @throws IOException
        */
    -  private static void populateCarbonTable(Configuration configuration, String paths)
    +  private CarbonTable populateCarbonTable(Configuration configuration)
           throws IOException {
    -    String dirs = configuration.get(INPUT_DIR, "");
    -    String[] inputPaths = StringUtils.split(dirs);
    -    String validInputPath = null;
    +    TableInfo tableInfo = getTableInfo(configuration);
    +    CarbonTable carbonTable = null;
    +    if (tableInfo != null) {
    +      carbonTable = CarbonTable.buildFromTableInfo(tableInfo);
    +      CarbonMetadata.getInstance().addCarbonTable(carbonTable);
    +      return carbonTable;
    +    }
    +    String inputDir = configuration.get(INPUT_DIR, "");
    +    String[] inputPaths = StringUtils.split(inputDir);
         if (inputPaths.length == 0) {
           throw new InvalidPathException("No input paths specified in job");
    -    } else {
    -      if (paths != null) {
    -        for (String inputPath : inputPaths) {
    -          if (paths.startsWith(inputPath)) {
    -            validInputPath = inputPath;
    -            break;
    -          }
    -        }
    -      }
         }
    +    Arrays.sort(inputPaths);
    --- End diff --
    
    Because there are several input paths and the shortest one is the table path, I sort them all and then take the first one.
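
The sort-and-take-first approach described in this comment can be sketched as below. It is a minimal illustration, assuming the table path itself is among the input paths and every other input path lies beneath it; the class name, method name, and example paths are hypothetical and not taken from the pull request. `Arrays.sort` on strings orders lexicographically, and a string that is a prefix of another sorts before it, which is why the table path comes first under that assumption.

```java
import java.util.Arrays;

public class TablePathSketch {

  // Returns the lexicographically smallest input path. If all other paths
  // are sub-paths of the table path, this is the table path itself.
  public static String firstSortedPath(String[] inputPaths) {
    if (inputPaths.length == 0) {
      throw new IllegalArgumentException("No input paths specified in job");
    }
    // Sort a copy so the caller's array is left untouched.
    String[] sorted = inputPaths.clone();
    Arrays.sort(sorted);
    return sorted[0];
  }

  public static void main(String[] args) {
    String[] paths = {
        "/store/default/t1/Fact/Part0/Segment_1",
        "/store/default/t1",
        "/store/default/t1/Fact/Part0/Segment_0"
    };
    System.out.println(firstSortedPath(paths)); // prints /store/default/t1
  }
}
```

The assumption matters: if the inputs also contained a path outside the table directory, such as a hypothetical `/store/default/t0/Fact`, that path would sort first and the result would no longer be the table path.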



[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    SDV Build Success with Spark 2.1, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/129/




[GitHub] carbondata issue #1239: [CARBONDATA-1343] Hive can't query data when the car...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    @anubhav100 I tried the latest master; it still throws an exception.
    {code}
    java.io.IOException: File does not exist: hdfs://bipcluster/user/master/carbon/store/temp/yuhai_carbon/Metadata/schema
            at org.apache.carbondata.hadoop.util.SchemaReader.readCarbonTableFromStore(SchemaReader.java:70)
            at org.apache.carbondata.hadoop.CarbonInputFormat.getOrCreateCarbonTable(CarbonInputFormat.java:186)
            at org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:352)
            at org.apache.carbondata.hive.MapredCarbonInputFormat.getSplits(MapredCarbonInputFormat.java:99)
            at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:363)
            at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:486)
            at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:520)
            at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:332)
            at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:324)
            at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
            at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
            at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
            at java.security.AccessController.doPrivileged(Native Method)
    {code}



[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Build Failed with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/798/




[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by cenyuhai <gi...@git.apache.org>.
Github user cenyuhai commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    ok to test


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1239: [CARBONDATA-1338] add tableInfo to CarbonHiveInputSp...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1239
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3396/


