Posted to issues@carbondata.apache.org by watermen <gi...@git.apache.org> on 2017/03/24 08:49:41 UTC

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

GitHub user watermen opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/696

    [CARBONDATA-818] Make the file_name in carbonindex exactly

    The file_name stored in carbonindex is a local path that is used on the executor as a temp dir:
    ```
    /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_1/0/part-0-0_batchno0-0-1490344094093.carbondata
    ```
    But I think we want to store the actual carbondata path like
    ```
    /user/hive/warehouse/default/carbon_v3/Fact/Part0/Segment_0/part-0-0-0-1489566284025.carbondata
    ```
    
    I have already checked this with @QiangCai.
    
    cc @jackylk 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/watermen/incubator-carbondata CARBONDATA-818

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/696.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #696
    
----
commit 4da2a705ed8f050a8e89da0780d3c56751208a2e
Author: Yadong Qi <qi...@gmail.com>
Date:   2017-03-24T08:26:20Z

    Make the file_name in carbonindex exactly.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108081571
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v1/CarbonFactDataWriterImplV1.java ---
    @@ -373,7 +373,7 @@ protected void writeBlockletInfoToFile(FileChannel channel, String filePath)
           FileFooter convertFileMeta = CarbonMetadataUtil
               .convertFileFooter(blockletInfoList, localCardinality.length, localCardinality,
                   thriftColumnSchemaList, dataWriterVo.getSegmentProperties());
    -      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), filePath, currentPosition);
    +      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), carbonDataFileName, currentPosition);
    --- End diff --
    
    Done



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    LGTM



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    @chenliang613 
    `fileName` looks like the following and is used on the executor as a temp location. It is actually a path.
    ```
    /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata.inprogress
    ```
    `carbonDataFileName` is like below
    ```
    part-0-0_batchno0-0-1490345609845.carbondata
    ```
    I think we can rename `fileName` to `carbonDataFileTempPath`. What do you think?
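
A minimal sketch of the distinction described above, assuming the temp path is simply the store location, the data file name, and an `.inprogress` suffix joined together. The class `TempPathExample`, its methods, and the constant are hypothetical names for illustration and are not part of CarbonData.

```java
// Illustrative sketch only; the names below are hypothetical, not CarbonData code.
final class TempPathExample {
  // Suffix carried by a file that is still being written (value assumed from the paths above).
  static final String IN_PROGRESS_SUFFIX = ".inprogress";

  /** Builds the executor-local temp path from a store location and a data file name. */
  static String tempPathOf(String storeLocation, String carbonDataFileName) {
    return storeLocation + "/" + carbonDataFileName + IN_PROGRESS_SUFFIX;
  }

  /** Recovers the bare data file name from a temp path. */
  static String fileNameOf(String tempPath) {
    // Drop the directory part, then the in-progress suffix if it is present.
    String name = tempPath.substring(tempPath.lastIndexOf('/') + 1);
    return name.endsWith(IN_PROGRESS_SUFFIX)
        ? name.substring(0, name.length() - IN_PROGRESS_SUFFIX.length())
        : name;
  }

  public static void main(String[] args) {
    String tempPath = tempPathOf(
        "/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0",
        "part-0-0_batchno0-0-1490345609845.carbondata");
    System.out.println(tempPath);             // the value held in fileName
    System.out.println(fileNameOf(tempPath)); // the value held in carbonDataFileName
  }
}
```

The sketch suggests why a name such as `carbonDataFileTempPath` may describe the first value better than `fileName` does.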



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1352/




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1323/




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    @watermen  
    You are right, fileName is actually a path.
    I agree with renaming fileName to carbonDataFileTempPath; please make that change in your PR.
    




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by Sephiroth-Lin <gi...@git.apache.org>.
Github user Sephiroth-Lin commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    LGTM



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    retest this please



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1336/




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    @watermen 
    It is unnecessary to store the carbondata file path in the carbonindex file.
    During btree building, the carbondata file name alone is enough to sort the tableblockinfos.
    Please check CarbonUtil.readCarbonIndexFile and TableBlockInfo.compareTo.
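
A rough sketch of the idea in this comment: if ordering depends only on the file name, the directory prefix stored with each entry does not affect the result. This is not the logic of `TableBlockInfo.compareTo`; the class `FileNameSortExample` and its methods are hypothetical, and plain lexicographic ordering of the name is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; not the CarbonData comparison logic.
final class FileNameSortExample {
  /** Returns the part of the path after the last '/' (the whole string if there is none). */
  static String fileNameOf(String path) {
    return path.substring(path.lastIndexOf('/') + 1);
  }

  /** Returns a copy of the paths ordered by file name only, ignoring directories. */
  static List<String> sortByFileName(List<String> paths) {
    List<String> sorted = new ArrayList<>(paths);
    sorted.sort(Comparator.comparing(FileNameSortExample::fileNameOf));
    return sorted;
  }

  public static void main(String[] args) {
    // Hypothetical entries: one under a temp dir, one under a warehouse dir.
    List<String> paths = Arrays.asList(
        "/tmp/0/part-1-0.carbondata",
        "/user/hive/warehouse/part-0-0.carbondata");
    // Sorting by full path would put the /tmp entry first; sorting by name does not.
    System.out.println(sortByFileName(paths));
  }
}
```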



[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052532
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java ---
    @@ -528,8 +528,7 @@ protected void fillBlockIndexInfoDetails(long numberOfRows, String filePath,
         org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex blockletIndex =
             new org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex(btree, minmax);
         BlockIndexInfo blockIndexInfo =
    -        new BlockIndexInfo(numberOfRows, filePath.substring(0, filePath.lastIndexOf('.')),
    -            currentPosition, blockletIndex);
    +        new BlockIndexInfo(numberOfRows, filePath, currentPosition, blockletIndex);
    --- End diff --
    
    Can you explain why this change was made?



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    retest this please



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    LGTM



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1332/




[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052363
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestDataLoadWithFileName.scala ---
    @@ -0,0 +1,111 @@
    +package org.apache.carbondata.spark.testsuite.dataload
    --- End diff --
    
    Please add the license header.



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    @chenliang613 Thanks for your review, please review it again.



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1331/




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1345/




[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108081556
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestDataLoadWithFileName.scala ---
    @@ -0,0 +1,111 @@
    +package org.apache.carbondata.spark.testsuite.dataload
    --- End diff --
    
    Done



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    add to whitelist



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1354/




[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    @watermen 
    
    Thanks for your contribution, everything looks good.
    Only one comment: in AbstractFactDataWriter.java there are two fields (fileName, carbonDataFileName). What is the difference?
      /**
       * file name
       */
      protected String fileName;
    
      /**
       * The name of carbonData file
       */
      protected String carbonDataFileName;



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    @QiangCai The carbonindex now stores fileName instead of filePath. Please review it again.



[GitHub] incubator-carbondata issue #696: [CARBONDATA-818] Make the file_name in carb...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/696
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1348/




[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by watermen <gi...@git.apache.org>.
Github user watermen commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108081845
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v3/CarbonFactDataWriterImplV3.java ---
    @@ -528,8 +528,7 @@ protected void fillBlockIndexInfoDetails(long numberOfRows, String filePath,
         org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex blockletIndex =
             new org.apache.carbondata.core.metadata.blocklet.index.BlockletIndex(btree, minmax);
         BlockIndexInfo blockIndexInfo =
    -        new BlockIndexInfo(numberOfRows, filePath.substring(0, filePath.lastIndexOf('.')),
    -            currentPosition, blockletIndex);
    +        new BlockIndexInfo(numberOfRows, filePath, currentPosition, blockletIndex);
    --- End diff --
    
    # Before
    We passed fileName, which ends with `.inprogress`, so we had to strip that suffix with substring.
    ```java
    this.fileName = dataWriterVo.getStoreLocation() + File.separator + carbonDataFileName + CarbonCommonConstants.FILE_INPROGRESS_STATUS;
    ```
    # After
    We pass carbonDataFileName, which has no `.inprogress` suffix, so no substring is needed.
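
The before/after reasoning can be sketched as follows. The helper `stripLastExtension` is a hypothetical stand-in for the removed `filePath.substring(0, filePath.lastIndexOf('.'))` expression (with an added guard for strings without a dot), and the sample values are shortened placeholders rather than real output.

```java
// Illustrative sketch only; the class and method names are hypothetical.
final class IndexFileNameExample {
  /** Drops everything from the last '.' onward, like the substring call removed in the diff. */
  static String stripLastExtension(String s) {
    int dot = s.lastIndexOf('.');
    return dot < 0 ? s : s.substring(0, dot);
  }

  public static void main(String[] args) {
    String tempPath = "/tmp/0/part-0-0.carbondata.inprogress";
    String carbonDataFileName = "part-0-0.carbondata";

    // Before: stripping the temp path removes only ".inprogress", leaving a local path.
    System.out.println(stripLastExtension(tempPath));

    // After: the file name is stored as-is, with no stripping.
    System.out.println(carbonDataFileName);

    // Keeping the strip while passing the file name would also drop ".carbondata".
    System.out.println(stripLastExtension(carbonDataFileName));
  }
}
```

The last call hints at why the substring presumably had to go once the argument changed: applied to a bare file name, it would cut off the `.carbondata` extension.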



[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/696#discussion_r108052521
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/v1/CarbonFactDataWriterImplV1.java ---
    @@ -373,7 +373,7 @@ protected void writeBlockletInfoToFile(FileChannel channel, String filePath)
           FileFooter convertFileMeta = CarbonMetadataUtil
               .convertFileFooter(blockletInfoList, localCardinality.length, localCardinality,
                   thriftColumnSchemaList, dataWriterVo.getSegmentProperties());
    -      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), filePath, currentPosition);
    +      fillBlockIndexInfoDetails(convertFileMeta.getNum_rows(), carbonDataFileName, currentPosition);
    --- End diff --
    
    Please align the parameter name (filePath) of fillBlockIndexInfoDetails in AbstractFactDataWriter.java.
    For example:
     protected void fillBlockIndexInfoDetails(long numberOfRows,
        String carbonDataFileName, long currentPosition)
    
    Please update all occurrences accordingly.



[GitHub] incubator-carbondata pull request #696: [CARBONDATA-818] Make the file_name ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/696

