Posted to issues@carbondata.apache.org by xuchuanyin <gi...@git.apache.org> on 2017/07/15 06:34:26 UTC

[GitHub] carbondata pull request #1175: [CARBONDATA-1281] Support multiple temp dirs ...

GitHub user xuchuanyin opened a pull request:

    https://github.com/apache/carbondata/pull/1175

    [CARBONDATA-1281] Support multiple temp dirs for writing temp files while loading

    # Modifications
    This feature mainly focuses on avoiding disk hot spots during a single massive data load. Changes are made in two parts:
    
    1. Randomly choose a YARN local folder to write sort temp files in the sort process;
    
    2. Randomly choose a YARN local folder to write carbondata files in the write process.
    
    # Usage
    
    To enable this feature, set both `carbon.using.multi.temp.dir=true` and `carbon.use.local.dir=true`.
    
    # Performance
    In my case, this feature improved the loading performance from 35M/s/node to 70+M/s/node.
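
The random folder selection described under "Modifications" can be illustrated with a minimal Java sketch. This is not the code from the pull request: the class `TempDirChooser`, its methods, and the directory paths are hypothetical names chosen for illustration, and falling back to the first directory when the feature is off is an assumption. Only the two property names come from the description above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper, not part of CarbonData: picks a temp dir for one load task. */
public class TempDirChooser {
    // Property names as given in the PR description.
    static final String USE_LOCAL_DIR = "carbon.use.local.dir";
    static final String USE_MULTI_TEMP_DIR = "carbon.using.multi.temp.dir";

    /** True only when both properties are set to "true". */
    static boolean multiTempDirEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(USE_LOCAL_DIR, "false"))
            && Boolean.parseBoolean(props.getProperty(USE_MULTI_TEMP_DIR, "false"));
    }

    /**
     * Returns one of the candidate local directories.
     * With the feature enabled, the choice is uniformly random, so concurrent
     * tasks tend to spread their writes across disks.
     * Assumption: with the feature disabled, the first directory is used.
     */
    static String chooseTempDir(List<String> localDirs, Properties props) {
        if (localDirs.isEmpty()) {
            throw new IllegalArgumentException("at least one local dir is required");
        }
        if (!multiTempDirEnabled(props)) {
            return localDirs.get(0);
        }
        int index = ThreadLocalRandom.current().nextInt(localDirs.size());
        return localDirs.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical local dirs, one per physical disk.
        List<String> dirs = Arrays.asList("/data1/tmp", "/data2/tmp", "/data3/tmp");

        Properties props = new Properties();
        props.setProperty(USE_LOCAL_DIR, "true");
        props.setProperty(USE_MULTI_TEMP_DIR, "true");

        // Each call may return a different directory.
        System.out.println("sort temp dir:  " + chooseTempDir(dirs, props));
        System.out.println("write temp dir: " + chooseTempDir(dirs, props));
    }
}
```

Choosing independently for the sort step and the write step, as in `main`, mirrors the two parts listed in the description: each step may land on a different disk.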


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xuchuanyin/carbondata feature_multiple_temp_dir_for_load

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1175
    
----
commit 35cba3d49ccb536328545fd705020edcc50189af
Author: xuchuanyin <xu...@hust.edu.cn>
Date:   2017-07-05T12:56:33Z

    Merge pull request #3 from apache/master
    
    sync code

commit c778f22dbda7f0b36a69a10db5d008744cdb99f3
Author: ravikiran23 <ra...@gmail.com>
Date:   2017-06-22T12:48:19Z

    [CARBONDATA-1214] Changing the delete syntax as in the hive for segment deletion
    
    This closes #1078

commit 46a53962d4913b06d0f4be61c48053106da4a108
Author: jackylk <ja...@huawei.com>
Date:   2017-07-03T13:54:39Z

    modify compare test
    
    fix
    
    fix style
    
    change table

commit a54dda9b7dcd0749e6b187f2579a5b867f421eaf
Author: jatin <ja...@knoldus.in>
Date:   2017-07-05T12:04:19Z

    [CARBONDATA-1266][PRESTO] Fixed issue for non existing table
    
    This closes #1137

commit 32e3d1f7a4ff09309ebf0d4e7315e5dbef2765b4
Author: xuchuanyin <xu...@hust.edu.cn>
Date:   2017-07-05T13:00:45Z

    [CARBONDATA-1267] Add short_int case branch in DeltaIntegalCodec
    
    This closes #1139

commit a479d1672bfbe1c2b92b44737f93a2e14cbef2a7
Author: Geetika Gupta <ge...@knoldus.in>
Date:   2017-07-06T06:14:06Z

    [CARBONDATA-1269][PRESTO] Fixed bug for select operation in non existing database
    
    This closes #1143

commit b69771370f95926999b7cb0501c38ae7202cebf3
Author: sgururajshetty <sg...@gmail.com>
Date:   2017-07-06T06:23:38Z

    [CARBONDATA-1270] Documentation update for Delete by ID and DATE syntax and example
    
    This closes #1141

commit 4d2d518a3d5176d70f81d83c5782ef56f7118800
Author: ashok.blend <as...@gmail.com>
Date:   2017-07-08T10:57:41Z

    [CARBONDATA-1282] Choose BatchedDatasource scan only if schema fits codegen
    
    This closes #1148

commit 5e74e50547a98e2452b296f839a020d978225cf2
Author: chenliang613 <ch...@apache.org>
Date:   2017-07-08T15:53:02Z

    [CARBONDATA-1280] Solve HiveExample dependency issues and fix spark 1.6 CI
    
    This closes #1150

commit 79a777052a8e5ead117f7424a6daf974dc405c26
Author: Liang Chen <ch...@apache.org>
Date:   2017-07-08T22:32:10Z

    fix doc, remove invalid description
    
    This closes #1151

commit 16938770f54eaf5ec646df2692418e726d9defd4
Author: kunalkapoor <ku...@gmail.com>
Date:   2017-07-10T06:42:10Z

    [CARBONDATA-1229] acquired meta.lock during table drop
    
    This closes #1153

commit f5e4bb083f166d62de38267c7503ab3609e1fcca
Author: czg516516 <cz...@163.com>
Date:   2017-07-11T03:01:47Z

    [CARBONDATA-1289] remove unused method
    
    This closes #1157

commit 7e433115a925e6a477d8de157596cc6eb16dfa17
Author: xuchuanyin <xu...@hust.edu.cn>
Date:   2017-07-15T03:14:46Z

    fix conflicts

commit 76b071846c226348b3a934edfda63454ee973254
Author: xuchuanyin <xu...@hust.edu.cn>
Date:   2017-07-15T06:12:35Z

    Support multiple temp dirs for writing files while loading data

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:

    https://github.com/apache/carbondata/pull/1175
  
    Can one of the admins verify this patch?



[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/1175
  
    @xuchuanyin  please squash all commits to one commit.



[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1175
  
    Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/498/




[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:

    https://github.com/apache/carbondata/pull/1175
  
    @chenliang613 OK, I'll raise another PR. #1177



[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1175
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3088/




[GitHub] carbondata pull request #1175: [CARBONDATA-1281] Support multiple temp dirs ...

Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin closed the pull request at:

    https://github.com/apache/carbondata/pull/1175

