Posted to issues@carbondata.apache.org by xuchuanyin <gi...@git.apache.org> on 2017/07/15 06:34:26 UTC
[GitHub] carbondata pull request #1175: [CARBONDATA-1281] Support multiple temp dirs ...
GitHub user xuchuanyin opened a pull request:
https://github.com/apache/carbondata/pull/1175
[CARBONDATA-1281] Support multiple temp dirs for writing temp files while loading
# Modifications
This feature mainly aims to avoid disk hot-spots during a single massive data load. Changes are made in two parts:
1. randomly choose a YARN local folder to write sort temp files in the sort process;
2. randomly choose a YARN local folder to write carbondata files in the write process.
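The random choice described above can be illustrated with a minimal sketch. This is not the code from this pull request: the class `TempDirChooser`, its methods, and the example paths are hypothetical, and it assumes the YARN local folders are available as a comma-separated string.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of CarbonData: picks one local dir at random. */
final class TempDirChooser {

  /** Splits a comma-separated directory list, trimming entries and dropping blanks. */
  static List<String> parseDirs(String commaSeparated) {
    return Arrays.stream(commaSeparated.split(","))
        .map(String::trim)
        .filter(s -> !s.isEmpty())
        .collect(Collectors.toList());
  }

  /** Returns a uniformly random entry so that concurrent writers spread across disks. */
  static String chooseRandom(List<String> dirs) {
    if (dirs.isEmpty()) {
      throw new IllegalArgumentException("no local directories configured");
    }
    return dirs.get(ThreadLocalRandom.current().nextInt(dirs.size()));
  }

  public static void main(String[] args) {
    // Example paths only (assumption); a real job would get this list from YARN.
    List<String> dirs =
        parseDirs("/data1/yarn/local, /data2/yarn/local,/data3/yarn/local");
    // One independent pick per stage, mirroring the two parts listed above.
    System.out.println("sort temp dir:  " + chooseRandom(dirs));
    System.out.println("carbondata dir: " + chooseRandom(dirs));
  }
}
```

Picking independently for each stage means a single load no longer funnels all of its temp I/O through one disk, which is the hot-spot this change targets.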
# Usage
To enable this feature, set both `carbon.using.multi.temp.dir=true` and `carbon.use.local.dir=true`.
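As a rough illustration of how the two switches combine, the sketch below treats the feature as active only when both properties are `true`. The class `MultiTempDirConfig` is hypothetical and uses plain `java.util.Properties` rather than CarbonData's own configuration classes; treating a missing key as `false` is an assumption, since the defaults are not stated here.

```java
import java.util.Properties;

/** Hypothetical gate, not CarbonData's own code: both flags must be true. */
final class MultiTempDirConfig {
  static final String USE_LOCAL_DIR = "carbon.use.local.dir";
  static final String USE_MULTI_TEMP_DIR = "carbon.using.multi.temp.dir";

  /** Missing keys are treated as false here (assumption; real defaults not stated). */
  static boolean isMultiTempDirEnabled(Properties props) {
    return Boolean.parseBoolean(props.getProperty(USE_LOCAL_DIR, "false"))
        && Boolean.parseBoolean(props.getProperty(USE_MULTI_TEMP_DIR, "false"));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(USE_LOCAL_DIR, "true");
    props.setProperty(USE_MULTI_TEMP_DIR, "true");
    System.out.println("multi temp dir enabled: " + isMultiTempDirEnabled(props));
  }
}
```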
# Performance
In my case, this feature improved loading performance from 35M/s/node to 70+M/s/node.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/xuchuanyin/carbondata feature_multiple_temp_dir_for_load
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1175.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1175
----
commit 35cba3d49ccb536328545fd705020edcc50189af
Author: xuchuanyin <xu...@hust.edu.cn>
Date: 2017-07-05T12:56:33Z
Merge pull request #3 from apache/master
sync code
commit c778f22dbda7f0b36a69a10db5d008744cdb99f3
Author: ravikiran23 <ra...@gmail.com>
Date: 2017-06-22T12:48:19Z
[CARBONDATA-1214] Changing the delete syntax as in the hive for segment deletion
This closes #1078
commit 46a53962d4913b06d0f4be61c48053106da4a108
Author: jackylk <ja...@huawei.com>
Date: 2017-07-03T13:54:39Z
modify compare test
fix
fix style
change table
commit a54dda9b7dcd0749e6b187f2579a5b867f421eaf
Author: jatin <ja...@knoldus.in>
Date: 2017-07-05T12:04:19Z
[CARBONDATA-1266][PRESTO] Fixed issue for non existing table
This closes #1137
commit 32e3d1f7a4ff09309ebf0d4e7315e5dbef2765b4
Author: xuchuanyin <xu...@hust.edu.cn>
Date: 2017-07-05T13:00:45Z
[CARBONDATA-1267] Add short_int case branch in DeltaIntegalCodec
This closes #1139
commit a479d1672bfbe1c2b92b44737f93a2e14cbef2a7
Author: Geetika Gupta <ge...@knoldus.in>
Date: 2017-07-06T06:14:06Z
[CARBONDATA-1269][PRESTO] Fixed bug for select operation in non existing database
This closes #1143
commit b69771370f95926999b7cb0501c38ae7202cebf3
Author: sgururajshetty <sg...@gmail.com>
Date: 2017-07-06T06:23:38Z
[CARBONDATA-1270] Documentation update for Delete by ID and DATE syntax and example
This closes #1141
commit 4d2d518a3d5176d70f81d83c5782ef56f7118800
Author: ashok.blend <as...@gmail.com>
Date: 2017-07-08T10:57:41Z
[CARBONDATA-1282] Choose BatchedDatasource scan only if schema fits codegen
This closes #1148
commit 5e74e50547a98e2452b296f839a020d978225cf2
Author: chenliang613 <ch...@apache.org>
Date: 2017-07-08T15:53:02Z
[CARBONDATA-1280] Solve HiveExample dependency issues and fix spark 1.6 CI
This closes #1150
commit 79a777052a8e5ead117f7424a6daf974dc405c26
Author: Liang Chen <ch...@apache.org>
Date: 2017-07-08T22:32:10Z
fix doc, remove invalid description
This closes #1151
commit 16938770f54eaf5ec646df2692418e726d9defd4
Author: kunalkapoor <ku...@gmail.com>
Date: 2017-07-10T06:42:10Z
[CARBONDATA-1229] acquired meta.lock during table drop
This closes #1153
commit f5e4bb083f166d62de38267c7503ab3609e1fcca
Author: czg516516 <cz...@163.com>
Date: 2017-07-11T03:01:47Z
[CARBONDATA-1289] remove unused method
This closes #1157
commit 7e433115a925e6a477d8de157596cc6eb16dfa17
Author: xuchuanyin <xu...@hust.edu.cn>
Date: 2017-07-15T03:14:46Z
fix confilicts
commit 76b071846c226348b3a934edfda63454ee973254
Author: xuchuanyin <xu...@hust.edu.cn>
Date: 2017-07-15T06:12:35Z
Support multiple temp dirs for writing files while loading data
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1175
Can one of the admins verify this patch?
---
[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...
Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:
https://github.com/apache/carbondata/pull/1175
@xuchuanyin please squash all commits to one commit.
---
[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1175
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/498/
---
[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...
Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on the issue:
https://github.com/apache/carbondata/pull/1175
@chenliang613 OK, I'll raise another PR. #1177
---
[GitHub] carbondata issue #1175: [CARBONDATA-1281] Support multiple temp dirs for wri...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1175
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/3088/
---
[GitHub] carbondata pull request #1175: [CARBONDATA-1281] Support multiple temp dirs ...
Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin closed the pull request at:
https://github.com/apache/carbondata/pull/1175