Posted to issues@carbondata.apache.org by QiangCai <gi...@git.apache.org> on 2017/06/07 04:06:43 UTC
[GitHub] carbondata pull request #1002: [CARBONDATA-1136] Fix compaction bug for the ...
GitHub user QiangCai opened a pull request:
https://github.com/apache/carbondata/pull/1002
[CARBONDATA-1136] Fix compaction bug for the partition table
After compaction of a partition table, a select query returns no data.
**Analysis**
During compaction, the partition id of the table is lost.
**Solution**
Continue to use the old partition id in CarbonMergerRDD.scala
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/QiangCai/carbondata fixCompactionIssue
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/1002.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1002
----
commit e05c696900920ed5b98e608305d49c17d192fb5b
Author: QiangCai <da...@gmail.com>
Date: 2017-06-07T03:51:08Z
fix compact bug for partition table
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1002
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2787/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1002
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/412/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1002
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/410/
Failed Tests: 1
carbondata-pr-spark-1.6/org.apache.carbondata:carbondata-spark-common-test: 1
org.apache.carbondata.spark.testsuite.allqueries.InsertIntoCarbonTableTestCase.insert into carbon table from carbon table union query
---
[GitHub] carbondata pull request #1002: [CARBONDATA-1136] Fix compaction bug for the ...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1002#discussion_r120927111
--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala ---
@@ -405,11 +411,16 @@ class CarbonMergerRDD[K, V](
NodeInfo(splitsPerNode.getTaskId, splitsPerNode.getCarbonInputSplitList.size()))
if (blockletCount != 0) {
+ val taskInfo = splitInfo.asInstanceOf[CarbonInputSplitTaskInfo]
val multiBlockSplit = new CarbonMultiBlockSplit(absoluteTableIdentifier,
- splitInfo.asInstanceOf[CarbonInputSplitTaskInfo].getCarbonInputSplitList,
+ taskInfo.getCarbonInputSplitList,
Array(nodeName))
- result.add(new CarbonSparkPartition(id, partitionNo, multiBlockSplit))
- partitionNo += 1
+ if (isPartitionTable) {
--- End diff --
This handling will not be sufficient.
When the number of partitions (for example, 100) is not equal to the number of nodes (for example, 5), getPartitions divides the total blocks among the available nodes, so each node gets more than one taskNo/partitionNo to handle.
The compute function in the executor just merges all the given B-trees (segid+taskid) into one task, so multiple taskids/partitions are merged into one. This disturbs the partition mapping.
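The concern above can be illustrated with a toy model. This is only a sketch under stated assumptions: `NodeLevelMergeSketch`, `assign`, and `mergePerNode` are hypothetical names, the round-robin assignment is invented for illustration, and none of this is CarbonData code. It shows that if work were grouped per node and each group merged into a single output, 100 task numbers on 5 nodes would collapse into 5 outputs, so the original task-to-partition mapping could not be recovered.

```scala
// Toy model only; names and the assignment policy are hypothetical.
object NodeLevelMergeSketch {
  // Assumption: task numbers are spread over nodes round-robin.
  def assign(taskNos: Seq[Int], nodeCount: Int): Map[Int, Seq[Int]] =
    taskNos.groupBy(_ % nodeCount)

  // Assumption: a node-level merge emits one output per node,
  // regardless of how many task numbers the node received.
  def mergePerNode(byNode: Map[Int, Seq[Int]]): Int = byNode.size

  def main(args: Array[String]): Unit = {
    val taskNos = 0 until 100
    val byNode = assign(taskNos, 5)
    byNode.toSeq.sortBy(_._1).foreach { case (node, tasks) =>
      println(s"node $node received ${tasks.size} task numbers")
    }
    println(s"inputs: ${taskNos.size}, merged outputs: ${mergePerNode(byNode)}")
  }
}
```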
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:
https://github.com/apache/carbondata/pull/1002
retest this please
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:
https://github.com/apache/carbondata/pull/1002
retest this please
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1002
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/356/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1002
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2247/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1002
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2524/
---
[GitHub] carbondata pull request #1002: [CARBONDATA-1136] Fix compaction bug for the ...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/1002
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1002
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/721/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1002
Build Success with Spark 1.6, Please check CI http://144.76.159.231:8080/job/ApacheCarbonPRBuilder/208/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1002
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2522/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:
https://github.com/apache/carbondata/pull/1002
LGTM
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit commented on the issue:
https://github.com/apache/carbondata/pull/1002
Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/carbondata-pr-spark-1.6/120/
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/1002
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2472/
---
[GitHub] carbondata pull request #1002: [CARBONDATA-1136] Fix compaction bug for the ...
Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1002#discussion_r121036463
--- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/CarbonMergerRDD.scala ---
@@ -405,11 +411,16 @@ class CarbonMergerRDD[K, V](
NodeInfo(splitsPerNode.getTaskId, splitsPerNode.getCarbonInputSplitList.size()))
if (blockletCount != 0) {
+ val taskInfo = splitInfo.asInstanceOf[CarbonInputSplitTaskInfo]
val multiBlockSplit = new CarbonMultiBlockSplit(absoluteTableIdentifier,
- splitInfo.asInstanceOf[CarbonInputSplitTaskInfo].getCarbonInputSplitList,
+ taskInfo.getCarbonInputSplitList,
Array(nodeName))
- result.add(new CarbonSparkPartition(id, partitionNo, multiBlockSplit))
- partitionNo += 1
+ if (isPartitionTable) {
--- End diff --
@gvramana Right, each node will get more than one taskNo/partitionNo to handle. But one Spark task handles just one partitionNo/taskNo. A CarbonInputSplitTaskInfo represents one taskNo, so different taskNos go to different Spark tasks.
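A minimal sketch of the point made in this reply, under loudly stated assumptions: `TaskInfo` and `SparkPartition` are hypothetical stand-ins for CarbonInputSplitTaskInfo and CarbonSparkPartition, and the body of the `isPartitionTable` branch is not visible in the diff excerpt, so the rule "a partition table carries the old taskNo forward, any other table gets a running counter" is an assumption drawn from the PR description, not the actual patch. What the sketch shows is that one output partition is created per task info, so two task numbers on the same node never share a Spark task.

```scala
// Hypothetical model; not the CarbonMergerRDD implementation.
object PerTaskPartitionSketch {
  final case class TaskInfo(taskNo: Int, node: String)
  final case class SparkPartition(carriedId: Int, taskNo: Int, node: String)

  def toPartitions(tasks: Seq[TaskInfo], isPartitionTable: Boolean): Seq[SparkPartition] =
    tasks.zipWithIndex.map { case (task, counter) =>
      // Assumption: a partition table keeps the old id,
      // while a normal table uses a running counter.
      val id = if (isPartitionTable) task.taskNo else counter
      SparkPartition(id, task.taskNo, task.node)
    }

  def main(args: Array[String]): Unit = {
    // Two task numbers land on the same node but stay in separate partitions.
    val tasks = Seq(TaskInfo(7, "node1"), TaskInfo(3, "node1"), TaskInfo(9, "node2"))
    toPartitions(tasks, isPartitionTable = true).foreach(println)
    toPartitions(tasks, isPartitionTable = false).foreach(println)
  }
}
```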
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on the issue:
https://github.com/apache/carbondata/pull/1002
retest this please
---
[GitHub] carbondata issue #1002: [CARBONDATA-1136] Fix compaction bug for the partiti...
Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on the issue:
https://github.com/apache/carbondata/pull/1002
@gvramana I will raise another PR to optimize compaction for normal tables.
---