Posted to issues@carbondata.apache.org by "Manish Gupta (JIRA)" <ji...@apache.org> on 2018/01/29 08:43:00 UTC

[jira] [Resolved] (CARBONDATA-2016) Exception displays while executing compaction with alter query

     [ https://issues.apache.org/jira/browse/CARBONDATA-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manish Gupta resolved CARBONDATA-2016.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> Exception displays while executing compaction with alter query
> --------------------------------------------------------------
>
>                 Key: CARBONDATA-2016
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2016
>             Project: CarbonData
>          Issue Type: Bug
>          Components: data-load
>    Affects Versions: 1.3.0
>         Environment: spark 2.1
>            Reporter: Vandana Yadav
>            Assignee: anubhav tarar
>            Priority: Minor
>             Fix For: 1.3.0
>
>          Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> An exception is thrown when compaction is executed after an ALTER TABLE ADD COLUMNS query.
> Steps to reproduce:
> 1) Create a table:
> CREATE TABLE CUSTOMER1 ( C_CUSTKEY INT , C_NAME STRING , C_ADDRESS STRING , C_NATIONKEY INT , C_PHONE STRING , C_ACCTBAL DECIMAL(15,2) , C_MKTSEGMENT STRING , C_COMMENT STRING) stored by 'carbondata';
> 2) Insert data into the table:
> a) insert into customer1 values(1,'vandana','noida',1,'123456789',45987.78,'hello','comment')
> b) insert into customer1 values(2,'vandana','noida',2,'123456789',487.78,'hello','comment')
> c) insert into customer1 values(3,'geetika','delhi',3,'123456789',487897.78,'hello','comment')
> d) insert into customer1 values(4,'sangeeta','delhi',3,'123456789',48789.78,'hello','comment')
> 3) Add a column with a default value using ALTER TABLE:
>  alter table customer1 add columns (intfield int) TBLPROPERTIES ('DEFAULT.VALUE.intfield'='10');
> 4) Show the segments before compaction:
> show segments for table customer1;
> Output:
> +--------------------+----------+--------------------------+--------------------------+------------+--------------+--+
> | SegmentSequenceId  |  Status  |     Load Start Time      |      Load End Time       | Merged To  | File Format  |
> +--------------------+----------+--------------------------+--------------------------+------------+--------------+--+
> | 3                  | Success  | 2018-01-10 16:16:53.611  | 2018-01-10 16:16:54.99   | NA         | COLUMNAR_V3  |
> | 2                  | Success  | 2018-01-10 16:16:46.878  | 2018-01-10 16:16:47.75   | NA         | COLUMNAR_V3  |
> | 1                  | Success  | 2018-01-10 16:16:38.096  | 2018-01-10 16:16:38.972  | NA         | COLUMNAR_V3  |
> | 0                  | Success  | 2018-01-10 16:16:31.979  | 2018-01-10 16:16:33.293  | NA         | COLUMNAR_V3  |
> +--------------------+----------+--------------------------+--------------------------+------------+--------------+--+
> 4 rows selected (0.029 seconds)
> 5) Trigger minor compaction with ALTER TABLE:
> alter table customer1 compact 'minor';
> Expected Result: The table should be compacted successfully.
> Actual Result: 
> Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.; (state=,code=0)
> thriftserver logs:
> 18/01/10 16:17:12 ERROR CompactionResultSortProcessor: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Compaction failed: java.lang.Long cannot be cast to java.lang.Integer
> java.lang.ClassCastException: java.lang.Long cannot be cast to java.lang.Integer
> 	at org.apache.carbondata.processing.sort.sortdata.SortDataRows.writeDataToFile(SortDataRows.java:273)
> 	at org.apache.carbondata.processing.sort.sortdata.SortDataRows.startSorting(SortDataRows.java:214)
> 	at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:226)
> 	at org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:159)
> 	at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:234)
> 	at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:81)
> 	at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:99)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 18/01/10 16:17:12 INFO UnsafeMemoryManager: [Executor task launch worker-36][partitionID:customer1;queryID:15798380253871] Total memory used after task 15798371335347 is 5313 Current tasks running now are : [6856382704941, 14621295743743, 14461639534151, 4378916027096, 15798216567589]
> 18/01/10 16:17:12 INFO CarbonLoaderUtil: LocalFolderDeletionPool:customer1 Deleted the local store location: /tmp/15798371407704_0 : Time taken: 2
> 18/01/10 16:17:12 INFO Executor: Finished task 0.0 in stage 75.0 (TID 490). 1037 bytes result sent to driver
> 18/01/10 16:17:12 INFO TaskSetManager: Finished task 0.0 in stage 75.0 (TID 490) in 39 ms on localhost (executor driver) (1/1)
> 18/01/10 16:17:12 INFO TaskSchedulerImpl: Removed TaskSet 75.0, whose tasks have all completed, from pool 
> 18/01/10 16:17:12 INFO DAGScheduler: ResultStage 75 (collect at CarbonTableCompactor.scala:211) finished in 0.039 s
> 18/01/10 16:17:12 INFO DAGScheduler: Job 76 finished: collect at CarbonTableCompactor.scala:211, took 0.063051 s
> 18/01/10 16:17:12 AUDIT CarbonTableCompactor: [knoldus][hduser][Thread-125]Compaction request failed for table newcarbon.customer1
> 18/01/10 16:17:12 ERROR CarbonTableCompactor: pool-23-thread-7 Compaction request failed for table newcarbon.customer1
> 18/01/10 16:17:12 ERROR CarbonTableCompactor: pool-23-thread-7 Exception in compaction thread Compaction Failure in Merger Rdd.
> java.lang.Exception: Compaction Failure in Merger Rdd.
> 	at org.apache.carbondata.spark.rdd.CarbonTableCompactor.triggerCompaction(CarbonTableCompactor.scala:269)
> 	at org.apache.carbondata.spark.rdd.CarbonTableCompactor.scanSegmentsAndSubmitJob(CarbonTableCompactor.scala:120)
> 	at org.apache.carbondata.spark.rdd.CarbonTableCompactor.executeCompaction(CarbonTableCompactor.scala:71)
> 	at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$$anon$2.run(CarbonDataRDDFactory.scala:182)
> 	at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.startCompactionThreads(CarbonDataRDDFactory.scala:269)
> 	at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.alterTableForCompaction(CarbonAlterTableCompactionCommand.scala:255)
> 	at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:111)
> 	at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:71)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> 	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
> 	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
> 	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
> 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> 	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
> 	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 18/01/10 16:17:12 ERROR CarbonDataRDDFactory$: pool-23-thread-7 Exception in compaction thread Compaction Failure in Merger Rdd.
> 18/01/10 16:17:12 INFO HdfsFileLock: pool-23-thread-7 Deleted the lock file hdfs://localhost:54310/opt/prestocarbonStore/newcarbon/customer1/compaction.lock
> 18/01/10 16:17:12 ERROR CarbonAlterTableCompactionCommand: pool-23-thread-7 Exception in start compaction thread. Exception in compaction Compaction Failure in Merger Rdd.
> 18/01/10 16:17:12 ERROR HdfsFileLock: pool-23-thread-7 Not able to delete the lock file because it is not existed in location hdfs://localhost:54310/opt/prestocarbonStore/newcarbon/customer1/compaction.lock
> 18/01/10 16:17:12 ERROR SparkExecuteStatementOperation: Error executing query, currentState RUNNING, 
> org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.;
> 	at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
> 	at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:120)
> 	at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:71)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
> 	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> 	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
> 	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:87)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:87)
> 	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:185)
> 	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
> 	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:592)
> 	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:699)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:220)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> 18/01/10 16:17:12 ERROR SparkExecuteStatementOperation: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd.;
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:258)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:163)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:160)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> 	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:173)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
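> A plausible mechanism, inferred from the stack trace above: the default value of the newly added INT column seems to reach the compaction sort step boxed as a java.lang.Long, and SortDataRows.writeDataToFile then casts it to java.lang.Integer, which throws. A minimal, hypothetical Java sketch of this failure mode (the class and variable names are illustrative, not CarbonData's actual code):
>
>     // CastFailureSketch.java -- hypothetical reconstruction of the cast
>     // failure reported by SortDataRows.writeDataToFile; names are illustrative.
>     public class CastFailureSketch {
>       public static void main(String[] args) {
>         // The added column is declared INT, so the writer expects an Integer,
>         // but assume the default value '10' was materialized as a Long.
>         Object[] row = { Long.parseLong("10") };
>         // This cast throws java.lang.ClassCastException:
>         // java.lang.Long cannot be cast to java.lang.Integer
>         int value = (Integer) row[0];
>         System.out.println(value);
>       }
>     }
>
> If that is the cause, converting the default value to the column's declared type before handing rows to the sort step would avoid the cast.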



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)