Posted to issues@tez.apache.org by "Siddharth Seth (JIRA)" <ji...@apache.org> on 2017/02/03 18:58:51 UTC

[jira] [Commented] (TEZ-3603) Hive 2.1.1 on Tez engine with Sqoop - Error while Run sqoop import as parquet files

    [ https://issues.apache.org/jira/browse/TEZ-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15851945#comment-15851945 ] 

Siddharth Seth commented on TEZ-3603:
-------------------------------------

[~artemvel] - please send this to the user mailing list instead of creating a jira. There are more people watching, and you're more likely to get a response there.

> Hive 2.1.1 on Tez engine with Sqoop - Error while Run sqoop import as parquet files
> -----------------------------------------------------------------------------------
>
>                 Key: TEZ-3603
>                 URL: https://issues.apache.org/jira/browse/TEZ-3603
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.4
>         Environment: centos 7.2
> Sqoop 1.4.6
> Hive 2.1.1
> hadoop 2.7.0
>            Reporter: Artem Velykorodnyi
>
> I tried to import some data into Hive as Parquet through Sqoop using this command:
> {noformat}
> sqoop import --connect jdbc:mysql://node1:3306/sqoop --username root --password 123456 --table devidents --hive-import --hive-table galinqewra --create-hive-table -m 1 --as-parquetfile
> {noformat}
> In mapred-site.xml I set mapreduce.framework.name to yarn-tez,
> and in hive-site.xml I set hive.execution.engine to tez.
> The import fails with this exception:
> {noformat}
> 17/02/03 01:07:45 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1486051443218_0001, dagName=codegen_devidents.jar
> 17/02/03 01:07:46 INFO impl.YarnClientImpl: Submitted application application_1486051443218_0001
> 17/02/03 01:07:46 INFO client.TezClient: The url to track the Tez AM: http://node1:8088/proxy/application_1486051443218_0001/
> 17/02/03 01:07:59 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1486051443218_0001/
> 17/02/03 01:07:59 INFO mapreduce.Job: Running job: job_1486051443218_0001
> 17/02/03 01:08:00 INFO mapreduce.Job: Job job_1486051443218_0001 running in uber mode : false
> 17/02/03 01:08:00 INFO mapreduce.Job:  map 0% reduce 0%
> 17/02/03 01:08:27 INFO mapreduce.Job: Job job_1486051443218_0001 failed with state FAILED due to: Vertex failed, vertexName=initialmap, vertexId=vertex_1486051443218_0001_1_00, diagnostics=[Task failed, taskId=task_1486051443218_0001_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1486051443218_0001_1_00_000000_0:org.kitesdk.data.DatasetNotFoundException: Descriptor location does not exist: hdfs:/tmp/default/.temp/job_14860514432180_0001/mr/job_14860514432180_0001/.metadata
> 	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.checkExists(FileSystemMetadataProvider.java:562)
> 	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.find(FileSystemMetadataProvider.java:605)
> 	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.load(FileSystemMetadataProvider.java:114)
> 	at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.load(FileSystemDatasetRepository.java:197)
> 	at org.kitesdk.data.spi.AbstractDatasetRepository.load(AbstractDatasetRepository.java:40)
> 	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadJobDataset(DatasetKeyOutputFormat.java:591)
> 	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadOrCreateTaskAttemptDataset(DatasetKeyOutputFormat.java:602)
> 	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.loadOrCreateTaskAttemptView(DatasetKeyOutputFormat.java:615)
> 	at org.kitesdk.data.mapreduce.DatasetKeyOutputFormat.getRecordWriter(DatasetKeyOutputFormat.java:448)
> 	at org.apache.tez.mapreduce.output.MROutput.initialize(MROutput.java:399)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable._callInternal(LogicalIOProcessorRuntimeTask.java:533)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:516)
> 	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask$InitializeOutputCallable.callInternal(LogicalIOProcessorRuntimeTask.java:501)
> 	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1486051443218_0001_1_00 [initialmap] killed/failed due to:OWN_TASK_FAILURE]. DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> 17/02/03 01:08:27 INFO mapreduce.Job: Counters: 0
> 17/02/03 01:08:27 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
> 17/02/03 01:08:27 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 63.4853 seconds (0 bytes/sec)
> 17/02/03 01:08:27 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
> 17/02/03 01:08:27 INFO mapreduce.ImportJobBase: Retrieved 0 records.
> 17/02/03 01:08:27 ERROR tool.ImportTool: Error during import: Import job failed!
> {noformat}
> The Hive table is created, but there is no data in it.
> If I run the job in MapReduce mode, it completes successfully.
> It also passes if I run it without '--as-parquetfile'.
> Any suggestions?
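
For reference, the two settings described in the report would typically look like the fragments below. This is only a sketch that assumes the standard Hadoop <property> layout; the reporter's actual configuration files are not included in the issue, so any surrounding properties are unknown.
{noformat}
<!-- mapred-site.xml (assumed layout): route MapReduce jobs through Tez -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn-tez</value>
</property>

<!-- hive-site.xml (assumed layout): run Hive queries on Tez -->
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
{noformat}
According to the report, the import succeeds in MapReduce mode, so the first property is presumably the one that determines whether the Sqoop job itself is submitted through Tez.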



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)