Posted to dev@hive.apache.org by "Kristam Subba Swathi (JIRA)" <ji...@apache.org> on 2012/05/17 16:21:12 UTC

[jira] [Created] (HIVE-3033) Loading data from a file in hdfs to hive table is failing if we try to load the same file into the same table second time

Kristam Subba Swathi created HIVE-3033:
------------------------------------------

             Summary: Loading data from a file in hdfs to hive table is failing if we try to load the same file into the same table second time
                 Key: HIVE-3033
                 URL: https://issues.apache.org/jira/browse/HIVE-3033
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 0.8.1, 0.9.0, 0.9.1
            Reporter: Kristam Subba Swathi


Steps to reproduce
---------------------
1) Create a table in Hive:
create table emp(IP STRING,showtime double) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040'
2) Load data into the table twice:
LOAD DATA INPATH '/hive/input/data2.txt' OVERWRITE INTO TABLE emp
LOAD DATA INPATH '/hive/input/data2.txt' OVERWRITE INTO TABLE emp
The second load of the same file into the same table fails:
{noformat}
2012-05-11 19:28:54,415 DEBUG metadata.Hive (Hive.java:checkPaths(1937)) - Successfully renamed hdfs://10.18.40.25:54310/HiveNFT_testLoadDataShouldOverWriteIfSameFileAlreadyExistsInTableByGivingTheRooTPath/data2.txt to hdfs://10.18.40.25:54310/HiveNFT_testLoadDataShouldOverWriteIfSameFileAlreadyExistsInTableByGivingTheRooTPath/data2_copy_3.txt
2012-05-11 19:28:54,416 DEBUG ipc.Client (Client.java:sendParam(786)) - IPC Client (32955489) connection to HOST-10-18-40-25/10.18.40.25:54310 from root sending #5749
2012-05-11 19:28:54,416 DEBUG ipc.Client (Client.java:receiveResponse(821)) - IPC Client (32955489) connection to HOST-10-18-40-25/10.18.40.25:54310 from root got value #5749
2012-05-11 19:28:54,417 DEBUG ipc.RPC (WritableRpcEngine.java:invoke(197)) - Call: getFileInfo 2
2012-05-11 19:28:54,417 DEBUG ipc.Client (Client.java:sendParam(786)) - IPC Client (32955489) connection to HOST-10-18-40-25/10.18.40.25:54310 from root sending #5750
2012-05-11 19:28:54,419 DEBUG ipc.Client (Client.java:receiveResponse(821)) - IPC Client (32955489) connection to HOST-10-18-40-25/10.18.40.25:54310 from root got value #5750
2012-05-11 19:28:54,419 DEBUG ipc.RPC (WritableRpcEngine.java:invoke(197)) - Call: getListing 2
2012-05-11 19:28:54,420 ERROR exec.Task (SessionState.java:printError(380)) - Failed with exception copyFiles: error while moving files!!!
org.apache.hadoop.hive.ql.metadata.HiveException: copyFiles: error while moving files!!!
	at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1989)
	at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:547)
	at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1283)
	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:234)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
	at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
	at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:629)
	at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:617)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.FileNotFoundException: File hdfs://10.18.40.25:54310/HiveNFT_testLoadDataShouldOverWriteIfSameFileAlreadyExistsInTableByGivingTheRooTPath/data2.txt does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:353)
	at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1979)
	... 17 more

{noformat}
 



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-3033) Loading data from a file in hdfs to hive table is failing if we try to load the same file into the same table second time

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HIVE-3033:
-----------------------------------

    Attachment: HIVE-3033.patch
    


[jira] [Assigned] (HIVE-3033) Loading data from a file in hdfs to hive table is failing if we try to load the same file into the same table second time

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam reassigned HIVE-3033:
--------------------------------------

    Assignee: Chinna Rao Lalam
    


[jira] [Commented] (HIVE-3033) Loading data from a file in hdfs to hive table is failing if we try to load the same file into the same table second time

Posted by "Chinna Rao Lalam (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277851#comment-13277851 ] 

Chinna Rao Lalam commented on HIVE-3033:
----------------------------------------

While loading a file, if a file with that name already exists, Hive renames the incoming file; but after the rename the code still refers to the file by its old name. It should refer to the new file name.

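The rename-then-stale-reference pattern described in the comment can be sketched with plain java.nio.file. This is a simplified, hypothetical illustration, not the actual Hive checkPaths/copyFiles code; the class and file names are made up:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RenameThenStalePath {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("load");
        // The incoming file, e.g. data2.txt
        Path src = Files.createFile(dir.resolve("data2.txt"));

        // checkPaths()-style collision handling: rename to a _copy_N name
        Path renamed = dir.resolve("data2_copy_1.txt");
        Files.move(src, renamed);

        // Buggy pattern: keep using the pre-rename path -> "does not exist",
        // analogous to the FileNotFoundException in the stack trace above
        System.out.println(Files.exists(src));      // false
        // Fixed pattern: carry the post-rename path forward
        System.out.println(Files.exists(renamed));  // true
    }
}
```

The attached HIVE-3033.patch presumably makes copyFiles operate on the post-rename path; the sketch only demonstrates the failure mode in isolation.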
                
