Posted to user@hive.apache.org by Udit Mehta <um...@groupon.com> on 2016/06/13 23:54:57 UTC

Issue in Insert Overwrite directory operation

Hi All,

I see a weird issue when running an "INSERT OVERWRITE DIRECTORY"
operation. The query works when I limit the data set but fails with
the following exception when the data set is larger:

Failed with exception Unable to move source
hdfs://namenode/user/grp_admin/external_test1/output/.hive-staging_hive_2016-06-13_21-34-36_449_7074605
to destination /user/grp_admin/external_test1/output

I made sure the directory has enough space, so this is not a disk quota
issue.
Does anyone know what is happening here?

I am running Hive 1.2.1 on Tez. The query also fails with Hive on MR.

Run 1 with smaller data set:

    > insert overwrite directory '/user/grp_admin/external_test1/output' row format delimited fields terminated by '\t'
    > select * from test_table limit 1000;

Query ID = hive_20160613213624_d9d54ef0-0b28-4e98-b49e-197043f67c43

Total jobs = 3

Launching Job 1 out of 3

Status: Running (Executing on YARN cluster with App id application_1464825277140_26149)

--------------------------------------------------------------------------------

        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

--------------------------------------------------------------------------------

Map 1 ..........   SUCCEEDED     12         12        0        0       0       0

Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0

--------------------------------------------------------------------------------

VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 21.03 s

--------------------------------------------------------------------------------

Stage-4 is selected by condition resolver.

Stage-3 is filtered out by condition resolver.

Stage-5 is filtered out by condition resolver.

Moving data to: hdfs://namenode/user/grp_admin/external_test1/output/.hive-staging_hive_2016-06-13_21-36-24_620_4270199609063911787-1/-ext-10000

Moving data to: /user/grp_admin/external_test1/output

OK

Time taken: 21.501 seconds

Run 2 with larger data set:

    > insert overwrite directory '/user/grp_admin/external_test1/output' row format delimited fields terminated by '\t'
    > select * from test_table;

Query ID = hive_20160613213436_a1b0087a-84ff-48a0-ac76-25811aaafe28

Total jobs = 3

Launching Job 1 out of 3

Tez session was closed. Reopening...

Session re-established.

Status: Running (Executing on YARN cluster with App id application_1464825277140_26149)

--------------------------------------------------------------------------------

        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED

--------------------------------------------------------------------------------

Map 1 ..........   SUCCEEDED     12         12        0        0       0       0

--------------------------------------------------------------------------------

VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 72.69 s

--------------------------------------------------------------------------------

Stage-4 is selected by condition resolver.

Stage-3 is filtered out by condition resolver.

Stage-5 is filtered out by condition resolver.

Moving data to: hdfs://namenode/user/grp_admin/external_test1/output/.hive-staging_hive_2016-06-13_21-34-36_449_7074605303086037347-1/-ext-10000

Moving data to: /user/grp_admin/external_test1/output

Failed with exception Unable to move source hdfs://namenode/user/grp_admin/external_test1/output/.hive-staging_hive_2016-06-13_21-34-36_449_7074605303086037347-1/-ext-10000/000000_0 to destination /user/grp_admin/external_test1/output

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
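
One detail worth noting in the failing run: the source of the move is a staging directory that sits inside the destination directory. The short Python sketch below only illustrates how one might pull the two paths out of the error message and confirm that nesting. The helper names (strip_scheme, parse_move_failure) are hypothetical and not part of Hive, and the check by itself says nothing about the root cause.

```python
import re

# Console lines copied from the failing run in this thread.
LOG = """\
Failed with exception Unable to move source hdfs://namenode/user/grp_admin/external_test1/output/.hive-staging_hive_2016-06-13_21-34-36_449_7074605303086037347-1/-ext-10000/000000_0 to destination /user/grp_admin/external_test1/output
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
"""

# Matches the wording of the MoveTask error shown above.
MOVE_ERROR = re.compile(r"Unable to move source (\S+) to destination (\S+)")


def strip_scheme(path):
    """Drop a leading scheme://authority so paths can be compared."""
    return re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]*", "", path)


def parse_move_failure(log):
    """Return (source, destination, source_is_under_destination) or None."""
    match = MOVE_ERROR.search(log)
    if match is None:
        return None
    source, destination = match.groups()
    dest_path = strip_scheme(destination).rstrip("/")
    # True when the source path lives below the destination directory.
    nested = strip_scheme(source).startswith(dest_path + "/")
    return source, destination, nested


if __name__ == "__main__":
    print(parse_move_failure(LOG))
```

For the log in this post the third element of the result is True, i.e. the file being moved already lives under the directory that the statement overwrites.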

RE: Issue in Insert Overwrite directory operation

Posted by "Markovitz, Dudu" <dm...@paypal.com>.
This looks like a known bug that was fixed in version 1.3:

https://issues.apache.org/jira/browse/HIVE-12364

Dudu
