Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/11/21 06:08:08 UTC

[GitHub] [iceberg] leedarHawk opened a new issue, #6235: There is no data in the table, when insert data using Hive on Tez.

leedarHawk opened a new issue, #6235:
URL: https://github.com/apache/iceberg/issues/6235

   ### Apache Iceberg version
   
   0.14.0
   
   ### Query engine
   
   Hive
   
   ### Please describe the bug 🐞
   
   When I insert data using Hive on Tez, the data is written to HDFS successfully, but a subsequent select returns no rows. If I switch the execution engine to MR, it works: the data is both stored and read back successfully.
   
   hive  version: 3.1.3
   tez: 0.10.2
   hadoop: Hadoop 3.1.1.3.1.5.0-152
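   
   The MR workaround mentioned above can be sketched as follows. This is a minimal sketch that reuses the jar path and table name from the repro below; it only switches the engine for the current session and is not a fix for the Tez behavior itself.
   
   ```sql
   -- Assumption: same session setup as the Tez repro, with only the engine changed.
   set hive.execution.engine=mr;
   add jar /tmp/iceberg-hive-runtime-0.14.1.jar;
   
   -- Same statements as in the failing Tez run.
   insert into test.x6 values ('a1', 'b1');
   select * from test.x6;
   ```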
   
   
   set hive.execution.engine=tez;
   add jar /tmp/iceberg-hive-runtime-0.14.1.jar;
   describe formatted test.x6;
   +-------------------------------+----------------------------------------------------+----------------------------------------------------+
   |           col_name            |                     data_type                      |                      comment                       |
   +-------------------------------+----------------------------------------------------+----------------------------------------------------+
   | # col_name                    | data_type                                          | comment                                            |
   | user_name                     | string                                             | from deserializer                                  |
   | area                          | string                                             | from deserializer                                  |
   |                               | NULL                                               | NULL                                               |
   | # Detailed Table Information  | NULL                                               | NULL                                               |
   | Database:                     | test                                               | NULL                                               |
   | OwnerType:                    | USER                                               | NULL                                               |
   | Owner:                        | hive                                               | NULL                                               |
   | CreateTime:                   | Mon Nov 21 13:31:53 CST 2022                       | NULL                                               |
   | LastAccessTime:               | UNKNOWN                                            | NULL                                               |
   | Retention:                    | 0                                                  | NULL                                               |
   | Location:                     | hdfs://hdp46:8020/user/hive/warehouse/test.db/x6   | NULL                                               |
   | Table Type:                   | MANAGED_TABLE                                      | NULL                                               |
   | Table Parameters:             | NULL                                               | NULL                                               |
   |                               | bucketing_version                                  | 2                                                  |
   |                               | current-schema                                     | {\"type\":\"struct\",\"schema-id\":0,\"fields\":[{\"id\":1,\"name\":\"user_name\",\"required\":false,\"type\":\"string\"},{\"id\":2,\"name\":\"area\",\"required\":false,\"type\":\"string\"}]} |
   |                               | engine.hive.enabled                                | true                                               |
   |                               | external.table.purge                               | TRUE                                               |
   |                               | metadata_location                                  | hdfs://hdp46:8020/user/hive/warehouse/test.db/x6/metadata/00000-362d4f5e-6575-4be7-aad1-a9c027d1ff43.metadata.json |
   |                               | numFiles                                           | 0                                                  |
   |                               | numRows                                            | 0                                                  |
   |                               | rawDataSize                                        | 0                                                  |
   |                               | snapshot-count                                     | 0                                                  |
   |                               | storage_handler                                    | org.apache.iceberg.mr.hive.HiveIcebergStorageHandler |
   |                               | table_type                                         | ICEBERG                                            |
   |                               | totalSize                                          | 0                                                  |
   |                               | transient_lastDdlTime                              | 1669008713                                         |
   |                               | uuid                                               | 5fe91beb-0cfc-4903-96e6-268b933eca73               |
   |                               | NULL                                               | NULL                                               |
   | # Storage Information         | NULL                                               | NULL                                               |
   | SerDe Library:                | org.apache.iceberg.mr.hive.HiveIcebergSerDe        | NULL                                               |
   | InputFormat:                  | org.apache.iceberg.mr.hive.HiveIcebergInputFormat  | NULL                                               |
   | OutputFormat:                 | org.apache.iceberg.mr.hive.HiveIcebergOutputFormat | NULL                                               |
   | Compressed:                   | No                                                 | NULL                                               |
   | Num Buckets:                  | 0                                                  | NULL                                               |
   | Bucket Columns:               | []                                                 | NULL                                               |
   | Sort Columns:                 | []                                                 | NULL                                               |
   +-------------------------------+----------------------------------------------------+----------------------------------------------------+
   =================================insert data===================
   insert into test.x6  values ('a1', 'b1');
   =================================insert data log start================
   INFO  : Compiling command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433): insert into test.x6  values ('a1', 'b1')
   INFO  : Concurrency mode is disabled, not creating a lock manager
   INFO  : Semantic Analysis Completed (retrial = false)
   INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col1, type:string, comment:null), FieldSchema(name:col2, type:string, comment:null)], properties:null)
   INFO  : Completed compiling command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433); Time taken: 0.782 seconds
   INFO  : Concurrency mode is disabled, not creating a lock manager
   INFO  : Executing command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433): insert into test.x6  values ('a1', 'b1')
   INFO  : Query ID = root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433
   INFO  : Total jobs = 1
   INFO  : Starting task [Stage-0:DDL] in serial mode
   INFO  : Starting task [Stage-1:DDL] in serial mode
   INFO  : Launching Job 1 out of 1
   INFO  : Starting task [Stage-2:MAPRED] in serial mode
   INFO  : Subscribed to counters: [] for queryId: root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433
   INFO  : Session is already open
   INFO  : Dag name: insert into test.x6  values ('a1', 'b1') (Stage-2)
   INFO  : Tez session was closed. Reopening...
   INFO  : Session re-established.
   INFO  : Session re-established.
   INFO  : Status: Running (Executing on YARN cluster with App id application_1668952240853_0007)
   
   INFO  : Starting task [Stage-4:DDL] in serial mode
   INFO  : Completed executing command(queryId=root_20221121133308_796f75d3-e891-4f39-8b48-ac2dcbb8d433); Time taken: 17.825 seconds
   INFO  : OK
   INFO  : Concurrency mode is disabled, not creating a lock manager
   INFO  : Compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
   INFO  : Concurrency mode is disabled, not creating a lock manager
   =================================insert data log end================
   =================================select data======================
   select * from test.x6
   =================================select data log start===================
   INFO  : Compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
   INFO  : Concurrency mode is disabled, not creating a lock manager
   INFO  : Semantic Analysis Completed (retrial = false)
   INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:x6.user_name, type:string, comment:null), FieldSchema(name:x6.area, type:string, comment:null)], properties:null)
   INFO  : Completed compiling command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3); Time taken: 0.441 seconds
   INFO  : Concurrency mode is disabled, not creating a lock manager
   INFO  : Executing command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3): select * from test.x6
   INFO  : Completed executing command(queryId=root_20221121133504_3b4e5da8-db8c-4494-b94d-98aa991a33f3); Time taken: 0.001 seconds
   INFO  : OK
   INFO  : Concurrency mode is disabled, not creating a lock manager
   +---------------+----------+
   | x6.user_name  | x6.area  |
   +---------------+----------+
   +---------------+----------+
   =================================select data log end===================
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] maswin commented on issue #6235: There is no data in the table, when insert data using Hive on Tez.

Posted by "maswin (via GitHub)" <gi...@apache.org>.
maswin commented on issue #6235:
URL: https://github.com/apache/iceberg/issues/6235#issuecomment-1583448330

   @leedarHawk & @Gboyka - Iceberg doesn't support the Tez execution engine for inserts with Hive 3.
   
   I have raised a PR to solve this issue - https://github.com/apache/iceberg/pull/7802
   




[GitHub] [iceberg] pvary commented on issue #6235: There is no data in the table, when insert data using Hive on Tez.

Posted by GitBox <gi...@apache.org>.
pvary commented on issue #6235:
URL: https://github.com/apache/iceberg/issues/6235#issuecomment-1369904833

   Maybe Hive 4.0.0-alpha-2 could help. The integration there is much better than in any other Hive version.




[GitHub] [iceberg] Gboyka commented on issue #6235: There is no data in the table, when insert data using Hive on Tez.

Posted by GitBox <gi...@apache.org>.
Gboyka commented on issue #6235:
URL: https://github.com/apache/iceberg/issues/6235#issuecomment-1367283307

   Did you find a root cause or resolution for this? I'm facing the same issue.

