Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/10/29 05:44:09 UTC

[GitHub] [iceberg] paulzanmei opened a new issue #1684: Iceberg table hive and Flink cannot read or write to each other?

paulzanmei opened a new issue #1684:
URL: https://github.com/apache/iceberg/issues/1684


   I created a table in Flink and wrote data into it, but when I read the table from Hive it was empty. I also created a table in Hive, but I couldn't write to it from Flink.
   
   The metadata is stored in a HiveCatalog.
   
   Is there any solution?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] paulzanmei commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
paulzanmei commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-720259098








[GitHub] [iceberg] junsionzhang removed a comment on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
junsionzhang removed a comment on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-807987442


   > ```sql
   > CREATE EXTERNAL TABLE sample_mirror2
   > STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
   > LOCATION 'file:///Users/openinx/test/hive-warehouse1/default/sample';
   > ```
   
   I have the same question now.




[GitHub] [iceberg] junsionzhang removed a comment on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
junsionzhang removed a comment on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-808012631


   > Hi @paulzanmei
   > 
   > I tried your case on my local host. I was not able to query the Hive catalog table (created in Flink SQL) from Hive SQL, but the same thing worked for a Hadoop table. The following steps work:
   > 
   > Step#1: Create a Hadoop catalog/database/table in Flink SQL and then write one record:
   > 
   > ```shell
   > Flink SQL> CREATE CATALOG hadoop_catalog WITH (
   > >   'type'='iceberg',
   > >   'catalog-type'='hadoop',
   > >   'clients'='5',
   > >   'warehouse'='file:///Users/openinx/test/hive-warehouse1'
   > > );
   > [INFO] Catalog has been created.
   > 
   > Flink SQL> show databases;
   > default_database
   > 
   > Flink SQL> show tables; 
   > [INFO] Result was empty.
   > 
   > Flink SQL> use catalog hadoop_catalog; 
   > 
   > Flink SQL> show tables;
   > sample
   > 
   > Flink SQL> insert into sample select 1, 'a';
   > [INFO] Submitting SQL update statement to the cluster...
   > [INFO] Table update statement has been successfully submitted to the cluster:
   > Job ID: 4b4b8c2b1a8d6f876a56f55abe4de1fe
   > ```
   > 
   > Step#2: Create an external hive table in hive shell:
   > 
   > ```sql
   > CREATE EXTERNAL TABLE sample_mirror2
   > STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
   > LOCATION 'file:///Users/openinx/test/hive-warehouse1/default/sample';
   > ```
   > 
   > And show tables:
   > 
   > ```
   > show tables; 
   > 
   > +-----------------+
   > |    tab_name     |
   > +-----------------+
   > | sample_mirror2  |
   > +-----------------+
   > 1 rows selected (0.056 seconds)
   > ```
   > 
   > Step#3: Query the records in hive shell:
   > 
   > ```sql
   > select * from sample_mirror2;
   > 
   > +--------------------+----------------------+
   > | sample_mirror2.id  | sample_mirror2.data  |
   > +--------------------+----------------------+
   > | 1                  | a                    |
   > +--------------------+----------------------+
   > 1 row selected (0.86 seconds)
   > ```
   > 
   > I will figure out why it does not work for the Hive catalog table. @paulzanmei FYI
   
   Hi, have you figured out the reason yet? @JingsongLi said 'the Hive reader only supports reading an existing Hadoop table...', but I think the link (https://iceberg.apache.org/hive/) does not say that clearly.




[GitHub] [iceberg] junsionzhang commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
junsionzhang commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-807987442


   > ```sql
   > CREATE EXTERNAL TABLE sample_mirror2
   > STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
   > LOCATION 'file:///Users/openinx/test/hive-warehouse1/default/sample';
   > ```
   
   I have the same question now.




[GitHub] [iceberg] openinx commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-720277285








[GitHub] [iceberg] junsionzhang commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
junsionzhang commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-808012608








[GitHub] [iceberg] RussellSpitzer commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-718882014


   Could you explain step by step what you did and what you expected to happen? For example, how you created the table, how you attempted to read from it, and which tools you were using?




[GitHub] [iceberg] paulzanmei closed issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
paulzanmei closed issue #1684:
URL: https://github.com/apache/iceberg/issues/1684


   




[GitHub] [iceberg] paulzanmei commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
paulzanmei commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-719195944


   @RussellSpitzer   @openinx  
   My goal is to create an Iceberg table with Hive, write data to it with Flink, and then query and analyze the data with Hive and Presto.
   
   
   Operation steps:
   
   Hive:
   1. add jar /path/to/iceberg-hive-runtime.jar
   2. CREATE EXTERNAL TABLE customers
      STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
      TBLPROPERTIES ('iceberg.mr.table.schema'='{"type":"struct","fields":[{"id":1,"name":"customer_id","required":true,"type":"long"},{"id":2,"name":"first_name","required":true,"type":"string"}]}');
   
   Flink:
   3. tableEnv.executeSql("CREATE CATALOG iceberg_catalog WITH (\n" +
          "  'type'='iceberg',\n" +
          "  'catalog-type'='hive',\n" +
          "  'uri'='thrift://dcwork03:9083',\n" +
          "  'default-database'='test',\n" +
          "  'clients'='5',\n" +
          //"  'warehouse'='/user/hive/warehouse',\n" +
          "  'hive-conf-dir'='/Users/zhongbaoluo/Applications/app/apache-hive-3.1.2/conf/',\n" +
          "  'property-version'='1'\n" +
          ")");
   4. tableEnv.executeSql("INSERT INTO iceberg_catalog.test.customers SELECT `customer_id`,first_name from datagen").print();
   5. tableEnv.executeSql("SELECT `customer_id`,first_name from iceberg_catalog.test.customers").print();
   
   Hive:
   6. select * from customers;
   
   
   Problems:
   
   1. The data files written by Flink do exist under iceberg/data, but the Flink job does not stop, and the subsequent SELECT query returns no data.
   ![image](https://user-images.githubusercontent.com/20592867/97662661-974f7580-1ab2-11eb-8bde-96c2c7e4b96b.png)
   
   2. The Hive query also returns an empty table.
   
   ![image](https://user-images.githubusercontent.com/20592867/97662819-0200b100-1ab3-11eb-9b0b-18639b25bc90.png)
   
   
   
   




[GitHub] [iceberg] openinx commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
openinx commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-719124474


   Yeah, I think we need more context (such as how you tested those cases, step by step) to help find out what's wrong. Thanks.




[GitHub] [iceberg] paulzanmei commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
paulzanmei commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-721582518


   Hi @openinx @JingsongLi, thank you for your help. The problem has been solved.
   
   Checkpointing needs to be configured for the data to be written successfully.
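   
   For readers who hit the same symptom, here is a minimal sketch of what enabling checkpointing could look like in a Java job. It is not the exact code used in this issue: the class name and the 10-second interval are assumptions chosen for illustration. The idea is that a streaming write to Iceberg from Flink commits data files on checkpoint completion, so without checkpoints the files may exist on disk while the table still appears empty to readers.
   
   ```java
   import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
   import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
   
   // Hypothetical class name, for illustration only.
   public class EnableCheckpointingSketch {
       public static void main(String[] args) {
           StreamExecutionEnvironment env =
               StreamExecutionEnvironment.getExecutionEnvironment();
   
           // Trigger a checkpoint every 10 seconds (assumed interval; tune as needed).
           // Written data files are committed to the table when a checkpoint completes.
           env.enableCheckpointing(10_000L);
   
           // Build the table environment on top of the checkpoint-enabled stream environment.
           StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
   
           // From here, create the catalog and run the INSERT with tableEnv.executeSql(...)
           // as in the steps reported earlier in this thread.
       }
   }
   ```
   
   With a setup like this, rows should become visible to readers only after the first checkpoint completes, so a short delay between the insert and a successful query is expected.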




[GitHub] [iceberg] JingsongLi commented on issue #1684: Iceberg table hive and Flink cannot read or write to each other?

Posted by GitBox <gi...@apache.org>.
JingsongLi commented on issue #1684:
URL: https://github.com/apache/iceberg/issues/1684#issuecomment-720966040


   According to https://github.com/apache/iceberg/blob/master/site/docs/hive.md,
   the Hive reader only supports reading an existing Hadoop table...

