Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2020/08/13 02:50:23 UTC

[GitHub] [iceberg] duanmeng opened a new pull request #1329: Support read data from iceberg table created by SparkSQL using hive

duanmeng opened a new pull request #1329:
URL: https://github.com/apache/iceberg/pull/1329


   This PR implements:
   
   - Set the default InputFormat/OutputFormat/SerDe (HiveIcebergInputFormat, HiveIcebergOutputFormat, HiveIcebergSerDe) when an Iceberg table is created using SparkSQL.
   - Add a Hive runtime so that data can be read from the Hive external table once the jar is added in Hive.
   
   A Hive external table is created when an Iceberg table is created in SparkSQL, but its default InputFormat/OutputFormat/SerDe is FileInputFormat/FileOutputFormat/LazySimpleSerDe, so we can't read data from this external table. With this PR we can read data from the external table in Hive (add the jar and set mapreduce.input.fileinputformat.input.dir.recursive=true).
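
The Hive-side setup described above (add the jar, then enable recursive input directories) could be sketched roughly as follows. This is only an illustration: the helper name `hive_session_setup` and the jar path are hypothetical placeholders and are not taken from this PR. Only the property name comes from the description.

```python
from typing import List

# Property named in the PR description; it makes Hive read nested input directories.
RECURSIVE_INPUT_PROPERTY = "mapreduce.input.fileinputformat.input.dir.recursive"


def hive_session_setup(runtime_jar_path: str) -> List[str]:
    """Build the statements to run in a Hive session before querying the table.

    runtime_jar_path is a placeholder for wherever the Hive runtime jar lives;
    the PR description does not give an actual path or jar name.
    """
    return [
        f"ADD JAR {runtime_jar_path};",
        f"SET {RECURSIVE_INPUT_PROPERTY}=true;",
    ]


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    for statement in hive_session_setup("/path/to/hive-runtime.jar"):
        print(statement)
```

The generated statements would be pasted into (or sent to) a Hive session before running a SELECT against the external table.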


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org

[GitHub] [iceberg] rdblue commented on pull request #1329: Support read data from iceberg table created by SparkSQL using hive

Posted by GitBox <gi...@apache.org>.

rdblue commented on pull request #1329:
URL: https://github.com/apache/iceberg/pull/1329#issuecomment-704407528


   @duanmeng, sorry I didn't see this PR sooner. I think you would probably be interested in other PRs that have recently implemented functionality that is similar:
   
   * #1267 added an iceberg-hive-runtime Jar with Iceberg dependencies bundled in
   * #1505 makes tables readable in Hive when created in Spark, if the table or environment has a flag set
   
   Please have a look at those and see if they do everything you need.
   
   I'll close this PR since it overlaps the others, but feel free to open a new one if you have anything else to fix. Ping me and I'll take a look at future PRs.



[GitHub] [iceberg] duanmeng commented on pull request #1329: Support read data from iceberg table created by SparkSQL using hive

Posted by GitBox <gi...@apache.org>.

duanmeng commented on pull request #1329:
URL: https://github.com/apache/iceberg/pull/1329#issuecomment-705989368


   @rdblue Got it, thanks. Those two PRs do exactly what I intended to do in this PR.



[GitHub] [iceberg] rdblue closed pull request #1329: Support read data from iceberg table created by SparkSQL using hive

Posted by GitBox <gi...@apache.org>.

rdblue closed pull request #1329:
URL: https://github.com/apache/iceberg/pull/1329


   

