Posted to commits@hudi.apache.org by "codope (via GitHub)" <gi...@apache.org> on 2023/04/28 12:36:20 UTC

[GitHub] [hudi] codope commented on pull request #7355: [HUDI-5308] Hive query returns null when the where clause has a partition field

codope commented on PR #7355:
URL: https://github.com/apache/hudi/pull/7355#issuecomment-1527500827

   @xicm I have not come across this issue before. I have queried MOR partitioned tables with Hive and Presto using partition predicates, and they return the correct results for me. I even ran the steps that you shared - https://github.com/apache/hudi/pull/7355#issuecomment-1519710563 - but I could not reproduce the issue. Am I missing something?
   ```
   0: jdbc:hive2://hiveserver:10000> select * from test_partition_rt where part = '2021-12-09';
   +----------------------------------------+-----------------------------------------+---------------------------------------+-------------------------------------------+---------------------------------------------------------------------------+-----------------------+-------------------------+-----------------------+-------------------------+--+
   | test_partition_rt._hoodie_commit_time  | test_partition_rt._hoodie_commit_seqno  | test_partition_rt._hoodie_record_key  | test_partition_rt._hoodie_partition_path  |                    test_partition_rt._hoodie_file_name                    | test_partition_rt.id  | test_partition_rt.name  | test_partition_rt.ts  | test_partition_rt.part  |
   +----------------------------------------+-----------------------------------------+---------------------------------------+-------------------------------------------+---------------------------------------------------------------------------+-----------------------+-------------------------+-----------------------+-------------------------+--+
   | 20230428122308910                      | 20230428122308910_0_0                   | 1                                     | part=2021-12-09                           | 27a098b4-0db6-4852-a931-c26831407ccb-0_0-14-10_20230428122308910.parquet  | 1                     | a1                      | 1000                  | 2021-12-09              |
   +----------------------------------------+-----------------------------------------+---------------------------------------+-------------------------------------------+---------------------------------------------------------------------------+-----------------------+-------------------------+-----------------------+-------------------------+--+
   1 row selected (1.964 seconds)
   ```
   
   I think querying by partition is the most common thing to do, so if such a query returned null, the issue would have surfaced in earlier versions too. What version of Hudi are you using? Can you also share how you connect to Hive and any configs that are being set, e.g. `hive.input.format`? For reference, I simply used beeline as:
   ```
   beeline -u jdbc:hive2://hiveserver:10000 \
     --hiveconf hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat \
     --hiveconf hive.stats.autogather=false
   ```
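
   To compare setups, something like the following could be run in the same beeline session. This is only a sketch: it assumes the table name `test_partition_rt` from the output above, and the exact output will depend on your Hive version and environment.
   ```
   -- Print the effective value of the setting in this session
   set hive.input.format;

   -- List the partitions registered in the metastore for the table
   show partitions test_partition_rt;

   -- Show table metadata, including the InputFormat the table was synced with
   describe formatted test_partition_rt;
   ```
   If `part=2021-12-09` is missing from the partition list, or the input format differs from what is expected for a realtime table, that would be a useful detail to share.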


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org