Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/24 01:48:52 UTC
[GitHub] [hudi] BalaMahesh opened a new issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
BalaMahesh opened a new issue #2203:
URL: https://github.com/apache/hudi/issues/2203
**Describe the problem you faced**
Hive queries that filter on the partition column in the where condition return no results for some partitions of a Hudi table. I have verified that the partitions exist by using show partitions, desc formatted, etc.
I can also see the .hoodie_partition_metadata file and the parquet file in the table partition directory. Using parquet-tools, I ran cat on the parquet file, and it contains exactly one ingested event.
select count(*), dt from _ro group by dt; : this query returns a count of 1 for that partition (y)
select * from _ro where id=x; (x is in partition y)
but when I run
select * from _ro where dt="y"; it returns an empty result, while for other dt values it returns results.
I am not sure where exactly the issue is: whether it is because the file is small and has only one record, or whether Hive is misbehaving. I have looked at the query logs, and they show numFiles = 1, numSplits = 1.
**To Reproduce**
Steps to reproduce the behavior:
1. Ingest records using HoodieDeltaStreamer from a JsonKafka source.
2. Partition the data on a date field in yyyy-MM-dd format.
3. Query the _ro table.
**Expected behavior**
It should return the single row
**Environment Description**
* Hudi version : 0.6.1
* Spark version : 2.4.7
* Hive version : 1.2
* Hadoop version : 2.7.1
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : No
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] BalaMahesh commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-716358283
1)
desc formatted udfinfo_ro;
OK
# col_name data_type comment
_hoodie_commit_time string
_hoodie_commit_seqno string
_hoodie_record_key string
_hoodie_partition_path string
_hoodie_file_name string
id bigint
created_at string
updated_at string
tenant_id int
udf1 string
udf2 string
udf3 string
udf4 string
udf5 string
isdeleted int
# Partition Information
# col_name data_type comment
dt string
# Detailed Table Information
Database: hudi
Owner: null
CreateTime: Wed Oct 21 12:53:53 IST 2020
LastAccessTime: UNKNOWN
Protect Mode: None
Retention: 0
Location: s3a://xxxx/test/hudi/data/platform/udfinfo
Table Type: EXTERNAL_TABLE
Table Parameters:
EXTERNAL TRUE
last_commit_time_sync 20201025012452
transient_lastDdlTime 1603265033
# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hudi.hadoop.HoodieParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
serialization.format 1
2)
hive> describe formatted udfinfo_ro partition(dt="2020-04-04");
OK
# col_name data_type comment
_hoodie_commit_time string
_hoodie_commit_seqno string
_hoodie_record_key string
_hoodie_partition_path string
_hoodie_file_name string
id bigint
created_at string
updated_at string
tenant_id int
udf1 string
udf2 string
udf3 string
udf4 string
udf5 string
isdeleted int
# Partition Information
# col_name data_type comment
dt string
# Detailed Partition Information
Partition Value: [2020-04-04]
Database: hudi
Table: udfinfo_ro
CreateTime: Sat Oct 24 22:01:25 IST 2020
LastAccessTime: UNKNOWN
Protect Mode: None
Location: s3a://xxxx/test/hudi/data/platform/udfinfo/dt=2020-04-04
Partition Parameters:
COLUMN_STATS_ACCURATE false
numFiles 1
numRows -1
rawDataSize -1
totalSize 442191
transient_lastDdlTime 1603557085
# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
InputFormat: org.apache.hudi.hadoop.HoodieParquetInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
Compressed: No
Num Buckets: -1
Bucket Columns: []
Sort Columns: []
Storage Desc Params:
serialization.format 1
3)
s3a://xxxx/test/hudi/data/platform/udfinfo/
4)
s3a://xxxx/test/hudi/data/platform/udfinfo/dt=2020-04-04
5)
dfs -ls s3a://xxx/test/hudi/data/platform/udfinfo/dt=2020-04-04;
Found 2 items
-rw-rw-rw- 1 93 2020-10-24 21:58 s3a://xxx/test/hudi/data/platform/udfinfo/dt=2020-04-04/.hoodie_partition_metadata
-rw-rw-rw- 1 442191 2020-10-24 21:58 s3a://xxx/test/hudi/data/platform/udfinfo/dt=2020-04-04/472eea6d-b9d2-40b0-97b9-cf74e839faba-0_278-355-147155_20201024215013.parquet
6)
drwxrwxrwx - 0 1970-01-01 05:30 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/.aux
drwxrwxrwx - 0 1970-01-01 05:30 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/.temp
-rw-rw-rw- 1 1483 2020-10-22 22:42 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022224048.clean
-rw-rw-rw- 1 1464 2020-10-22 22:42 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022224048.clean.inflight
-rw-rw-rw- 1 1464 2020-10-22 22:42 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022224048.clean.requested
-rw-rw-rw- 1 1483 2020-10-22 22:53 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022225151.clean
-rw-rw-rw- 1 1464 2020-10-22 22:53 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022225151.clean.inflight
-rw-rw-rw- 1 1464 2020-10-22 22:53 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022225151.clean.requested
-rw-rw-rw- 1 1483 2020-10-22 22:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022225815.clean
-rw-rw-rw- 1 1464 2020-10-22 22:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022225815.clean.inflight
-rw-rw-rw- 1 1464 2020-10-22 22:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201022225815.clean.requested
-rw-rw-rw- 1 1483 2020-10-23 10:44 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201023104241.clean
-rw-rw-rw- 1 1464 2020-10-23 10:44 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201023104241.clean.inflight
-rw-rw-rw- 1 1464 2020-10-23 10:44 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201023104241.clean.requested
-rw-rw-rw- 1 1540 2020-10-23 15:26 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201023152442.clean
-rw-rw-rw- 1 1479 2020-10-23 15:26 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201023152442.clean.inflight
-rw-rw-rw- 1 1479 2020-10-23 15:26 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201023152442.clean.requested
-rw-rw-rw- 1 1540 2020-10-24 06:11 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024061034.clean
-rw-rw-rw- 1 1479 2020-10-24 06:11 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024061034.clean.inflight
-rw-rw-rw- 1 1479 2020-10-24 06:11 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024061034.clean.requested
-rw-rw-rw- 1 1540 2020-10-24 06:14 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024061308.clean
-rw-rw-rw- 1 1479 2020-10-24 06:14 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024061308.clean.inflight
-rw-rw-rw- 1 1479 2020-10-24 06:14 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024061308.clean.requested
-rw-rw-rw- 1 1540 2020-10-24 19:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024195819.clean
-rw-rw-rw- 1 1479 2020-10-24 19:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024195819.clean.inflight
-rw-rw-rw- 1 1479 2020-10-24 19:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024195819.clean.requested
-rw-rw-rw- 1 1291 2020-10-24 20:11 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024201122.rollback
-rw-rw-rw- 1 0 2020-10-24 20:11 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024201122.rollback.inflight
-rw-rw-rw- 1 16590 2020-10-24 20:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024201124.clean
-rw-rw-rw- 1 5440 2020-10-24 20:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024201124.clean.inflight
-rw-rw-rw- 1 5440 2020-10-24 20:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024201124.clean.requested
-rw-rw-rw- 1 24036 2020-10-24 20:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024202421.clean
-rw-rw-rw- 1 7521 2020-10-24 20:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024202421.clean.inflight
-rw-rw-rw- 1 7521 2020-10-24 20:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024202421.clean.requested
-rw-rw-rw- 1 247163 2020-10-24 20:33 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024202421.deltacommit
-rw-rw-rw- 1 188642 2020-10-24 20:26 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024202421.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 20:24 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024202421.deltacommit.requested
-rw-rw-rw- 1 8860 2020-10-24 20:38 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203445.commit
-rw-rw-rw- 1 0 2020-10-24 20:36 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203445.compaction.inflight
-rw-rw-rw- 1 5801 2020-10-24 20:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203445.compaction.requested
-rw-rw-rw- 1 43009 2020-10-24 20:47 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203654.clean
-rw-rw-rw- 1 23899 2020-10-24 20:47 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203654.clean.inflight
-rw-rw-rw- 1 23899 2020-10-24 20:47 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203654.clean.requested
-rw-rw-rw- 1 245876 2020-10-24 20:45 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203654.deltacommit
-rw-rw-rw- 1 206099 2020-10-24 20:39 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203654.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 20:36 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024203654.deltacommit.requested
-rw-rw-rw- 1 52387 2020-10-24 20:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024204818.clean
-rw-rw-rw- 1 31371 2020-10-24 20:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024204818.clean.inflight
-rw-rw-rw- 1 31371 2020-10-24 20:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024204818.clean.requested
-rw-rw-rw- 1 246364 2020-10-24 20:56 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024204818.deltacommit
-rw-rw-rw- 1 213359 2020-10-24 20:50 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024204818.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 20:48 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024204818.deltacommit.requested
-rw-rw-rw- 1 55001 2020-10-24 21:09 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024205856.clean
-rw-rw-rw- 1 31605 2020-10-24 21:09 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024205856.clean.inflight
-rw-rw-rw- 1 31605 2020-10-24 21:09 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024205856.clean.requested
-rw-rw-rw- 1 267488 2020-10-24 21:07 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024205856.deltacommit
-rw-rw-rw- 1 249627 2020-10-24 21:01 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024205856.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 20:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024205856.deltacommit.requested
-rw-rw-rw- 1 57377 2020-10-24 21:20 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024211018.clean
-rw-rw-rw- 1 32000 2020-10-24 21:20 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024211018.clean.inflight
-rw-rw-rw- 1 32000 2020-10-24 21:20 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024211018.clean.requested
-rw-rw-rw- 1 256334 2020-10-24 21:18 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024211018.deltacommit
-rw-rw-rw- 1 242389 2020-10-24 21:12 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024211018.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 21:10 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024211018.deltacommit.requested
-rw-rw-rw- 1 60122 2020-10-24 21:31 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024212109.clean
-rw-rw-rw- 1 33583 2020-10-24 21:31 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024212109.clean.inflight
-rw-rw-rw- 1 33583 2020-10-24 21:31 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024212109.clean.requested
-rw-rw-rw- 1 258241 2020-10-24 21:29 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024212109.deltacommit
-rw-rw-rw- 1 232275 2020-10-24 21:23 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024212109.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 21:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024212109.deltacommit.requested
-rw-rw-rw- 1 121261 2020-10-24 21:40 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213135.commit
-rw-rw-rw- 1 0 2020-10-24 21:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213135.compaction.inflight
-rw-rw-rw- 1 69629 2020-10-24 21:33 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213135.compaction.requested
-rw-rw-rw- 1 75004 2020-10-24 21:49 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213528.clean
-rw-rw-rw- 1 48155 2020-10-24 21:49 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213528.clean.inflight
-rw-rw-rw- 1 48155 2020-10-24 21:49 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213528.clean.requested
-rw-rw-rw- 1 278416 2020-10-24 21:47 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213528.deltacommit
-rw-rw-rw- 1 261718 2020-10-24 21:41 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213528.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 21:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024213528.deltacommit.requested
-rw-rw-rw- 1 70405 2020-10-24 22:00 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024215013.clean
-rw-rw-rw- 1 42191 2020-10-24 22:00 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024215013.clean.inflight
-rw-rw-rw- 1 42191 2020-10-24 22:00 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024215013.clean.requested
-rw-rw-rw- 1 256728 2020-10-24 21:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024215013.deltacommit
-rw-rw-rw- 1 252228 2020-10-24 21:52 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024215013.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 21:50 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024215013.deltacommit.requested
-rw-rw-rw- 1 67460 2020-10-24 22:12 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024220139.clean
-rw-rw-rw- 1 37774 2020-10-24 22:12 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024220139.clean.inflight
-rw-rw-rw- 1 37774 2020-10-24 22:12 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024220139.clean.requested
-rw-rw-rw- 1 270750 2020-10-24 22:10 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024220139.deltacommit
-rw-rw-rw- 1 258152 2020-10-24 22:04 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024220139.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 22:01 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024220139.deltacommit.requested
-rw-rw-rw- 1 69020 2020-10-24 22:23 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024221304.clean
-rw-rw-rw- 1 38429 2020-10-24 22:23 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024221304.clean.inflight
-rw-rw-rw- 1 38429 2020-10-24 22:23 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024221304.clean.requested
-rw-rw-rw- 1 260203 2020-10-24 22:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024221304.deltacommit
-rw-rw-rw- 1 247678 2020-10-24 22:15 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024221304.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 22:13 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024221304.deltacommit.requested
-rw-rw-rw- 1 68493 2020-10-24 22:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024222402.clean
-rw-rw-rw- 1 36710 2020-10-24 22:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024222402.clean.inflight
-rw-rw-rw- 1 36710 2020-10-24 22:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024222402.clean.requested
-rw-rw-rw- 1 280449 2020-10-24 22:32 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024222402.deltacommit
-rw-rw-rw- 1 271186 2020-10-24 22:26 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024222402.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 22:24 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024222402.deltacommit.requested
-rw-rw-rw- 1 232548 2020-10-24 22:45 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223531.commit
-rw-rw-rw- 1 0 2020-10-24 22:37 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223531.compaction.inflight
-rw-rw-rw- 1 129834 2020-10-24 22:37 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223531.compaction.requested
-rw-rw-rw- 1 103805 2020-10-24 22:56 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223749.clean
-rw-rw-rw- 1 72989 2020-10-24 22:55 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223749.clean.inflight
-rw-rw-rw- 1 72989 2020-10-24 22:55 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223749.clean.requested
-rw-rw-rw- 1 261991 2020-10-24 22:53 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223749.deltacommit
-rw-rw-rw- 1 248191 2020-10-24 22:46 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223749.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 22:37 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024223749.deltacommit.requested
-rw-rw-rw- 1 84058 2020-10-24 23:09 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024225618.clean
-rw-rw-rw- 1 51335 2020-10-24 23:08 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024225618.clean.inflight
-rw-rw-rw- 1 51335 2020-10-24 23:08 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024225618.clean.requested
-rw-rw-rw- 1 287687 2020-10-24 23:06 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024225618.deltacommit
-rw-rw-rw- 1 240213 2020-10-24 22:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024225618.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 22:56 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024225618.deltacommit.requested
-rw-rw-rw- 1 84696 2020-10-24 23:22 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024230917.clean
-rw-rw-rw- 1 51153 2020-10-24 23:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024230917.clean.inflight
-rw-rw-rw- 1 51153 2020-10-24 23:21 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024230917.clean.requested
-rw-rw-rw- 1 340344 2020-10-24 23:19 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024230917.deltacommit
-rw-rw-rw- 1 278295 2020-10-24 23:12 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024230917.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 23:09 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024230917.deltacommit.requested
-rw-rw-rw- 1 79696 2020-10-24 23:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024232220.clean
-rw-rw-rw- 1 45309 2020-10-24 23:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024232220.clean.inflight
-rw-rw-rw- 1 45309 2020-10-24 23:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024232220.clean.requested
-rw-rw-rw- 1 314247 2020-10-24 23:31 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024232220.deltacommit
-rw-rw-rw- 1 262662 2020-10-24 23:25 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024232220.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 23:22 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024232220.deltacommit.requested
-rw-rw-rw- 1 78586 2020-10-24 23:48 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024233457.clean
-rw-rw-rw- 1 43673 2020-10-24 23:48 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024233457.clean.inflight
-rw-rw-rw- 1 43673 2020-10-24 23:48 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024233457.clean.requested
-rw-rw-rw- 1 344478 2020-10-24 23:45 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024233457.deltacommit
-rw-rw-rw- 1 280170 2020-10-24 23:38 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024233457.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 23:34 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024233457.deltacommit.requested
-rw-rw-rw- 1 267181 2020-10-25 00:03 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024234832.commit
-rw-rw-rw- 1 0 2020-10-24 23:51 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024234832.compaction.inflight
-rw-rw-rw- 1 161578 2020-10-24 23:51 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024234832.compaction.requested
-rw-rw-rw- 1 116432 2020-10-25 00:19 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024235122.clean
-rw-rw-rw- 1 82509 2020-10-25 00:18 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024235122.clean.inflight
-rw-rw-rw- 1 82509 2020-10-25 00:18 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024235122.clean.requested
-rw-rw-rw- 1 375112 2020-10-25 00:15 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024235122.deltacommit
-rw-rw-rw- 1 306974 2020-10-25 00:06 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024235122.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-24 23:51 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201024235122.deltacommit.requested
-rw-rw-rw- 1 85075 2020-10-25 00:39 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025001932.clean
-rw-rw-rw- 1 49052 2020-10-25 00:38 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025001932.clean.inflight
-rw-rw-rw- 1 49052 2020-10-25 00:38 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025001932.clean.requested
-rw-rw-rw- 1 347734 2020-10-25 00:35 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025001932.deltacommit
-rw-rw-rw- 1 308098 2020-10-25 00:26 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025001932.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-25 00:19 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025001932.deltacommit.requested
-rw-rw-rw- 1 90054 2020-10-25 01:01 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025003925.clean
-rw-rw-rw- 1 53688 2020-10-25 01:00 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025003925.clean.inflight
-rw-rw-rw- 1 53688 2020-10-25 01:00 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025003925.clean.requested
-rw-rw-rw- 1 356835 2020-10-25 00:57 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025003925.deltacommit
-rw-rw-rw- 1 317327 2020-10-25 00:47 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025003925.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-25 00:39 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025003925.deltacommit.requested
-rw-rw-rw- 1 83010 2020-10-25 01:24 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025010134.clean
-rw-rw-rw- 1 45704 2020-10-25 01:24 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025010134.clean.inflight
-rw-rw-rw- 1 45704 2020-10-25 01:24 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025010134.clean.requested
-rw-rw-rw- 1 322459 2020-10-25 01:20 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025010134.deltacommit
-rw-rw-rw- 1 286334 2020-10-25 01:10 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025010134.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-25 01:01 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025010134.deltacommit.requested
-rw-rw-rw- 1 90374 2020-10-25 01:55 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025012452.clean
-rw-rw-rw- 1 52922 2020-10-25 01:54 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025012452.clean.inflight
-rw-rw-rw- 1 52922 2020-10-25 01:54 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025012452.clean.requested
-rw-rw-rw- 1 332491 2020-10-25 01:50 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025012452.deltacommit
-rw-rw-rw- 1 292832 2020-10-25 01:36 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025012452.deltacommit.inflight
-rw-rw-rw- 1 0 2020-10-25 01:24 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025012452.deltacommit.requested
-rw-rw-rw- 1 0 2020-10-25 01:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025015502.compaction.inflight
-rw-rw-rw- 1 216721 2020-10-25 01:58 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025015502.compaction.requested
-rw-rw-rw- 1 0 2020-10-25 01:59 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/20201025015904.deltacommit.requested
drwxrwxrwx - 0 1970-01-01 05:30 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/archived
-rw-rw-rw- 1 381 2020-10-21 12:50 s3a://xxxx/test/hudi/data/platform/udfinfo/.hoodie/hoodie.properties
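One way to cross-check listings 5) and 6) is to confirm that the instant time embedded in the base file name has a completed action on the timeline. The sketch below is only illustrative. It assumes the base file name follows the layout <fileId>_<writeToken>_<instantTime>.parquet and that a timeline file ending in .commit or .deltacommit (with no further suffix) marks a completed instant. The helper names parse_base_file and completed_instants are hypothetical and are not part of Hudi.

```python
import re

# Assumed base-file layout: <fileId>_<writeToken>_<instantTime>.parquet
BASE_FILE_RE = re.compile(
    r"^(?P<file_id>.+)_(?P<write_token>\d+-\d+-\d+)_(?P<instant>\d{14})\.parquet$"
)

# Assumption: only these suffixes mark a completed write on the timeline.
COMPLETED_SUFFIXES = (".commit", ".deltacommit")


def parse_base_file(name):
    """Split a base-file name into file id, write token, and instant time."""
    match = BASE_FILE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised base file name: {name}")
    return match.groupdict()


def completed_instants(timeline_files):
    """Collect instant times that have a completed commit or deltacommit file."""
    done = set()
    for entry in timeline_files:
        base = entry.rsplit("/", 1)[-1]  # drop any directory prefix
        for suffix in COMPLETED_SUFFIXES:
            if base.endswith(suffix):
                done.add(base[: -len(suffix)])
    return done


# Sample values copied from the listings in this comment.
name = "472eea6d-b9d2-40b0-97b9-cf74e839faba-0_278-355-147155_20201024215013.parquet"
timeline = [
    "20201024215013.deltacommit",
    "20201024215013.deltacommit.inflight",
    "20201025015502.compaction.requested",
]
info = parse_base_file(name)
print(info["instant"] in completed_instants(timeline))  # prints True for this sample
```

For the file in dt=2020-04-04 the instant 20201024215013 does have a matching .deltacommit entry in the listing, so this particular check does not by itself explain the empty result.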
[GitHub] [hudi] bvaradar commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-716642421
@BalaMahesh : Is the query that returns an empty result of the form: select * from _ro where dt="2020-04-04" ?
Can you also find the unique partition paths that are stored in Hudi with "select distinct(`_hoodie_partition_path`) from _ro" and copy the output here? Does it match the query filter you are passing? Also, are these values consistent with the partition values in the Hive metastore?
[GitHub] [hudi] bvaradar commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-752990069
Closing due to inactivity.
[GitHub] [hudi] bvaradar commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-717408603
@BalaMahesh : Yeah, they are both consistent. The usual suspects are all fine. The best way to debug is to enable some kind of session logging for this Hive query and see what Hive and Hudi do. Another option is to use Spark to load the parquet file directly and look at the record contents to confirm that the parquet file has 1 record.
[GitHub] [hudi] bvaradar commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-718072464
Thanks @BalaMahesh for the clarification. Can you also provide the debug logs for the query that successfully returns the value and for the one that does not? We need to look at the execution path to see what is happening.
[GitHub] [hudi] BalaMahesh edited a comment on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
BalaMahesh edited a comment on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-716977933
@bvaradar : Yes, the query is of that form.
select distinct(_hoodie_partition_path) from _ro :
Result :
dt=2020-03-13
dt=2020-03-14
dt=2020-03-15
dt=2020-03-16
dt=2020-03-17
dt=2020-03-18
dt=2020-03-19
dt=2020-03-20
dt=2020-03-21
dt=2020-03-22
dt=2020-03-23
dt=2020-03-24
dt=2020-03-26
dt=2020-03-27
dt=2020-04-01
dt=2020-04-04
dt=2020-04-09
dt=2020-04-10
dt=2020-04-11
dt=2020-04-13
dt=2020-04-20
dt=2020-04-25
dt=2020-04-27
dt=2020-05-03
dt=2020-05-04
dt=2020-05-16
dt=2020-05-17
dt=2020-05-18
dt=2020-05-19
dt=2020-05-21
dt=2020-05-22
dt=2020-05-23
dt=2020-05-25
dt=2020-05-26
dt=2020-05-27
dt=2020-05-28
dt=2020-05-29
dt=2020-05-30
dt=2020-05-31
dt=2020-06-01
dt=2020-06-02
dt=2020-06-03
dt=2020-06-04
Show partitions :
result :
dt=2020-03-13
dt=2020-03-14
dt=2020-03-15
dt=2020-03-16
dt=2020-03-17
dt=2020-03-18
dt=2020-03-19
dt=2020-03-20
dt=2020-03-21
dt=2020-03-22
dt=2020-03-23
dt=2020-03-24
dt=2020-03-26
dt=2020-03-27
dt=2020-04-01
dt=2020-04-04
dt=2020-04-09
dt=2020-04-10
dt=2020-04-11
dt=2020-04-13
dt=2020-04-20
dt=2020-04-25
dt=2020-04-27
dt=2020-05-03
dt=2020-05-04
dt=2020-05-16
dt=2020-05-17
dt=2020-05-18
dt=2020-05-19
dt=2020-05-21
dt=2020-05-22
dt=2020-05-23
dt=2020-05-25
dt=2020-05-26
dt=2020-05-27
dt=2020-05-28
dt=2020-05-29
dt=2020-05-30
dt=2020-05-31
dt=2020-06-01
dt=2020-06-02
dt=2020-06-03
dt=2020-06-04
From the results I can see that the two lists are consistent.
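The comparison of the two outputs can also be done mechanically rather than by eye. This is a minimal sketch, assuming both outputs have been copied into lists of strings. The function name partition_mismatches is hypothetical and is not part of Hive or Hudi.

```python
def partition_mismatches(hudi_paths, metastore_partitions):
    """Return (only_in_hudi, only_in_metastore), ignoring blanks and stray whitespace."""
    hudi = {p.strip() for p in hudi_paths if p.strip()}
    hive = {p.strip() for p in metastore_partitions if p.strip()}
    return sorted(hudi - hive), sorted(hive - hudi)


# A short excerpt of the two outputs, for illustration only.
from_hudi = ["dt=2020-04-01", "dt=2020-04-04", "dt=2020-04-09"]
from_show_partitions = ["dt=2020-04-01", "dt=2020-04-04", "dt=2020-04-09"]

only_in_hudi, only_in_metastore = partition_mismatches(from_hudi, from_show_partitions)
print(only_in_hudi, only_in_metastore)  # two empty lists when the outputs agree
```

Stripping whitespace before comparing matters here, because a trailing space in a partition value is hard to spot in pasted output.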
[GitHub] [hudi] bvaradar commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-716322522
I have seen this issue when partitions are not registered correctly. Can you provide the following:
1. desc formatted tbl_name
2. describe formatted tbl_name partition (<partition>)
3. Full path of the directory of hudi dataset
4. Full path of the partition of the hudi dataset
5. Directory listing of the partition
6. Directory listing of .hoodie folder.
[GitHub] [hudi] BalaMahesh commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-715926815
Update 1:
I have observed that if a partition gets data in the next run, the query returns data for it. If any other partition has only 1 row, the issue occurs for that partition as well. Does anything need to be changed in Hive?
----------------------------------------------------------------
[GitHub] [hudi] BalaMahesh commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-716977933
@bvaradar: Yes, the query is in that form.
select distinct(_hoodie_partition_path) from _ro:
Result:
dt=2020-03-13
dt=2020-03-14
dt=2020-03-15
dt=2020-03-16
dt=2020-03-17
dt=2020-03-18
dt=2020-03-19
dt=2020-03-20
dt=2020-03-21
dt=2020-03-22
dt=2020-03-23
dt=2020-03-24
dt=2020-03-26
dt=2020-03-27
dt=2020-04-01
dt=2020-04-04
dt=2020-04-09
dt=2020-04-10
dt=2020-04-11
dt=2020-04-13
dt=2020-04-20
dt=2020-04-25
dt=2020-04-27
dt=2020-05-03
dt=2020-05-04
dt=2020-05-16
dt=2020-05-17
dt=2020-05-18
dt=2020-05-19
dt=2020-05-21
dt=2020-05-22
dt=2020-05-23
dt=2020-05-25
dt=2020-05-26
dt=2020-05-27
dt=2020-05-28
dt=2020-05-29
dt=2020-05-30
dt=2020-05-31
dt=2020-06-01
dt=2020-06-02
dt=2020-06-03
dt=2020-06-04
Show partitions:
Result:
dt=2020-03-13
dt=2020-03-14
dt=2020-03-15
dt=2020-03-16
dt=2020-03-17
dt=2020-03-18
dt=2020-03-19
dt=2020-03-20
dt=2020-03-21
dt=2020-03-22
dt=2020-03-23
dt=2020-03-24
dt=2020-03-26
dt=2020-03-27
dt=2020-04-01
dt=2020-04-04
dt=2020-04-09
dt=2020-04-10
dt=2020-04-11
dt=2020-04-13
dt=2020-04-20
dt=2020-04-25
dt=2020-04-27
dt=2020-05-03
dt=2020-05-04
dt=2020-05-16
dt=2020-05-17
dt=2020-05-18
dt=2020-05-19
dt=2020-05-21
dt=2020-05-22
dt=2020-05-23
dt=2020-05-25
dt=2020-05-26
dt=2020-05-27
dt=2020-05-28
dt=2020-05-29
dt=2020-05-30
dt=2020-05-31
dt=2020-06-01
dt=2020-06-02
dt=2020-06-03
dt=2020-06-04
dt=2020-06-05
dt=2020-06-06
dt=2020-06-07
dt=2020-06-08
dt=2020-06-09
dt=2020-06-10
dt=2020-06-11
dt=2020-06-12
dt=2020-06-13
dt=2020-06-14
From the results I can see that there is consistency.
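A comparison like this can also be done mechanically with a set difference between the two listings. The sketch below is an illustration, not part of the original report: the function name `compare_partitions` is hypothetical, and the sample values are a small subset copied from the tail of the two listings above.

```python
def compare_partitions(registered, observed):
    """Return (only_registered, only_observed) as sorted lists.

    registered: partition names from SHOW PARTITIONS
    observed:   values from SELECT DISTINCT(_hoodie_partition_path)
    """
    registered_set = set(registered)
    observed_set = set(observed)
    only_registered = sorted(registered_set - observed_set)
    only_observed = sorted(observed_set - registered_set)
    return only_registered, only_observed


# Subset of the listings above, used only as sample input.
show_partitions = ["dt=2020-06-03", "dt=2020-06-04", "dt=2020-06-05", "dt=2020-06-06"]
distinct_paths = ["dt=2020-06-03", "dt=2020-06-04"]

only_registered, only_observed = compare_partitions(show_partitions, distinct_paths)
print("registered in Hive but no rows returned:", only_registered)
print("rows returned but not registered in Hive:", only_observed)
```

Partitions that appear only on the registered side are the ones worth checking with a direct `where dt=...` query.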
----------------------------------------------------------------
[GitHub] [hudi] bvaradar closed issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #2203:
URL: https://github.com/apache/hudi/issues/2203
----------------------------------------------------------------
[GitHub] [hudi] BalaMahesh commented on issue #2203: [SUPPORT] Hive query on HUDI table with partition column in where condition returning no results
Posted by GitBox <gi...@apache.org>.
BalaMahesh commented on issue #2203:
URL: https://github.com/apache/hudi/issues/2203#issuecomment-717769134
@bvaradar I used the parquet-tools jar to read the content of the underlying base file and was able to see the single row.
----------------------------------------------------------------