Posted to issues@hive.apache.org by "Hive QA (JIRA)" <ji...@apache.org> on 2018/11/05 19:11:01 UTC

[jira] [Commented] (HIVE-20079) Populate more accurate rawDataSize for parquet format

    [ https://issues.apache.org/jira/browse/HIVE-20079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675625#comment-16675625 ] 

Hive QA commented on HIVE-20079:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931668/HIVE-20079.2.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/14750/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/14750/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-14750/

Messages:
{noformat}
**** This message was trimmed, see log for full details ****
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_limit.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out:34
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out:60
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_11.q.out:46
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_11.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out:83
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out:85
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out:85
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out:81
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out:66
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_2.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_2.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_3.q.out:69
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_3.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_4.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_4.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_5.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_5.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out:72
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out:68
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_9.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_9.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_limit.q.out:98
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_limit.q.out' with conflicts.
Going to apply patch with: git apply -p1
/data/hiveptest/working/scratch/build.patch:203: trailing whitespace.
    
/data/hiveptest/working/scratch/build.patch:1355: trailing whitespace.
	rawDataSize         	5936                
/data/hiveptest/working/scratch/build.patch:2049: trailing whitespace.
# col_name            	data_type           	comment             
/data/hiveptest/working/scratch/build.patch:2050: trailing whitespace.
id                  	int                 	                    
/data/hiveptest/working/scratch/build.patch:2051: trailing whitespace.
str                 	string              	                    
error: patch failed: ql/src/test/results/clientpositive/parquet_analyze.q.out:93
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_analyze.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/parquet_map_type_vectorization.q.out:125
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_map_type_vectorization.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_13.q.out:80
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_13.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_14.q.out:80
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_14.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_15.q.out:76
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_15.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_16.q.out:53
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_16.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_17.q.out:61
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_17.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_2.q.out:59
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_2.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_3.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_3.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_4.q.out:59
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_4.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_5.q.out:53
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_5.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_6.q.out:55
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_6.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_7.q.out:67
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_7.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_8.q.out:63
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_8.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_9.q.out:53
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_9.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/parquet_vectorization_limit.q.out:90
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/parquet_vectorization_limit.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out:34
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out:60
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_11.q.out:46
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_11.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out:83
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out:85
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out:85
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out:81
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out:66
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_2.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_2.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_3.q.out:69
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_3.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_4.q.out:64
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_4.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_5.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_5.q.out' cleanly.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out:72
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out:68
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_9.q.out:58
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_9.q.out' with conflicts.
error: patch failed: ql/src/test/results/clientpositive/spark/parquet_vectorization_limit.q.out:98
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_limit.q.out' with conflicts.
U ql/src/test/results/clientpositive/parquet_analyze.q.out
U ql/src/test/results/clientpositive/parquet_map_type_vectorization.q.out
U ql/src/test/results/clientpositive/parquet_vectorization_13.q.out
U ql/src/test/results/clientpositive/parquet_vectorization_6.q.out
U ql/src/test/results/clientpositive/parquet_vectorization_7.q.out
U ql/src/test/results/clientpositive/parquet_vectorization_8.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_10.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_12.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_13.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_14.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_15.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_16.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_17.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_6.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_7.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_8.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_9.q.out
U ql/src/test/results/clientpositive/spark/parquet_vectorization_limit.q.out
warning: squelched 23 whitespace errors
warning: 28 lines add whitespace errors.
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-14750
+ exit 1
'
{noformat}
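
To make a log like the one above easier to scan, the files that ended up conflicted can be pulled out of the "Applied patch to ... with conflicts." lines. The following is a minimal sketch only: the function name `conflicted_files` and the sample text are hypothetical and are not part of the Hive ptest tooling.

```python
import re

# Matches the lines printed when the three-way merge left conflict markers.
CONFLICT_RE = re.compile(r"^Applied patch to '(?P<path>[^']+)' with conflicts\.$")


def conflicted_files(log_text):
    """Return unique paths reported as applied 'with conflicts', in first-seen order."""
    seen = {}
    for line in log_text.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            # dict keys keep insertion order, so duplicates are dropped stably
            seen.setdefault(match.group("path"), None)
    return list(seen)


# A few lines copied from the log above, including one repeated entry.
sample = """\
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out' with conflicts.
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_1.q.out' cleanly.
Applied patch to 'ql/src/test/results/clientpositive/parquet_analyze.q.out' with conflicts.
Applied patch to 'ql/src/test/results/clientpositive/spark/parquet_vectorization_0.q.out' with conflicts.
"""

for path in conflicted_files(sample):
    print(path)
```

Files reported as applied "cleanly" are skipped, and a path that appears in both passes of the log is listed once.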

This message is automatically generated.

ATTACHMENT ID: 12931668 - PreCommit-HIVE-Build

> Populate more accurate rawDataSize for parquet format
> -----------------------------------------------------
>
>                 Key: HIVE-20079
>                 URL: https://issues.apache.org/jira/browse/HIVE-20079
>             Project: Hive
>          Issue Type: Improvement
>          Components: File Formats
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Priority: Major
>         Attachments: HIVE-20079.1.patch, HIVE-20079.2.patch
>
>
> Run the following queries and you will see that rawDataSize for the table is incorrectly reported as 4 (that is, the number of fields). We need to populate the correct data size so that data can be split properly.
> {noformat}
> SET hive.stats.autogather=true;
> CREATE TABLE parquet_stats (id int,str string) STORED AS PARQUET;
> INSERT INTO parquet_stats values(0, 'this is string 0'), (1, 'string 1');
> DESC FORMATTED parquet_stats;
> {noformat}
> {noformat}
> Table Parameters:
> 	COLUMN_STATS_ACCURATE	true
> 	numFiles            	1
> 	numRows             	2
> 	rawDataSize         	4
> 	totalSize           	373
> 	transient_lastDdlTime	1530660523
> {noformat}
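
The reported value of 4 matches 2 rows times 2 columns in the example from the description. The sketch below contrasts that field count with a naive byte-size estimate for the same two rows. The sizing rule used here (4 bytes per int, UTF-8 length per string) is an assumption for illustration only and is not taken from Hive or from the patch.

```python
# The two rows inserted in the description's example.
rows = [(0, "this is string 0"), (1, "string 1")]


def field_count(rows):
    """Count fields across all rows (what the description says is reported)."""
    return sum(len(row) for row in rows)


def naive_raw_size(rows):
    """Illustrative estimate only: 4 bytes per int, UTF-8 byte length per string."""
    total = 0
    for row in rows:
        for value in row:
            if isinstance(value, int):
                total += 4
            elif isinstance(value, str):
                total += len(value.encode("utf-8"))
    return total


print(field_count(rows))     # 4
print(naive_raw_size(rows))  # 32
```

Even under this rough rule the estimate differs from the field count, which is the kind of gap the issue describes.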



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)