Posted to dev@hive.apache.org by Shuaishuai Nie <sh...@microsoft.com> on 2013/12/12 19:30:18 UTC
Re: Review Request 15663: Hive should be able to skip header and footer rows
when reading data file for a table
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15663/#review30269
-----------------------------------------------------------
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/15663/#comment57938>
This is the maximum number of footer rows a user can define. It prevents a user from defining so many footers that they consume too much memory.
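A minimal sketch of why the footer count affects memory, assuming the reader cannot tell that a row belongs to the footer until it reaches the end of the file and therefore has to hold back the last N rows. The class name, method name, and `maxFooter` parameter are hypothetical and are not taken from the patch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; this is not the code under review.
final class FooterSkipSketch {

    // Returns the data rows of one file, dropping headerCount leading rows
    // and footerCount trailing rows. maxFooter stands in for the configured limit.
    static List<String> skipHeaderAndFooter(Iterator<String> rows,
                                            int headerCount,
                                            int footerCount,
                                            int maxFooter) {
        if (footerCount > maxFooter) {
            throw new IllegalArgumentException(
                "footer count " + footerCount + " exceeds limit " + maxFooter);
        }

        // Header rows can be discarded immediately as they are read.
        for (int i = 0; i < headerCount && rows.hasNext(); i++) {
            rows.next();
        }

        // Footer rows are only known at end of file, so up to footerCount
        // rows must be held in memory at any time.
        Deque<String> pending = new ArrayDeque<>();
        List<String> out = new ArrayList<>();
        while (rows.hasNext()) {
            pending.addLast(rows.next());
            if (pending.size() > footerCount) {
                // The oldest pending row can no longer be part of the footer.
                out.add(pending.removeFirst());
            }
        }
        // Whatever remains in pending is the footer and is dropped.
        return out;
    }
}
```

In this sketch the pending buffer never holds more than footerCount + 1 rows, so capping the footer count caps the extra memory per open file.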
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/15663/#comment57939>
fixed
itests/qtest/pom.xml
<https://reviews.apache.org/r/15663/#comment57940>
This is necessary if the test is in the MinimrCliDriver test class.
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57941>
comment added
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57942>
Hi Eric, do you mean I need a blank line before each comment? I didn't see this in other parts of the code. Or do you mean a space after "//"?
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57943>
fixed
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57944>
name changed
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57945>
Since the return behavior differs between the two places, it is hard to share the code between them despite the differences being minor.
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57946>
fixed
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57947>
fixed
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57948>
fixed
ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57949>
Fixed the comment. Since I need a deep copy of the key and value fields through ReflectionUtils, this new class is necessary.
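A minimal sketch of why buffered rows need a deep copy, assuming the underlying reader refills the same key and value objects on every call, so that storing only a reference would leave every buffered entry pointing at the most recent row. It uses a `StringBuilder` as a stand-in for a reusable value object and does not call ReflectionUtils; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; StringBuilder stands in for a reused value object.
final class DeepCopySketch {

    // Mimics a reader that overwrites the caller-supplied value in place.
    static boolean next(StringBuilder value, String[] rows, int index) {
        if (index >= rows.length) {
            return false;
        }
        value.setLength(0);
        value.append(rows[index]);
        return true;
    }

    // Buffers references to the reused object: every entry aliases the same value.
    static List<String> bufferByReference(String[] rows) {
        StringBuilder value = new StringBuilder();
        List<StringBuilder> buffer = new ArrayList<>();
        for (int i = 0; next(value, rows, i); i++) {
            buffer.add(value);
        }
        return render(buffer);
    }

    // Buffers a fresh copy per row: each entry keeps its own contents.
    static List<String> bufferByCopy(String[] rows) {
        StringBuilder value = new StringBuilder();
        List<StringBuilder> buffer = new ArrayList<>();
        for (int i = 0; next(value, rows, i); i++) {
            buffer.add(new StringBuilder(value));
        }
        return render(buffer);
    }

    private static List<String> render(List<StringBuilder> buffer) {
        List<String> out = new ArrayList<>();
        for (StringBuilder sb : buffer) {
            out.add(sb.toString());
        }
        return out;
    }
}
```

With rows "a", "b", "c", the by-reference buffer ends up holding "c" three times, while the by-copy buffer keeps "a", "b", "c".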
ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57950>
fixed
ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57951>
fixed
ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57952>
fixed
ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57953>
fixed
ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java
<https://reviews.apache.org/r/15663/#comment57954>
Since the header and footer are removed per file, I think it should be fine if multiple splits are combined, because each file will have its own path.
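A minimal sketch of that per-file behavior, assuming a combined split can be modeled as a map from file path to that file's rows. The class, method, and row contents are hypothetical; only the file paths come from the diff list in this review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a combined split is modeled as path -> rows.
final class PerFileSkipSketch {

    // Applies header/footer removal to each file independently, keyed by its path.
    static List<String> readCombined(Map<String, List<String>> rowsByPath,
                                     int headerCount,
                                     int footerCount) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> file : rowsByPath.entrySet()) {
            List<String> rows = file.getValue();
            // Clamp the bounds so files shorter than header + footer yield no rows.
            int from = Math.min(headerCount, rows.size());
            int to = Math.max(from, rows.size() - footerCount);
            out.addAll(rows.subList(from, to));
        }
        return out;
    }
}
```

Because the skip counts are applied per path, combining files into one unit of work does not cause the header of the second file to be treated as data.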
ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java
<https://reviews.apache.org/r/15663/#comment57955>
Yes, otherwise an exception will be thrown when accessing pathToPartitionInfo during the test, since the job context is incomplete in the unit test.
ql/src/test/queries/clientpositive/file_with_header_footer.q
<https://reviews.apache.org/r/15663/#comment57956>
Negative tests added for this scenario.
- Shuaishuai Nie
On Nov. 19, 2013, 1:31 a.m., Eric Hanson wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15663/
> -----------------------------------------------------------
>
> (Updated Nov. 19, 2013, 1:31 a.m.)
>
>
> Review request for hive and Thejas Nair.
>
>
> Bugs: HIVE-5795
> https://issues.apache.org/jira/browse/HIVE-5795
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Hive should be able to skip header and footer rows when reading data file for a table
>
> (I am uploading this on behalf of Shuaishuai Nie since he's not in the office)
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 32ab3d8
> data/files/header_footer_table_1/0001.txt PRE-CREATION
> data/files/header_footer_table_1/0002.txt PRE-CREATION
> data/files/header_footer_table_1/0003.txt PRE-CREATION
> data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION
> data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION
> data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION
> itests/qtest/pom.xml a453d8a
> ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 5abcfc1
> ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java dd5cb6b
> ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 0ec6e63
> ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java 85dd975
> ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 0686d9b
> ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION
> ql/src/test/results/clientpositive/file_with_header_footer.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/15663/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Eric Hanson
>
>