Posted to dev@hive.apache.org by Shuaishuai Nie <sh...@microsoft.com> on 2013/12/12 19:30:18 UTC

Re: Review Request 15663: Hive should be able to skip header and footer rows when reading data file for a table

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15663/#review30269
-----------------------------------------------------------



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/15663/#comment57938>

    This is the maximum number of footer lines a user can define. It prevents a user from defining too many footer lines, which would consume memory.
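
A minimal sketch of why such a cap matters, assuming the footer rows are held back in a bounded buffer while the file is read. The class, method, and parameter names here are hypothetical and are not taken from the patch:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; this is not the code from the patch.
public class FooterSkipSketch {

    // Returns the rows of one file with the last footerCount rows dropped.
    // At most footerCount rows are buffered at any time, so memory use grows
    // with the configured footer count, which is why a cap is useful.
    public static List<String> skipFooter(Iterable<String> rows,
                                          int footerCount,
                                          int maxFooterCount) {
        if (footerCount < 0 || footerCount > maxFooterCount) {
            throw new IllegalArgumentException(
                "footer count " + footerCount + " is outside 0.." + maxFooterCount);
        }
        Deque<String> buffer = new ArrayDeque<>();
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            buffer.addLast(row);
            // Once the buffer holds more than footerCount rows, the oldest
            // one cannot be part of the footer and is safe to emit.
            if (buffer.size() > footerCount) {
                out.add(buffer.removeFirst());
            }
        }
        // Whatever remains in the buffer is the footer and is discarded.
        return out;
    }
}
```

The rows still in the buffer at end of file are exactly the footer, so nothing beyond the configured count ever has to be kept in memory.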



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
<https://reviews.apache.org/r/15663/#comment57939>

    fixed



itests/qtest/pom.xml
<https://reviews.apache.org/r/15663/#comment57940>

    This is necessary if the test is in the MinimrCliDriver test class.



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57941>

    comment added



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57942>

    Hi Eric, do you mean I need a blank line before each comment? I didn't see this in other parts of the code. Or do you mean a space after "//"?



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57943>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57944>

    name changed



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57945>

    Since the return behavior differs between the two places, it is hard to reuse the code in both because of the minor differences.



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57946>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57947>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java
<https://reviews.apache.org/r/15663/#comment57948>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57949>

    Fixed the comment. Since I need a deep copy of the key and value fields through ReflectionUtils, this new class is necessary.
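
A small sketch of why a deep copy is needed when rows are buffered. Record readers in the old mapred API fill the same key and value objects on every call, so storing the reference alone leaves every buffered entry pointing at the last row read. StringBuilder stands in for a mutable Writable here, and the class and method names are hypothetical; this does not use ReflectionUtils and is not the code from the patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; StringBuilder plays the role of a reused,
// mutable value object.
public class DeepCopySketch {

    // Simulates a reader that refills one reused value per row and a buffer
    // that stores either a reference to it or a copy of it.
    public static List<String> bufferRows(String[] rows, boolean deepCopy) {
        StringBuilder reused = new StringBuilder();
        List<StringBuilder> buffer = new ArrayList<>();
        for (String row : rows) {
            // The "reader" overwrites the same object for every row.
            reused.setLength(0);
            reused.append(row);
            buffer.add(deepCopy ? new StringBuilder(reused) : reused);
        }
        List<String> result = new ArrayList<>();
        for (StringBuilder value : buffer) {
            result.add(value.toString());
        }
        return result;
    }
}
```

With deepCopy set to false, every buffered entry ends up showing the last row; with deepCopy set to true, each entry keeps the row it was created from.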



ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57950>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57951>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57952>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java
<https://reviews.apache.org/r/15663/#comment57953>

    fixed



ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java
<https://reviews.apache.org/r/15663/#comment57954>

    Since the header and footer are removed on a per-file basis, I think it should be fine if multiple splits are combined, because each file will have its own path.
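
A rough sketch of the per-file idea, assuming each file's rows can be identified by its path even when several files are read together. The names are hypothetical and the in-memory map is only a stand-in for reading files; this is not the logic from the patch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; a map from path to rows stands in for a
// set of files that are processed together.
public class PerFileSkipSketch {

    // Drops headerCount leading rows and footerCount trailing rows from each
    // file independently, then concatenates what is left.
    public static List<String> readAll(Map<String, List<String>> rowsByPath,
                                       int headerCount,
                                       int footerCount) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> file : rowsByPath.entrySet()) {
            List<String> rows = file.getValue();
            int end = rows.size() - footerCount;
            // Files shorter than header plus footer contribute no rows.
            for (int i = headerCount; i < end; i++) {
                out.add(rows.get(i));
            }
        }
        return out;
    }
}
```

Because the skip counts are applied per path, combining files does not cause the header of a second file to be treated as data.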



ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java
<https://reviews.apache.org/r/15663/#comment57955>

    Yes; otherwise an exception will be thrown when accessing pathToPartitionInfo during the test, since the job context is incomplete in the unit test.



ql/src/test/queries/clientpositive/file_with_header_footer.q
<https://reviews.apache.org/r/15663/#comment57956>

    Negative tests added for this scenario.


- Shuaishuai Nie


On Nov. 19, 2013, 1:31 a.m., Eric Hanson wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15663/
> -----------------------------------------------------------
> 
> (Updated Nov. 19, 2013, 1:31 a.m.)
> 
> 
> Review request for hive and Thejas Nair.
> 
> 
> Bugs: HIVE-5795
>     https://issues.apache.org/jira/browse/HIVE-5795
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Hive should be able to skip header and footer rows when reading data file for a table
> 
> (I am uploading this on behalf of Shuaishuai Nie since he's not in the office)
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 32ab3d8 
>   data/files/header_footer_table_1/0001.txt PRE-CREATION 
>   data/files/header_footer_table_1/0002.txt PRE-CREATION 
>   data/files/header_footer_table_1/0003.txt PRE-CREATION 
>   data/files/header_footer_table_2/2012/01/01/0001.txt PRE-CREATION 
>   data/files/header_footer_table_2/2012/01/02/0002.txt PRE-CREATION 
>   data/files/header_footer_table_2/2012/01/03/0003.txt PRE-CREATION 
>   itests/qtest/pom.xml a453d8a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java 5abcfc1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveContextAwareRecordReader.java dd5cb6b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 0ec6e63 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestHiveBinarySearchRecordReader.java 85dd975 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 0686d9b 
>   ql/src/test/queries/clientpositive/file_with_header_footer.q PRE-CREATION 
>   ql/src/test/results/clientpositive/file_with_header_footer.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15663/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Eric Hanson
> 
>