You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by cheng xu <ch...@intel.com> on 2015/05/20 16:54:06 UTC
Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/
-----------------------------------------------------------
Review request for hive, Alan Gates and Sergio Pena.
Bugs: HIVE-10749
https://issues.apache.org/jira/browse/HIVE-10749
Repository: hive-git
Description
-------
Implement the insert statement for parquet format.
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 000eb38
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 8380117
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java 4e1820c
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRawRecordMerger.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java 43c772f
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 0a5edbb
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java 0d32e49
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java 5f7f597
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/34473/diff/
Testing
-------
Newly added qtest and UT passed locally
Thanks,
cheng xu
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84521
-----------------------------------------------------------
I merged master to parquet, so that you don't have to the NullWritable in this patch.
Could you update the patch, and upload it again?
- Sergio Pena
On May 20, 2015, 2:54 p.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 20, 2015, 2:54 p.m.)
>
>
> Review request for hive, Alan Gates and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 000eb38
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 8380117
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java 4e1820c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRawRecordMerger.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java 43c772f
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 0a5edbb
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java 0d32e49
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java 5f7f597
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by cheng xu <ch...@intel.com>.
> On May 20, 2015, 8:45 p.m., Alexander Pivovarov wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java, line 207
> > <https://reviews.apache.org/r/34473/diff/1/?file=965270#file965270line207>
> >
> > you can use
> > final ArrayList<Object> list = new ArrayList<Object>(Collections.nCopies(fields.size(), null));
> > instead
I don't think so because only in the insert statement, we can't understand how to inspect the row object until creating parquet writer. This is why I create the new constructor in ParquetStructObjectInspector. Thank yoU!
- cheng
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84574
-----------------------------------------------------------
On May 20, 2015, 2:54 p.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 20, 2015, 2:54 p.m.)
>
>
> Review request for hive, Alan Gates and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 000eb38
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 8380117
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java 4e1820c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRawRecordMerger.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java 43c772f
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 0a5edbb
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java 0d32e49
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java 5f7f597
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by Alexander Pivovarov <ap...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84574
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment135898>
missing space after comma and before tableProperties
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java
<https://reviews.apache.org/r/34473/#comment135901>
you can use
final ArrayList<Object> list = new ArrayList<Object>(Collections.nCopies(fields.size(), null));
instead
- Alexander Pivovarov
On May 20, 2015, 2:54 p.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 20, 2015, 2:54 p.m.)
>
>
> Review request for hive, Alan Gates and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 000eb38
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java 8380117
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java 4e1820c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRawRecordMerger.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java 43c772f
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java 0a5edbb
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/write/ParquetRecordWriterWrapper.java 0d32e49
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java 5f7f597
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/
-----------------------------------------------------------
(Updated May 28, 2015, 8:29 a.m.)
Review request for hive, Alan Gates, Owen O'Malley, and Sergio Pena.
Bugs: HIVE-10749
https://issues.apache.org/jira/browse/HIVE-10749
Repository: hive-git
Description
-------
Implement the insert statement for parquet format.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ObjectArrayWritableObjectInspector.java 571f993
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/34473/diff/
Testing
-------
Newly added qtest and UT passed locally
Thanks,
cheng xu
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review85265
-----------------------------------------------------------
I run a INSERT command, but delta directories were not created.
-rwxr-xr-x 1 sergio supergroup 257 2015-05-26 16:24 /user/hive/warehouse/numbers_bucketed/000000_0
-rwxr-xr-x 1 sergio supergroup 265 2015-05-26 16:24 /user/hive/warehouse/numbers_bucketed/000001_0
- Sergio Pena
On May 22, 2015, 6:26 a.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 22, 2015, 6:26 a.m.)
>
>
> Review request for hive, Alan Gates, Owen O'Malley, and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ObjectArrayWritableObjectInspector.java 571f993
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/
-----------------------------------------------------------
(Updated May 22, 2015, 6:26 a.m.)
Review request for hive, Alan Gates, Owen O'Malley, and Sergio Pena.
Changes
-------
Summary:
1. use some utility to reduce LOC
2. remove *ParquetRecordReaderWrapper.java* and use *ObjectArrayWritableObjectInspector* instead
Bugs: HIVE-10749
https://issues.apache.org/jira/browse/HIVE-10749
Repository: hive-git
Description
-------
Implement the insert statement for parquet format.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ObjectArrayWritableObjectInspector.java 571f993
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/34473/diff/
Testing
-------
Newly added qtest and UT passed locally
Thanks,
cheng xu
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by Alan Gates <ga...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84729
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java
<https://reviews.apache.org/r/34473/#comment136066>
Do you intend to use this in conjunction with hive.hcatalog.streaming? If so, closing the file on a flush is not what you'll want.
- Alan Gates
On May 21, 2015, 7:45 a.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 21, 2015, 7:45 a.m.)
>
>
> Review request for hive, Alan Gates and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by cheng xu <ch...@intel.com>.
> On May 21, 2015, 7:18 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java, line 1
> > <https://reviews.apache.org/r/34473/diff/2/?file=966163#file966163line1>
> >
> > I am waiting to commit the patch from HIVE-10749 that uses a similar class named ObjectArrayWritableObjectInspector.
> >
> > Also, I think this is already part o the parquet branch.
It's https://issues.apache.org/jira/browse/HIVE-9658
- cheng
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84758
-----------------------------------------------------------
On May 22, 2015, 6:26 a.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 22, 2015, 6:26 a.m.)
>
>
> Review request for hive, Alan Gates, Owen O'Malley, and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ObjectArrayWritableObjectInspector.java 571f993
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by cheng xu <ch...@intel.com>.
> On May 21, 2015, 7:18 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java, line 59
> > <https://reviews.apache.org/r/34473/diff/2/?file=966160#file966160line59>
> >
> > Could you separate words with _? Like ENABLE_ACID_SCHEMA_INFO. It helps to read the constant more easily.
> >
> > Do we have to enable transactions exclusively for parquet? Isn't there another variable that enables trasnactions on Hive that we can use?
This variable is used for setting the schema for parquet. It's only related to whether you need to write data to base file or not. So we have to use this way to append the original data with ACID info.
> On May 21, 2015, 7:18 p.m., Sergio Pena wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java, lines 98-103
> > <https://reviews.apache.org/r/34473/diff/2/?file=966160#file966160line98>
> >
> > You can use this one line to return the column list:
> >
> > return (List<String>) StringUtils.getStringCollection(tableProperties.getProperty(IOConstants.COLUMNS));
> >
> > It will return an empty list array if COLUMN is empty.
Great suggestion!
- cheng
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84758
-----------------------------------------------------------
On May 22, 2015, 6:26 a.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 22, 2015, 6:26 a.m.)
>
>
> Review request for hive, Alan Gates, Owen O'Malley, and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ObjectArrayWritableObjectInspector.java 571f993
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by Sergio Pena <se...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/#review84758
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136104>
Could you separate words with _? Like ENABLE_ACID_SCHEMA_INFO. It helps to read the constant more easily.
Do we have to enable transactions exclusively for parquet? Isn't there another variable that enables trasnactions on Hive that we can use?
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136111>
Could you separate the workds? Like ENABLE_ACID_SCHEMA_INFO. It makes the code more readable.
Also, isn't there another variable that we can use to detect if transactions are enabled? I am not sure if we should add more variables to Hive.
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136107>
You can use this one line to return the column list:
return (List<String>) StringUtils.getStringCollection(tableProperties.getProperty(IOConstants.COLUMNS));
It will return an empty list array if COLUMN is empty.
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136112>
You can save code by using this line:
return (List<String>) StringUtils.getStringCollection(tableProperties.getProperty(IOConstants.COLUMNS));
It will return an empty list if the parameter is empty.
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136108>
You can call TypeInfoUtils.getTypeINfosFromTypeString() with an empty string here. It will return an empty list. Let's save code by using:
ArrayList<TypeInfo> columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136113>
You can save code by using this line:
ArrayList<TypeInfo> columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
It will return an empty list if the parameter is empty.
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136109>
Same here, you can save code with this:
ArrayList<String> columnNames = (ArrayList<String>) StringUtils.getStringCollection(columnNameProperty);
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java
<https://reviews.apache.org/r/34473/#comment136114>
Same thing here:
ArrayList<String> columnNames = (ArrayList<String>) StringUtils.getStringCollection(columnNameProperty);
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java
<https://reviews.apache.org/r/34473/#comment136117>
Why do you need a Writable? HIVE-9658 tries to avoid wrapping java types into writable if they are being used by Hive to save memory usage.
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java
<https://reviews.apache.org/r/34473/#comment136116>
I am waiting to commit the patch from HIVE-10749 that uses a similar class named ObjectArrayWritableObjectInspector.
Also, I think this is already part o the parquet branch.
- Sergio Pena
On May 21, 2015, 7:45 a.m., cheng xu wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34473/
> -----------------------------------------------------------
>
> (Updated May 21, 2015, 7:45 a.m.)
>
>
> Review request for hive, Alan Gates and Sergio Pena.
>
>
> Bugs: HIVE-10749
> https://issues.apache.org/jira/browse/HIVE-10749
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> Implement the insert statement for parquet format.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
> ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
> ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
>
> Diff: https://reviews.apache.org/r/34473/diff/
>
>
> Testing
> -------
>
> Newly added qtest and UT passed locally
>
>
> Thanks,
>
> cheng xu
>
>
Re: Review Request 34473: HIVE-10749 Implement Insert statement for
parquet
Posted by cheng xu <ch...@intel.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34473/
-----------------------------------------------------------
(Updated May 21, 2015, 7:45 a.m.)
Review request for hive, Alan Gates and Sergio Pena.
Changes
-------
Summary:
1. fix code style issues
2. remove codes irrelevant to the insert statement
3. fix one issue about SetParquetSchema from previous patch
Bugs: HIVE-10749
https://issues.apache.org/jira/browse/HIVE-10749
Repository: hive-git
Description
-------
Implement the insert statement for parquet format.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetOutputFormat.java c6fb26c
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/acid/ParquetRecordUpdater.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java f513572
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetStructObjectInspector.java PRE-CREATION
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/acid/TestParquetRecordUpdater.java PRE-CREATION
ql/src/test/queries/clientpositive/acid_parquet_insert.q PRE-CREATION
ql/src/test/results/clientpositive/acid_parquet_insert.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/34473/diff/
Testing
-------
Newly added qtest and UT passed locally
Thanks,
cheng xu