You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by abdelhakim deneche <ad...@gmail.com> on 2015/05/05 17:40:09 UTC
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/
-----------------------------------------------------------
(Updated May 5, 2015, 3:40 p.m.)
Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
Changes
-------
fixed how BaseTestQuery updates the storage plugin to use one single temp folder path for all drillbits
rebased on top of master
Bugs: DRILL-2408
https://issues.apache.org/jira/browse/DRILL-2408
Repository: drill-git
Description
-------
I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
Diffs (updated)
-----
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa
exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
Diff: https://reviews.apache.org/r/32273/diff/
Testing
-------
- Added 2 unit tests to TestParquetWriter
- Unit tests are passing
Thanks,
abdelhakim deneche
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
Posted by Venki Korukanti <ve...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82553
-----------------------------------------------------------
Ship it!
Ship It!
- Venki Korukanti
On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
>
> (Updated May 5, 2015, 3:40 p.m.)
>
>
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
>
>
> Bugs: DRILL-2408
> https://issues.apache.org/jira/browse/DRILL-2408
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa
> exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
> exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
> exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
>
> Diff: https://reviews.apache.org/r/32273/diff/
>
>
> Testing
> -------
>
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82825
-----------------------------------------------------------
Ship it!
Ship It!
- Aman Sinha
On May 7, 2015, 1:27 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
>
> (Updated May 7, 2015, 1:27 p.m.)
>
>
> Review request for drill and Aman Sinha.
>
>
> Bugs: DRILL-2408
> https://issues.apache.org/jira/browse/DRILL-2408
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 8615eb7
> exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
> exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java 200bbc6
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
> exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
>
> Diff: https://reviews.apache.org/r/32273/diff/
>
>
> Testing
> -------
>
> - Added 2 unit tests to TestParquetWriter
>
> Unit tests are passing along with regression/tpch100
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
Posted by abdelhakim deneche <ad...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/
-----------------------------------------------------------
(Updated May 7, 2015, 1:27 p.m.)
Review request for drill and Aman Sinha.
Changes
-------
Fixed ParquetRecordWriter to increment recordCount after parquet file is successfully created
rebased on top of master
Bugs: DRILL-2408
https://issues.apache.org/jira/browse/DRILL-2408
Repository: drill-git
Description
-------
I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
Diffs (updated)
-----
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 8615eb7
exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java 200bbc6
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
Diff: https://reviews.apache.org/r/32273/diff/
Testing (updated)
-------
- Added 2 unit tests to TestParquetWriter
Unit tests are passing along with regression/tpch100
Thanks,
abdelhakim deneche
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
Posted by abdelhakim deneche <ad...@gmail.com>.
> On May 6, 2015, 6:58 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java, line 207
> > <https://reviews.apache.org/r/32273/diff/6/?file=950361#file950361line207>
> >
> > If for some reason parquetFileWriter is null, this would silently not write anything and return. Would it be better to throw an error instead ?
In the case of a permission error (or any other problem), an exception is thrown when we try to instantiate the parquetFileWriter (in endRecord()) which will fail the query.
- abdelhakim
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82645
-----------------------------------------------------------
On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
>
> (Updated May 5, 2015, 3:40 p.m.)
>
>
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
>
>
> Bugs: DRILL-2408
> https://issues.apache.org/jira/browse/DRILL-2408
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa
> exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
> exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
> exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
>
> Diff: https://reviews.apache.org/r/32273/diff/
>
>
> Testing
> -------
>
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
Posted by Aman Sinha <as...@maprtech.com>.
> On May 6, 2015, 6:58 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java, line 207
> > <https://reviews.apache.org/r/32273/diff/6/?file=950361#file950361line207>
> >
> > If for some reason parquetFileWriter is null, this would silently not write anything and return. Would it be better to throw an error instead ?
>
> abdelhakim deneche wrote:
> In the case of a permission error (or any other problem), an exception is thrown when we try to instantiate the parquetFileWriter (in endRecord()) which will fail the query.
As discussed, it would be better to check the record count in the flush method and only increment the record count if the ParquetWriter is created successfully.
- Aman
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82645
-----------------------------------------------------------
On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
>
> (Updated May 5, 2015, 3:40 p.m.)
>
>
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
>
>
> Bugs: DRILL-2408
> https://issues.apache.org/jira/browse/DRILL-2408
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa
> exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
> exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
> exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
>
> Diff: https://reviews.apache.org/r/32273/diff/
>
>
> Testing
> -------
>
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
>
>
> Thanks,
>
> abdelhakim deneche
>
>
Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file
created by CTAS
Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82645
-----------------------------------------------------------
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java
<https://reviews.apache.org/r/32273/#comment133416>
If for some reason parquetFileWriter is null, this would silently not write anything and return. Would it be better to throw an error instead ?
- Aman Sinha
On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
>
> (Updated May 5, 2015, 3:40 p.m.)
>
>
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
>
>
> Bugs: DRILL-2408
> https://issues.apache.org/jira/browse/DRILL-2408
>
>
> Repository: drill-git
>
>
> Description
> -------
>
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
>
>
> Diffs
> -----
>
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa
> exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a
> exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e
> exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION
> exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50
>
> Diff: https://reviews.apache.org/r/32273/diff/
>
>
> Testing
> -------
>
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
>
>
> Thanks,
>
> abdelhakim deneche
>
>