You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by abdelhakim deneche <ad...@gmail.com> on 2015/05/05 17:40:09 UTC

Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/
-----------------------------------------------------------

(Updated May 5, 2015, 3:40 p.m.)


Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.


Changes
-------

fixed how BaseTestQuery updates the storage plugin to use one single temp folder path for all drillbits
rebased on top of master


Bugs: DRILL-2408
    https://issues.apache.org/jira/browse/DRILL-2408


Repository: drill-git


Description
-------

I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa 
  exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
  exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
  exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 

Diff: https://reviews.apache.org/r/32273/diff/


Testing
-------

- Added 2 unit tests to TestParquetWriter
- Unit tests are passing


Thanks,

abdelhakim deneche


Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

Posted by Venki Korukanti <ve...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82553
-----------------------------------------------------------

Ship it!


Ship It!

- Venki Korukanti


On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
> 
> (Updated May 5, 2015, 3:40 p.m.)
> 
> 
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
> 
> 
> Bugs: DRILL-2408
>     https://issues.apache.org/jira/browse/DRILL-2408
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa 
>   exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
>   exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 
> 
> Diff: https://reviews.apache.org/r/32273/diff/
> 
> 
> Testing
> -------
> 
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82825
-----------------------------------------------------------

Ship it!


Ship It!

- Aman Sinha


On May 7, 2015, 1:27 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
> 
> (Updated May 7, 2015, 1:27 p.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Bugs: DRILL-2408
>     https://issues.apache.org/jira/browse/DRILL-2408
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 8615eb7 
>   exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
>   exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java 200bbc6 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 
> 
> Diff: https://reviews.apache.org/r/32273/diff/
> 
> 
> Testing
> -------
> 
> - Added 2 unit tests to TestParquetWriter
> 
> Unit tests are passing along with regression/tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

Posted by abdelhakim deneche <ad...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/
-----------------------------------------------------------

(Updated May 7, 2015, 1:27 p.m.)


Review request for drill and Aman Sinha.


Changes
-------

Fixed ParquetRecordWriter to increment recordCount after parquet file is successfully created
rebased on top of master


Bugs: DRILL-2408
    https://issues.apache.org/jira/browse/DRILL-2408


Repository: drill-git


Description
-------

I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 8615eb7 
  exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
  exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java 200bbc6 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
  exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
  exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 

Diff: https://reviews.apache.org/r/32273/diff/


Testing (updated)
-------

- Added 2 unit tests to TestParquetWriter

Unit tests are passing along with regression/tpch100


Thanks,

abdelhakim deneche


Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

Posted by abdelhakim deneche <ad...@gmail.com>.

> On May 6, 2015, 6:58 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java, line 207
> > <https://reviews.apache.org/r/32273/diff/6/?file=950361#file950361line207>
> >
> >     If for some reason parquetFileWriter is null, this would silently not write anything and return.  Would it be better to throw an error instead ?

In the case of a permission error (or any other problem), an exception is thrown when we try to instantiate the parquetFileWriter (in endRecord()) which will fail the query.


- abdelhakim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82645
-----------------------------------------------------------


On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
> 
> (Updated May 5, 2015, 3:40 p.m.)
> 
> 
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
> 
> 
> Bugs: DRILL-2408
>     https://issues.apache.org/jira/browse/DRILL-2408
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa 
>   exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
>   exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 
> 
> Diff: https://reviews.apache.org/r/32273/diff/
> 
> 
> Testing
> -------
> 
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

Posted by Aman Sinha <as...@maprtech.com>.

> On May 6, 2015, 6:58 a.m., Aman Sinha wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java, line 207
> > <https://reviews.apache.org/r/32273/diff/6/?file=950361#file950361line207>
> >
> >     If for some reason parquetFileWriter is null, this would silently not write anything and return.  Would it be better to throw an error instead ?
> 
> abdelhakim deneche wrote:
>     In the case of a permission error (or any other problem), an exception is thrown when we try to instantiate the parquetFileWriter (in endRecord()) which will fail the query.

As discussed, it would be better to check the record count in the flush method and only increment the record count if the ParquetWriter is created successfully.


- Aman


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82645
-----------------------------------------------------------


On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
> 
> (Updated May 5, 2015, 3:40 p.m.)
> 
> 
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
> 
> 
> Bugs: DRILL-2408
>     https://issues.apache.org/jira/browse/DRILL-2408
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa 
>   exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
>   exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 
> 
> Diff: https://reviews.apache.org/r/32273/diff/
> 
> 
> Testing
> -------
> 
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>


Re: Review Request 32273: DRILL-2408: Invalid (0 length) parquet file created by CTAS

Posted by Aman Sinha <as...@maprtech.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32273/#review82645
-----------------------------------------------------------



exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java
<https://reviews.apache.org/r/32273/#comment133416>

    If for some reason parquetFileWriter is null, this would silently not write anything and return.  Would it be better to throw an error instead ?


- Aman Sinha


On May 5, 2015, 3:40 p.m., abdelhakim deneche wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32273/
> -----------------------------------------------------------
> 
> (Updated May 5, 2015, 3:40 p.m.)
> 
> 
> Review request for drill, Aman Sinha, Jason Altekruse, and Venki Korukanti.
> 
> 
> Bugs: DRILL-2408
>     https://issues.apache.org/jira/browse/DRILL-2408
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> I updated ParquetRecordWriter to delete the last file created if it's empty (no records written to it). I also added two unit tests one that checks the default case where we try to create a table using a query that returns 0 rows, the second case can happen if the ParquetRecordWriter flushes it's content just after writing the last record, it will then create a new empty file.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java 3506ffa 
>   exec/java-exec/src/main/java/org/apache/drill/exec/util/TestUtilities.java a1fcc2a 
>   exec/java-exec/src/test/java/org/apache/drill/BaseTestQuery.java d6e6d08 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java 5670e1e 
>   exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriterEmptyFiles.java PRE-CREATION 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillConnectionImpl.java c73eb50 
> 
> Diff: https://reviews.apache.org/r/32273/diff/
> 
> 
> Testing
> -------
> 
> - Added 2 unit tests to TestParquetWriter
> - Unit tests are passing
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>