Posted to dev@drill.apache.org by ppadma <gi...@git.apache.org> on 2017/05/05 17:22:32 UTC

[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

GitHub user ppadma opened a pull request:

    https://github.com/apache/drill/pull/826

    DRILL-5379: Set Hdfs Block Size based on Parquet Block Size

    Provide an option to specify the block size during file creation.
    This helps create Parquet files with a single block on HDFS, improving
    performance when we read those files.
    See DRILL-5379 for details.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-5379

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #826
    
----
commit ae77a26aa950e401f2ca40488021431ebfde7156
Author: Padma Penumarthy <pp...@yahoo.com>
Date:   2017-04-20T00:25:20Z

    DRILL-5379: Set Hdfs Block Size based on Parquet Block Size
    Provide an option to specify the block size during file creation.
    This helps create Parquet files with a single block on HDFS, improving
    performance when we read those files.
    See DRILL-5379 for details.

----



[GitHub] drill issue #826: DRILL-5379: Set Hdfs Block Size based on Parquet Block Siz...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on the issue:

    https://github.com/apache/drill/pull/826
  
    @kkhatua HDFS allows specifying a block size during file creation, which overrides the default file system block size. With this PR, we can have a single HDFS block per Parquet file that can be larger than the file system block size, without changing the file system default block size. I know it is confusing, but it is possible to create a file with a block size that is different from the file system default block size.
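    To illustrate the HDFS capability being relied on here, a minimal sketch using the Hadoop FileSystem API (the path and sizes are made up for the example; the create() overload that takes an explicit blockSize is the relevant part):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class PerFileBlockSize {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path path = new Path("/tmp/example.parquet");
            long blockSize = 512L * 1024 * 1024; // 512MB for this one file
            // The blockSize argument overrides the file system default
            // (e.g. 128MB) for this file only.
            FSDataOutputStream out = fs.create(path,
                true,                                     // overwrite
                conf.getInt("io.file.buffer.size", 4096), // buffer size
                fs.getDefaultReplication(path),           // replication
                blockSize);
            out.close();
          }
        }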
    
     



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117380486
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useSingleFSBlock) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    --- End diff --
    
    OK, I made the change. Please review the updated diffs.



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117076766
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
    @@ -160,6 +160,9 @@
       OptionValidator OUTPUT_FORMAT_VALIDATOR = new StringValidator(OUTPUT_FORMAT_OPTION, "parquet");
       String PARQUET_BLOCK_SIZE = "store.parquet.block-size";
       OptionValidator PARQUET_BLOCK_SIZE_VALIDATOR = new LongValidator(PARQUET_BLOCK_SIZE, 512*1024*1024);
    +  String PARQUET_WRITER_USE_CONFIGURED_BLOCK_SIZE = "store.parquet.writer.use_configured_block-size";
    --- End diff --
    
    This could be better named. The Parquet writer already uses the configured block size (the Parquet block size). You really want an option to create the file with the same HDFS block size as the Parquet row group for performance reasons. Why not name it to indicate that?  



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r116895850
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    --- End diff --
    
    What we are doing is creating the Parquet file as a single block without changing the file system default block size. For example, the default Parquet block size is 512MB. If the file system block size is 128MB, we create a single file with 4 blocks on the file system, which can get distributed across different nodes, and that is not good for performance. If we change the Parquet block size to 128MB (to match the file system block size), for the same amount of data we end up creating 4 files, one block each, which is not good either.
    
    The JIRA wanted a single HDFS block per Parquet file that is larger than the file system block size, without changing the file system block size. They had the file system block size configured as 128MB. Lowering the Parquet block size (from the default value of 512MB) to match the file system block size was creating too many files for them. For other reasons, they were not able to change the file system block size.
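    In numbers, the trade-off described above (a minimal sketch using the sizes from this example):

        public class BlockLayoutTradeoff {
          public static void main(String[] args) {
            long fsBlockSize = 128L * 1024 * 1024; // file system default block size
            long pqBlockSize = 512L * 1024 * 1024; // Parquet block (row group) size
            long dataSize    = 512L * 1024 * 1024; // amount of data to write

            // Keep the 512MB Parquet block: one file spanning 4 HDFS blocks,
            // which may land on different nodes.
            System.out.println("HDFS blocks per file: " + (pqBlockSize / fsBlockSize)); // 4

            // Lower the Parquet block size to 128MB instead: 4 files of one
            // block each, i.e. too many small files.
            System.out.println("files created: " + (dataSize / fsBlockSize)); // 4

            // This PR: create the file with an HDFS block size equal to the
            // Parquet block size, so the result is 1 file with 1 block.
          }
        }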




[GitHub] drill issue #826: DRILL-5379: Set Hdfs Block Size based on Parquet Block Siz...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on the issue:

    https://github.com/apache/drill/pull/826
  
    @ppadma , Khurram [~khfaraaz] and I were looking at the details in the PR, and it's not very clear what new behavior the PR allows. If you need to specify the block-size as described in the [comment](https://issues.apache.org/jira/browse/DRILL-5379?focusedCommentId=15981366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15981366) by @fmethot, doesn't Drill already do that? I thought Drill implicitly creates files with a single row-group anyway.
    
    My understanding of the JIRA's problem statement was that if the Parquet block-size (i.e. the row-group size) is set to a large value that exceeds the HDFS block size, using the flag would allow Drill to ignore the larger value in the options and write with a Parquet block-size that matches the target HDFS location. So, I could have {{store.parquet.block-size=1073741824}} (i.e. 1GB), but when writing an output worth 512MB, instead of 1 file, Drill would read the HDFS block-size (say 128MB) and apply that as the Parquet block-size and write 4 files.
    
    @fmethot is that what you were looking for? An **automatic scaling down** of the Parquet file's size to match (and be contained within) the HDFS block size?



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117113119
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
    @@ -160,6 +160,9 @@
       OptionValidator OUTPUT_FORMAT_VALIDATOR = new StringValidator(OUTPUT_FORMAT_OPTION, "parquet");
       String PARQUET_BLOCK_SIZE = "store.parquet.block-size";
       OptionValidator PARQUET_BLOCK_SIZE_VALIDATOR = new LongValidator(PARQUET_BLOCK_SIZE, 512*1024*1024);
    +  String PARQUET_WRITER_USE_CONFIGURED_BLOCK_SIZE = "store.parquet.writer.use_configured_block-size";
    --- End diff --
    
    OK, I went with "store.parquet.writer.use_single_fs_block". Please review the updated changes.



[GitHub] drill issue #826: DRILL-5379: Set Hdfs Block Size based on Parquet Block Siz...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on the issue:

    https://github.com/apache/drill/pull/826
  
    @parthchandra can you review this?



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/drill/pull/826



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117298449
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useSingleFSBlock) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    --- End diff --
    
    Would it be better to change the blockSize itself, so that checkBlockSizeReached also gets it if the option is set?



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117297653
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    +        // Passing writeBlockSize creates files with this blockSize instead of filesystem default blockSize.
    +        // Currently, this is supported only by filesystems included in
    +        // BLOCK_FS_SCHEMES (ParquetFileWriter.java in parquet-mr), which includes HDFS.
    +        // For other filesystems, it uses default blockSize configured for the file system.
    +        parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE, writeBlockSize, 0);
    +      } else {
    +        parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    --- End diff --
    
    Right, it doesn't. The change of the option name (and semantics) makes this much clearer.



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117112392
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    --- End diff --
    
    The underlying FS API expects the blockSize specified during create to be a multiple of 64K, so I rounded it up. This will be greater than (rounded up to the nearest 64K) or equal to what is used in checkBlockSizeReached. Should I change checkBlockSizeReached to do the same? I was trying to avoid any changes to existing behavior/code.
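    For reference, the rounding used in the diff, next to an equivalent integer-only form (BLOCKSIZE_MULTIPLE is 64K, per the patch; the class and method names are only for illustration):

        final class BlockSizeRounding {
          static final long BLOCKSIZE_MULTIPLE = 64L * 1024;

          // As in the patch: round blockSize up to the next multiple of 64K.
          static long roundUpWithCeil(long blockSize) {
            return ((long) Math.ceil((double) blockSize / BLOCKSIZE_MULTIPLE))
                * BLOCKSIZE_MULTIPLE;
          }

          // Equivalent pure-integer form, avoiding the double conversion
          // (no overflow risk for realistic block sizes).
          static long roundUpWithIntegers(long blockSize) {
            return ((blockSize + BLOCKSIZE_MULTIPLE - 1) / BLOCKSIZE_MULTIPLE)
                * BLOCKSIZE_MULTIPLE;
          }
        }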



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117094528
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    +        // Passing writeBlockSize creates files with this blockSize instead of filesystem default blockSize.
    +        // Currently, this is supported only by filesystems included in
    +        // BLOCK_FS_SCHEMES (ParquetFileWriter.java in parquet-mr), which includes HDFS.
    +        // For other filesystems, it uses default blockSize configured for the file system.
    +        parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE, writeBlockSize, 0);
    +      } else {
    +        parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    --- End diff --
    
    This breaks previous behaviour. If a user had previously changed the Parquet block size, they will not get Parquet RowGroups of that size, but will get the default 512MB.



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117094096
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    --- End diff --
    
    This is not quite consistent with the use of block size in the `checkBlockSizeReached` function. You want to use the same size in both places.




[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by ppadma <gi...@git.apache.org>.
Github user ppadma commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117112378
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    +        // Round up blockSize to multiple of 64K.
    +        long writeBlockSize = ((long) ceil((double)blockSize/BLOCKSIZE_MULTIPLE)) * BLOCKSIZE_MULTIPLE;
    +        // Passing writeBlockSize creates files with this blockSize instead of filesystem default blockSize.
    +        // Currently, this is supported only by filesystems included in
    +        // BLOCK_FS_SCHEMES (ParquetFileWriter.java in parquet-mr), which includes HDFS.
    +        // For other filesystems, it uses default blockSize configured for the file system.
    +        parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE, writeBlockSize, 0);
    +      } else {
    +        parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    --- End diff --
    
    No, it does not. If the user had previously changed the Parquet block size, that will be used in checkBlockSizeReached as before, so the Parquet row group size will be the same as the configured Parquet blockSize (not the default). This option only changes how we write that Parquet row group onto disk, i.e. as a single block or not.



[GitHub] drill issue #826: DRILL-5379: Set Hdfs Block Size based on Parquet Block Siz...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on the issue:

    https://github.com/apache/drill/pull/826
  
    +1



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by kkhatua <gi...@git.apache.org>.
Github user kkhatua commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r117108770
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java ---
    @@ -160,6 +160,9 @@
       OptionValidator OUTPUT_FORMAT_VALIDATOR = new StringValidator(OUTPUT_FORMAT_OPTION, "parquet");
       String PARQUET_BLOCK_SIZE = "store.parquet.block-size";
       OptionValidator PARQUET_BLOCK_SIZE_VALIDATOR = new LongValidator(PARQUET_BLOCK_SIZE, 512*1024*1024);
    +  String PARQUET_WRITER_USE_CONFIGURED_BLOCK_SIZE = "store.parquet.writer.use_configured_block-size";
    --- End diff --
    
    Making my contribution to the review thread:
    `match_filesystem_block-size`
    The value would be the logical inverse of the original flag. 



[GitHub] drill pull request #826: DRILL-5379: Set Hdfs Block Size based on Parquet Bl...

Posted by parthchandra <gi...@git.apache.org>.
Github user parthchandra commented on a diff in the pull request:

    https://github.com/apache/drill/pull/826#discussion_r116886162
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java ---
    @@ -380,14 +384,21 @@ public void endRecord() throws IOException {
     
           // since ParquetFileWriter will overwrite empty output file (append is not supported)
           // we need to re-apply file permission
    -      parquetFileWriter = new ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE);
    +      if (useConfiguredBlockSize) {
    --- End diff --
    
    The API `ParquetFileWriter(conf, schema, path, ParquetFileWriter.Mode.OVERWRITE)` will cause the Parquet file writer to set the file block size to the greater of the configured file system block size or 128 MB (the ParquetWriter's row group size).
    Drill's Parquet writer will use the block size specified in Drill's options to create a new Parquet row group when the limit is reached (see `ParquetRecordWriter.checkBlockSizeReached()`). If you set Drill's Parquet block size to the larger of the configured file system block size or 128 MB, you will get the row group to match the HDFS block size.
    Which is what the current code does.
    Isn't this what the original JIRA wanted?
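    A simplified sketch of the size-check pattern referred to above (not Drill's actual code; the class, fields, flush() and newRowGroup() are illustrative names):

        // Illustrative only, not Drill's actual implementation.
        class RowGroupSizeCheck {
          private long blockSize;      // from store.parquet.block-size
          private long bufferedBytes;  // bytes buffered for the current row group

          // Called after each record: when the buffered bytes reach the
          // configured Parquet block size, close out the row group and
          // start the next one.
          void checkBlockSizeReached() throws java.io.IOException {
            if (bufferedBytes >= blockSize) {
              flush();        // hypothetical: write out the current row group
              newRowGroup();  // hypothetical: begin the next row group
              bufferedBytes = 0;
            }
          }

          private void flush() throws java.io.IOException { /* ... */ }
          private void newRowGroup() { /* ... */ }
        }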


