Posted to issues@tajo.apache.org by Jinho Kim <jh...@apache.org> on 2014/03/29 11:50:49 UTC

Review Request 19821: TAJO-717: Improve file splitting for large number of splits

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/
-----------------------------------------------------------

Review request for Tajo.


Bugs: TAJO-717
    https://issues.apache.org/jira/browse/TAJO-717


Repository: tajo


Description
-------

Currently, the storageManager invokes getFileBlockStorageLocations() per input path, which results in too many RPCs to the associated datanodes.
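
To illustrate the cost described above, here is a minimal, self-contained sketch. It does not use the Hadoop API and is not the code in this patch: `Block`, `CountingClient`, `lookupStorageLocations`, `resolvePerPath`, `resolveBatched` and `sampleInput` are all hypothetical names. It assumes that each lookup call contacts every distinct datanode holding one of the requested blocks, and it compares one lookup per input path against a single lookup over the blocks of all paths.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation only; none of these types exist in Tajo or Hadoop.
class SplitBatchingSketch {

    /** Hypothetical block descriptor: file path, block offset, hosting datanode. */
    static final class Block {
        final String path;
        final long offset;
        final String datanode;

        Block(String path, long offset, String datanode) {
            this.path = path;
            this.offset = offset;
            this.datanode = datanode;
        }
    }

    /** Hypothetical client that counts simulated RPCs to datanodes. */
    static final class CountingClient {
        int rpcCount = 0;

        // Assumption: one RPC per distinct datanode referenced in a single call.
        List<String> lookupStorageLocations(List<Block> blocks) {
            Set<String> contacted = new HashSet<>();
            List<String> locations = new ArrayList<>();
            for (Block b : blocks) {
                contacted.add(b.datanode);
                locations.add(b.path + ":" + b.offset + "@" + b.datanode);
            }
            rpcCount += contacted.size();
            return locations;
        }
    }

    /** Per-path strategy: one lookup call for each input path. */
    static List<String> resolvePerPath(CountingClient client,
                                       Map<String, List<Block>> blocksByPath) {
        List<String> all = new ArrayList<>();
        for (List<Block> blocks : blocksByPath.values()) {
            all.addAll(client.lookupStorageLocations(blocks));
        }
        return all;
    }

    /** Batched strategy: gather the blocks of every path, then look them up once. */
    static List<String> resolveBatched(CountingClient client,
                                       Map<String, List<Block>> blocksByPath) {
        List<Block> allBlocks = new ArrayList<>();
        for (List<Block> blocks : blocksByPath.values()) {
            allBlocks.addAll(blocks);
        }
        return client.lookupStorageLocations(allBlocks);
    }

    /** Sample input: each path has one block on each of the given datanodes. */
    static Map<String, List<Block>> sampleInput(int paths, int datanodes) {
        Map<String, List<Block>> input = new LinkedHashMap<>();
        for (int p = 0; p < paths; p++) {
            String path = "/table/part-" + p;
            List<Block> blocks = new ArrayList<>();
            for (int d = 0; d < datanodes; d++) {
                blocks.add(new Block(path, d * 64L, "dn" + d));
            }
            input.put(path, blocks);
        }
        return input;
    }

    public static void main(String[] args) {
        Map<String, List<Block>> input = sampleInput(3, 2);
        CountingClient perPath = new CountingClient();
        resolvePerPath(perPath, input);
        CountingClient batched = new CountingClient();
        resolveBatched(batched, input);
        System.out.println("per-path RPCs: " + perPath.rpcCount);
        System.out.println("batched RPCs:  " + batched.rpcCount);
    }
}
```

Under these assumptions, three input paths spread over two datanodes cost six simulated RPCs with the per-path strategy and two with the batched one, while both return the same locations. The per-path cost grows with the number of input paths, which is the case this review request targets.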


Diffs
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java b2adaa4a2623b49022f4dac716dbf5457e93b71d 
  tajo-storage/pom.xml 9f144bb206820d23238532f158b511974f518592 
  tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed9817749787bfddcb719913ab03021aa19c6c 
  tajo-storage/src/main/java/org/apache/tajo/storage/CSVFile.java 116e25ce6df8eb5d8c70ef68e2d78a17dca1ef5b 
  tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f16b93ac5d8f32bcbadd5e4359dac8ad9b 
  tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java 2fd34554c2905ecb593555dd463a7e8290e4d0e6 
  tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a27545c95b56dd2bece8d1c0620593a6f8 

Diff: https://reviews.apache.org/r/19821/diff/


Testing
-------


Thanks,

Jinho Kim


Re: Review Request 19821: TAJO-717: Improve file splitting for large number of splits

Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/#review39978
-----------------------------------------------------------

Ship it!


+1

This patch includes a nice code optimization for splitting and enough unit tests. The patch looks good to me.

- Hyunsik Choi


On April 7, 2014, 5:30 p.m., Jinho Kim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19821/
> -----------------------------------------------------------
> 
> (Updated April 7, 2014, 5:30 p.m.)
> 
> 
> Review request for Tajo.
> 
> 
> Bugs: TAJO-717
>     https://issues.apache.org/jira/browse/TAJO-717
> 
> 
> Repository: tajo
> 
> 
> Description
> -------
> 
> Currently, the storageManager invokes getFileBlockStorageLocations() per input path, which results in too many RPCs to the associated datanodes.
> 
> 
> Diffs
> -----
> 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 61fa84e 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java df8b31b 
>   tajo-storage/pom.xml b9a162a 
>   tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed981 
>   tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f 
>   tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1 
>   tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a 
> 
> Diff: https://reviews.apache.org/r/19821/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jinho Kim
> 
>


Re: Review Request 19821: TAJO-717: Improve file splitting for large number of splits

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/
-----------------------------------------------------------

(Updated April 7, 2014, 8:30 a.m.)


Review request for Tajo.


Changes
-------

I've fixed the 'Failed to connect to datanode' error.


Bugs: TAJO-717
    https://issues.apache.org/jira/browse/TAJO-717


Repository: tajo


Description
-------

Currently, the storageManager invokes getFileBlockStorageLocations() per input path, which results in too many RPCs to the associated datanodes.


Diffs (updated)
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 61fa84e 
  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java df8b31b 
  tajo-storage/pom.xml b9a162a 
  tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed981 
  tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f 
  tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1 
  tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a 

Diff: https://reviews.apache.org/r/19821/diff/


Testing
-------


Thanks,

Jinho Kim


Re: Review Request 19821: TAJO-717: Improve file splitting for large number of splits

Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/#review39516
-----------------------------------------------------------

Ship it!


+1

Your patch seems to make getSplit() more efficient by reducing the remote calls to HDFS. That part becomes much better. Ship it!

- Hyunsik Choi


On April 4, 2014, 2:07 p.m., Jinho Kim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19821/
> -----------------------------------------------------------
> 
> (Updated April 4, 2014, 2:07 p.m.)
> 
> 
> Review request for Tajo.
> 
> 
> Bugs: TAJO-717
>     https://issues.apache.org/jira/browse/TAJO-717
> 
> 
> Repository: tajo
> 
> 
> Description
> -------
> 
> Currently, the storageManager invokes getFileBlockStorageLocations() per input path, which results in too many RPCs to the associated datanodes.
> 
> 
> Diffs
> -----
> 
>   tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java b2adaa4a2623b49022f4dac716dbf5457e93b71d 
>   tajo-storage/pom.xml b9a162ad3bcfa46b42ff70a47972b991f5ed4283 
>   tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed9817749787bfddcb719913ab03021aa19c6c 
>   tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f16b93ac5d8f32bcbadd5e4359dac8ad9b 
>   tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1e5cce58fc236a8f372f654ec78aa24b85 
>   tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a27545c95b56dd2bece8d1c0620593a6f8 
> 
> Diff: https://reviews.apache.org/r/19821/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jinho Kim
> 
>


Re: Review Request 19821: TAJO-717: Improve file splitting for large number of splits

Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/
-----------------------------------------------------------

(Updated April 4, 2014, 5:07 a.m.)


Review request for Tajo.


Changes
-------

I have rebased on master


Bugs: TAJO-717
    https://issues.apache.org/jira/browse/TAJO-717


Repository: tajo


Description
-------

Currently, the storageManager invokes getFileBlockStorageLocations() per input path, which results in too many RPCs to the associated datanodes.


Diffs (updated)
-----

  tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java b2adaa4a2623b49022f4dac716dbf5457e93b71d 
  tajo-storage/pom.xml b9a162ad3bcfa46b42ff70a47972b991f5ed4283 
  tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed9817749787bfddcb719913ab03021aa19c6c 
  tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f16b93ac5d8f32bcbadd5e4359dac8ad9b 
  tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1e5cce58fc236a8f372f654ec78aa24b85 
  tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a27545c95b56dd2bece8d1c0620593a6f8 

Diff: https://reviews.apache.org/r/19821/diff/


Testing
-------


Thanks,

Jinho Kim