Posted to issues@tajo.apache.org by Jinho Kim <jh...@apache.org> on 2014/03/29 11:50:49 UTC
Review Request 19821: TAJO-717: Improve file splitting for large number of
splits
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/
-----------------------------------------------------------
Review request for Tajo.
Bugs: TAJO-717
https://issues.apache.org/jira/browse/TAJO-717
Repository: tajo
Description
-------
Currently, the storage manager invokes getFileBlockStorageLocations() once per input path, which causes too many RPCs to the associated datanodes.
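To illustrate why per-path lookups get expensive, here is a minimal, self-contained sketch that compares one lookup per input path with a single batched lookup. It is only a model under stated assumptions: LocationService, CountingService, lookupPerPath and lookupBatched are hypothetical names invented for this example, not Tajo or HDFS types, and the sketch does not claim to reproduce what the patch actually changes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchedLocationLookup {

    /** Hypothetical stand-in for a file system client; not a Tajo or HDFS type. */
    interface LocationService {
        /** One call models one round of remote lookups for the given block ids. */
        Map<String, String> lookupStorage(List<String> blockIds);
    }

    /** Counts issued lookups so the two strategies can be compared. */
    static final class CountingService implements LocationService {
        int calls = 0;

        @Override
        public Map<String, String> lookupStorage(List<String> blockIds) {
            calls++;
            Map<String, String> result = new LinkedHashMap<>();
            for (String id : blockIds) {
                // Fake answer: derive a volume name from the block id.
                result.put(id, "volume-" + Math.floorMod(id.hashCode(), 4));
            }
            return result;
        }
    }

    /** Per-path strategy: one lookup for every input path. */
    static Map<String, String> lookupPerPath(LocationService svc,
                                             Map<String, List<String>> blocksByPath) {
        Map<String, String> all = new LinkedHashMap<>();
        for (List<String> blocks : blocksByPath.values()) {
            all.putAll(svc.lookupStorage(blocks));
        }
        return all;
    }

    /** Batched strategy: gather the blocks of all paths, then look them up once. */
    static Map<String, String> lookupBatched(LocationService svc,
                                             Map<String, List<String>> blocksByPath) {
        List<String> allBlocks = new ArrayList<>();
        for (List<String> blocks : blocksByPath.values()) {
            allBlocks.addAll(blocks);
        }
        return svc.lookupStorage(allBlocks);
    }

    public static void main(String[] args) {
        // 1000 hypothetical input paths with two blocks each.
        Map<String, List<String>> input = new LinkedHashMap<>();
        for (int i = 0; i < 1000; i++) {
            input.put("/table/part-" + i,
                      Arrays.asList("blk_" + i + "_0", "blk_" + i + "_1"));
        }
        CountingService perPath = new CountingService();
        CountingService batched = new CountingService();
        lookupPerPath(perPath, input);
        lookupBatched(batched, input);
        System.out.println("per-path lookups: " + perPath.calls);
        System.out.println("batched lookups:  " + batched.calls);
    }
}
```

In this model the per-path strategy issues as many lookups as there are input paths, while the batched strategy issues one, and both return the same mapping. A real client would still contact each datanode involved, so the actual saving depends on how the file system groups the requests.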
Diffs
-----
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java b2adaa4a2623b49022f4dac716dbf5457e93b71d
tajo-storage/pom.xml 9f144bb206820d23238532f158b511974f518592
tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed9817749787bfddcb719913ab03021aa19c6c
tajo-storage/src/main/java/org/apache/tajo/storage/CSVFile.java 116e25ce6df8eb5d8c70ef68e2d78a17dca1ef5b
tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f16b93ac5d8f32bcbadd5e4359dac8ad9b
tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java 2fd34554c2905ecb593555dd463a7e8290e4d0e6
tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a27545c95b56dd2bece8d1c0620593a6f8
Diff: https://reviews.apache.org/r/19821/diff/
Testing
-------
Thanks,
Jinho Kim
Re: Review Request 19821: TAJO-717: Improve file splitting for large number
of splits
Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/#review39978
-----------------------------------------------------------
Ship it!
+1
This patch includes a nice optimization of the splitting code and sufficient unit tests. The patch looks good to me.
- Hyunsik Choi
On April 7, 2014, 5:30 p.m., Jinho Kim wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19821/
> -----------------------------------------------------------
>
> (Updated April 7, 2014, 5:30 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-717
> https://issues.apache.org/jira/browse/TAJO-717
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Currently, the storage manager invokes getFileBlockStorageLocations() once per input path, which causes too many RPCs to the associated datanodes.
>
>
> Diffs
> -----
>
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 61fa84e
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java df8b31b
> tajo-storage/pom.xml b9a162a
> tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed981
> tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f
> tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1
> tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a
>
> Diff: https://reviews.apache.org/r/19821/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Jinho Kim
>
>
Re: Review Request 19821: TAJO-717: Improve file splitting for large number
of splits
Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/
-----------------------------------------------------------
(Updated April 7, 2014, 8:30 a.m.)
Review request for Tajo.
Changes
-------
I've fixed the 'Failed to connect to datanode' error.
Bugs: TAJO-717
https://issues.apache.org/jira/browse/TAJO-717
Repository: tajo
Description
-------
Currently, the storage manager invokes getFileBlockStorageLocations() once per input path, which causes too many RPCs to the associated datanodes.
Diffs (updated)
-----
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/DefaultTaskScheduler.java 61fa84e
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java df8b31b
tajo-storage/pom.xml b9a162a
tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed981
tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f
tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1
tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a
Diff: https://reviews.apache.org/r/19821/diff/
Testing
-------
Thanks,
Jinho Kim
Re: Review Request 19821: TAJO-717: Improve file splitting for large number
of splits
Posted by Hyunsik Choi <hy...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/#review39516
-----------------------------------------------------------
Ship it!
+1
Your patch seems to make getSplit() more efficient by reducing the number of remote calls to HDFS. That part is much better now. Ship it!
- Hyunsik Choi
On April 4, 2014, 2:07 p.m., Jinho Kim wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19821/
> -----------------------------------------------------------
>
> (Updated April 4, 2014, 2:07 p.m.)
>
>
> Review request for Tajo.
>
>
> Bugs: TAJO-717
> https://issues.apache.org/jira/browse/TAJO-717
>
>
> Repository: tajo
>
>
> Description
> -------
>
> Currently, the storage manager invokes getFileBlockStorageLocations() once per input path, which causes too many RPCs to the associated datanodes.
>
>
> Diffs
> -----
>
> tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java b2adaa4a2623b49022f4dac716dbf5457e93b71d
> tajo-storage/pom.xml b9a162ad3bcfa46b42ff70a47972b991f5ed4283
> tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed9817749787bfddcb719913ab03021aa19c6c
> tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f16b93ac5d8f32bcbadd5e4359dac8ad9b
> tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1e5cce58fc236a8f372f654ec78aa24b85
> tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a27545c95b56dd2bece8d1c0620593a6f8
>
> Diff: https://reviews.apache.org/r/19821/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Jinho Kim
>
>
Re: Review Request 19821: TAJO-717: Improve file splitting for large number
of splits
Posted by Jinho Kim <jh...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19821/
-----------------------------------------------------------
(Updated April 4, 2014, 5:07 a.m.)
Review request for Tajo.
Changes
-------
I have rebased on master
Bugs: TAJO-717
https://issues.apache.org/jira/browse/TAJO-717
Repository: tajo
Description
-------
Currently, the storage manager invokes getFileBlockStorageLocations() once per input path, which causes too many RPCs to the associated datanodes.
Diffs (updated)
-----
tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java b2adaa4a2623b49022f4dac716dbf5457e93b71d
tajo-storage/pom.xml b9a162ad3bcfa46b42ff70a47972b991f5ed4283
tajo-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java a7ed9817749787bfddcb719913ab03021aa19c6c
tajo-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java ea8bf9f16b93ac5d8f32bcbadd5e4359dac8ad9b
tajo-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFile.java bbb9df1e5cce58fc236a8f372f654ec78aa24b85
tajo-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java 083670a27545c95b56dd2bece8d1c0620593a6f8
Diff: https://reviews.apache.org/r/19821/diff/
Testing
-------
Thanks,
Jinho Kim