Posted to dev@hbase.apache.org by "Kay Kay (JIRA)" <ji...@apache.org> on 2010/02/10 02:56:28 UTC
[jira] Created: (HBASE-2208) TableServers # processBatchOfRows -
converts from List to [ ] - Expensive copy
TableServers # processBatchOfRows - converts from List to [ ] - Expensive copy
------------------------------------------------------------------------------------
Key: HBASE-2208
URL: https://issues.apache.org/jira/browse/HBASE-2208
Project: Hadoop HBase
Issue Type: Improvement
Reporter: Kay Kay
With autoFlush set to false and a large write buffer on HTable, bulk puts go through TableServers#processBatchOfRows, which converts the input List to an array ([ ]) before sending it down the wire.
With a write buffer as large as 20 MB, that becomes an expensive copy when we call list.toArray(new T[ ]).
Maybe we should change the wire protocol to support List as well, and then revisit this to avoid the bulk copy?
{code}
Batch b = new Batch(this) {
  @Override
  int doCall(final List<Row> currentList, final byte [] row,
      final byte [] tableName)
  throws IOException, RuntimeException {
    // The line in question: copies the whole buffered list into a new array.
    final Put [] puts = currentList.toArray(PUT_ARRAY_TYPE);
    return getRegionServerWithRetries(new ServerCallable<Integer>(this.c,
        tableName, row) {
      public Integer call() throws IOException {
        // The RPC method takes an array, hence the conversion above.
        return server.put(location.getRegionInfo().getRegionName(), puts);
      }
    });
  }
};
{code}
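For readers unfamiliar with what the toArray call above actually does, here is a minimal, self-contained sketch. It is not HBase code: the class name ToArrayCopySketch, the ARRAY_TYPE constant (a stand-in for PUT_ARRAY_TYPE, assumed here to be an empty typed array) and the use of byte[] in place of Put are all illustrative assumptions. It shows that the call allocates a second array with one slot per buffered element and copies the references into it, while the element payloads themselves are shared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of HBase.
class ToArrayCopySketch {
    // Stand-in for PUT_ARRAY_TYPE (assumption: an empty typed array).
    static final byte[][] ARRAY_TYPE = new byte[0][];

    // Same shape as currentList.toArray(PUT_ARRAY_TYPE): for a non-empty list
    // this allocates a new array and copies one reference per element.
    static byte[][] toArrayCopy(List<byte[]> buffered) {
        return buffered.toArray(ARRAY_TYPE);
    }

    public static void main(String[] args) {
        List<byte[]> buffered = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            buffered.add(new byte[] {(byte) i});
        }
        byte[][] copy = toArrayCopy(buffered);
        // The array is new, but each slot points at the same element object.
        System.out.println("same length: " + (copy.length == buffered.size()));
        System.out.println("payload shared: " + (copy[0] == buffered.get(0)));
    }
}
```

In this sketch the extra work grows with the number of buffered elements rather than with their total byte size, since only references are copied.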
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831853#action_12831853 ]
stack commented on HBASE-2208:
------------------------------
No, just that you'll be changing the interface. To do that you need to up the RPC version. Upping the RPC version can only happen in a major release, i.e. 0.21.0.
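As a rough illustration of why a changed method signature goes together with a version bump, here is a minimal sketch of a version check between client and server. Everything in it is hypothetical (the class RpcVersionSketch, the constant SERVER_VERSION and the method isCompatible); it is not HBase's actual RPC code, only the general idea of refusing a client built against a different interface version.

```java
// Hypothetical sketch; names are illustrative, not HBase's RPC classes.
class RpcVersionSketch {
    // Version of the interface the server speaks (assumed value). It would be
    // bumped whenever a method signature sent over the wire changes.
    static final long SERVER_VERSION = 2L;

    // A client is accepted only if it was built against the same version.
    static boolean isCompatible(long clientVersion) {
        return clientVersion == SERVER_VERSION;
    }

    public static void main(String[] args) {
        System.out.println("old client accepted: " + isCompatible(1L));
        System.out.println("current client accepted: " + isCompatible(SERVER_VERSION));
    }
}
```

Under this assumption, a client from the previous release would be rejected outright, which is why such a change is kept for a major release.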
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831816#action_12831816 ]
Kay Kay commented on HBASE-2208:
--------------------------------
Syntactic sugar for this patch aside, the copy is due to the fundamental limitation of not allowing List<T> across the wire. HBASE-2209 tracks that separately.
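To make the idea of sending a List across the wire concrete, here is a small sketch that serializes a list by iterating over it directly, with no intermediate array. It is only an illustration under stated assumptions: the class ListWireSketch and its methods are hypothetical, byte[] stands in for Put, and the count-then-length-prefixed layout is invented for the example rather than taken from HBase's wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the layout below is invented for illustration.
class ListWireSketch {
    // Writes the element count, then each element as length + bytes,
    // walking the list itself instead of calling toArray first.
    static byte[] writeList(List<byte[]> rows) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(rows.size());
            for (byte[] row : rows) {
                out.writeInt(row.length);
                out.write(row);
            }
            out.flush();
            // An in-memory buffer keeps the sketch self-contained; a real RPC
            // would write to the connection's output stream instead.
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the same layout back into a list on the receiving side.
    static List<byte[]> readList(byte[] wire) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
            int count = in.readInt();
            List<byte[]> rows = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                byte[] row = new byte[in.readInt()];
                in.readFully(row);
                rows.add(row);
            }
            return rows;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A round trip such as readList(writeList(rows)) returns a list with the same elements in the same order.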
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832635#action_12832635 ]
stack commented on HBASE-2208:
------------------------------
FYI: HBASE-2209 was committed.
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831857#action_12831857 ]
Kay Kay commented on HBASE-2208:
--------------------------------
{quote}
Upping rpc version can only happen over in a major release, i.e. 0.21.0
{quote}
Oh yes. I did not mean for this to go into a minor release, since I am aware it would break wire protocol compatibility. Let me change the target version on the JIRA as well.
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "stack (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831842#action_12831842 ]
stack commented on HBASE-2208:
------------------------------
That'd be fine in 0.21
[jira] Updated: (HBASE-2208) TableServers # processBatchOfRows -
converts from List to [ ] - Expensive copy
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kay Kay updated HBASE-2208:
---------------------------
Fix Version/s: 0.21.0
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12832639#action_12832639 ]
Kay Kay commented on HBASE-2208:
--------------------------------
Thanks, stack, for committing HBASE-2209. I will revisit this with the signature changes to prevent the copy. Sorry for the separate JIRA; I just get nervous about big patches.
[jira] Commented: (HBASE-2208) TableServers # processBatchOfRows
- converts from List to [ ] - Expensive copy
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831849#action_12831849 ]
Kay Kay commented on HBASE-2208:
--------------------------------
Did you mean that Avro or something similar will replace the RPC, as discussed by Ryan on IRC?
I was looking at trunk.
[jira] Updated: (HBASE-2208) TableServers # processBatchOfRows -
converts from List to [ ] - Expensive copy
Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HBASE-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kay Kay updated HBASE-2208:
---------------------------
*not backward compatible*