Posted to dev@apex.apache.org by Bhupesh Chawda <bh...@datatorrent.com> on 2015/12/21 10:33:05 UTC

HBase Input Operators

Hi All,

Is there any reason why the HBase input operators in Apex Malhar contrib do
the entire table scan in the emitTuples() method? Shouldn't each call just
emit a single row? As written, the operator seems to send the entire table
in a single call of emitTuples().

Here is the code fragment from HBaseScanOperator:

  @Override
  public void emitTuples()
  {
    try {
      HTable table = getTable();
      Scan scan = operationScan();
      ResultScanner scanner = table.getScanner(scan);
      for (Result result : scanner) {
        //KeyValue[] kvs = result.raw();
        //T t = getTuple(kvs);
        T t = getTuple(result);
        outputPort.emit(t);
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }

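For comparison, here is a minimal sketch of the incremental alternative the question points at: keep the scan cursor open between calls and emit only a bounded number of rows per emitTuples() call. It is plain Java with no HBase or Malhar dependency. The class name BatchedScanSketch, the emitBatchSize parameter, and the Consumer standing in for outputPort.emit() are all assumptions made for illustration, not the operator's actual API.

```java
import java.util.Iterator;
import java.util.function.Consumer;

// Sketch only: a plain-Java stand-in for an input operator, not the Malhar or HBase API.
class BatchedScanSketch<T> {
  private final Iterable<T> source;  // hypothetical stand-in for the table being scanned
  private final int emitBatchSize;   // hypothetical cap on rows emitted per emitTuples() call
  private final Consumer<T> output;  // hypothetical stand-in for outputPort.emit(...)
  private Iterator<T> cursor;        // kept across calls, playing the role of an open scanner

  BatchedScanSketch(Iterable<T> source, int emitBatchSize, Consumer<T> output) {
    if (emitBatchSize <= 0) {
      throw new IllegalArgumentException("emitBatchSize must be positive");
    }
    this.source = source;
    this.emitBatchSize = emitBatchSize;
    this.output = output;
  }

  // Emits at most emitBatchSize rows, then returns so the caller can do other work.
  void emitTuples() {
    if (cursor == null) {
      cursor = source.iterator();    // open the "scan" once, on first use
    }
    for (int i = 0; i < emitBatchSize && cursor.hasNext(); i++) {
      output.accept(cursor.next());
    }
  }

  // True once the scan has started and every row has been emitted.
  boolean isExhausted() {
    return cursor != null && !cursor.hasNext();
  }
}
```

With a batch size of 1 this reduces to one row per call. A real operator would presumably also need to close the scanner when the scan finishes or the operator is torn down, which this sketch leaves out.
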
Thanks.
Bhupesh

Re: HBase Input Operators

Posted by Sandeep Deshmukh <sa...@datatorrent.com>.
Good point, Bhupesh. Should we have maxTuplesPerWindow or emitBatchSize?

Input operators in Malhar use different configuration properties for this:
1) maxTuplesPerWindow, as in the Kafka input operator
2) emitBatchSize, as in AbstractFileInputOperator
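
The two limits in the list above can coexist. The sketch below assumes that emitBatchSize caps the rows emitted per emitTuples() call and that maxTuplesPerWindow caps the rows emitted per streaming window, with the counter reset in beginWindow(). That reading of the property names, the class WindowLimitSketch, and the in-memory list used as the output are assumptions for illustration, not the Malhar implementation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch only: shows a per-call cap and a per-window cap working together.
class WindowLimitSketch {
  private final Iterator<String> rows;       // hypothetical stand-in for the data source
  private final int emitBatchSize;           // assumed meaning: max rows per emitTuples() call
  private final int maxTuplesPerWindow;      // assumed meaning: max rows per streaming window
  private final List<String> emitted = new ArrayList<>(); // stand-in for the output port
  private int emittedThisWindow;

  WindowLimitSketch(Iterator<String> rows, int emitBatchSize, int maxTuplesPerWindow) {
    this.rows = rows;
    this.emitBatchSize = emitBatchSize;
    this.maxTuplesPerWindow = maxTuplesPerWindow;
  }

  // Called at the start of each window: the per-window counter starts over.
  void beginWindow(long windowId) {
    emittedThisWindow = 0;
  }

  // Called repeatedly within a window: emits up to the smaller of the two remaining limits.
  void emitTuples() {
    int remainingInWindow = maxTuplesPerWindow - emittedThisWindow;
    int limit = Math.min(emitBatchSize, remainingInWindow);
    for (int i = 0; i < limit && rows.hasNext(); i++) {
      emitted.add(rows.next());
      emittedThisWindow++;
    }
  }

  List<String> emitted() {
    return emitted;
  }
}
```

Under these assumptions the batch size bounds how long a single call runs, while the window limit bounds throughput, so an operator could reasonably expose both.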

Going forward, should we have some guidelines on which parameters input
operators should define? Limits based on data size rather than number of
tuples, such as bandwidth control, should also be supported by input operators.
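
A size-based limit of the kind suggested above could look roughly like the following sketch, which budgets bytes per window instead of counting tuples. The class ByteBudgetSketch, the maxBytesPerWindow parameter, and the use of UTF-8 string length as the tuple size are all hypothetical choices made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.function.Consumer;

// Sketch only: limits emitted data by size per window rather than by tuple count.
class ByteBudgetSketch {
  private final Iterator<String> rows;     // hypothetical stand-in for the data source
  private final long maxBytesPerWindow;    // hypothetical byte budget per streaming window
  private final Consumer<String> output;   // hypothetical stand-in for outputPort.emit(...)
  private String pending;                  // row already read but held back for lack of budget
  private long bytesThisWindow;

  ByteBudgetSketch(Iterator<String> rows, long maxBytesPerWindow, Consumer<String> output) {
    this.rows = rows;
    this.maxBytesPerWindow = maxBytesPerWindow;
    this.output = output;
  }

  // Called at the start of each window: the byte budget is replenished.
  void beginWindow(long windowId) {
    bytesThisWindow = 0;
  }

  // Emits rows until the next one would exceed the window's byte budget.
  void emitTuples() {
    while (pending != null || rows.hasNext()) {
      if (pending == null) {
        pending = rows.next();
      }
      long size = pending.getBytes(StandardCharsets.UTF_8).length;
      boolean nothingSentYet = bytesThisWindow == 0;
      // Always let a row through when nothing has been sent in this window,
      // so a single row larger than the whole budget cannot stall the operator.
      if (!nothingSentYet && bytesThisWindow + size > maxBytesPerWindow) {
        return; // keep the row in 'pending' for the next window
      }
      output.accept(pending);
      bytesThisWindow += size;
      pending = null;
    }
  }
}
```

One design choice worth noting: the row that does not fit is held in a field rather than dropped, so no data is lost when the budget runs out mid-scan.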

Regards,
Sandeep
