Posted to user@hbase.apache.org by Bing Jiang <ji...@gmail.com> on 2015/02/05 04:26:14 UTC
Re: Hbase scan using TIMERANGE
Hi Ted,
Do you know whether there is any optimization for scans with a TimeRange?
If I set a sparse TimeRange and large scan cache, it sometimes causes an RPC
timeout.
I would also like to know whether the scan has to read each KV to check
its timestamp.
Thanks,
-Bing
2014-06-28 21:25 GMT+08:00 Ted Yu <yu...@gmail.com>:
> Have you looked at the following method in AggregationClient?
>
>   long rowCount(final HTable table,
>       final ColumnInterpreter<R, S, P, Q, T> ci, final Scan scan)
>       throws Throwable {
>
> You can specify timerange through scan parameter.
>
> See this method of Scan:
>
> public Scan setTimeRange(long minStamp, long maxStamp)
>
> Cheers
>
>
> On Sat, Jun 28, 2014 at 3:42 AM, yogi <yo...@gmail.com> wrote:
>
> > Hi,
> >
> > I need to write a shell script that scans about 6 huge HBase tables and
> > gets the count of records in each of them. I also need per-day counts,
> > where I pass a date parameter to the shell script that calls these scan
> > commands. I found a way to convert the date to epoch time and pass it to
> > the scan command, but the scan keeps running forever. Can someone help me
> > make this faster?
> >
> > Note: I am scanning the tables based on TIMERANGE, as all the tables have
> > this field.
> >
> > Thanks,
> > Yogi
> >
> > --
> > View this message in context:
> > http://apache-hbase.679495.n3.nabble.com/Hbase-scan-using-TIMERANGE-tp4060851.html
> > Sent from the HBase User mailing list archive at Nabble.com.
> >
>
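For the per-day counts asked about in the quoted message, the date-to-epoch step could look roughly like the sketch below. It is only an illustration: DayRange and epochMillisFor are hypothetical names, not part of HBase, and it assumes the cell timestamps are epoch milliseconds and that a day is taken in the time zone you pass in. It produces the two values that would go into Scan.setTimeRange(minStamp, maxStamp), where the lower bound is inclusive and the upper bound is exclusive.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical helper (not part of HBase): turns a calendar day into the
// [minStamp, maxStamp) pair of epoch milliseconds used for a time-range scan.
final class DayRange {
    // Returns {startOfDayMillis, startOfNextDayMillis} for an ISO date
    // such as "2014-06-28", interpreted in the given time zone.
    static long[] epochMillisFor(String isoDate, ZoneId zone) {
        LocalDate day = LocalDate.parse(isoDate);
        long min = day.atStartOfDay(zone).toInstant().toEpochMilli();
        long max = day.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli();
        return new long[] { min, max };
    }

    public static void main(String[] args) {
        String date = args.length > 0 ? args[0] : "2014-06-28";
        long[] range = epochMillisFor(date, ZoneOffset.UTC);
        // These two values would be passed as minStamp and maxStamp.
        System.out.println(range[0] + " " + range[1]);
    }
}
```

Using the start of the next day as the exclusive upper bound avoids missing cells written in the last millisecond of the day.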
Re: Hbase scan using TIMERANGE
Posted by Bing Jiang <ji...@gmail.com>.
Many thanks for Ted's pointers.
Yes, the tight time range makes the scan very slow to fill the cache.
I will investigate HBASE-5032 further and will report back if there is any
progress or improvement.
Thank you!
-Bing
2015-02-05 11:34 GMT+08:00 Ted Yu <yu...@gmail.com>:
> bq. set a sparse TimeRange
>
> You mean a TimeRange whose span is short?
>
> bq. and large scan cache
>
> Can you try a smaller number of rows for caching?
>
> A preliminary search led me to HBASE-5032 'Add other DELETE type
> information into the delete bloom filter to optimize the time range query'
>
> Cheers
--
Bing Jiang
Re: Hbase scan using TIMERANGE
Posted by Ted Yu <yu...@gmail.com>.
bq. set a sparse TimeRange
You mean a TimeRange whose span is short?
bq. and large scan cache
Can you try a smaller number of rows for caching?
A preliminary search led me to HBASE-5032 'Add other DELETE type
information into the delete bloom filter to optimize the time range query'
Cheers
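
To make the caching suggestion above concrete, here is a toy model, not HBase code. SparseScanModel and its methods are hypothetical names, and the model assumes the worst case raised in the question: one cell per row, every cell checked against the time range, and no file-level skipping by the server. Under those assumptions it counts how many cells have to be inspected before one batch of `caching` matching rows is ready to return.

```java
// Toy model only (hypothetical, not HBase internals): assumes one cell per row
// and that every cell's timestamp is compared against [minStamp, maxStamp).
final class SparseScanModel {
    // Builds n synthetic timestamps that cycle through 0..period-1.
    static long[] syntheticTimestamps(int n, int period) {
        long[] ts = new long[n];
        for (int i = 0; i < n; i++) {
            ts[i] = i % period;
        }
        return ts;
    }

    // Counts the cells inspected before `caching` matching cells are collected,
    // or before the input is exhausted.
    static long cellsExaminedForOneBatch(long[] timestamps, long minStamp,
                                         long maxStamp, int caching) {
        if (caching <= 0) {
            return 0;
        }
        long examined = 0;
        int matched = 0;
        for (long ts : timestamps) {
            examined++;
            if (ts >= minStamp && ts < maxStamp) {
                matched++;
                if (matched == caching) {
                    break;
                }
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        // Only one cell in every 1000 falls inside the range [0, 1).
        long[] ts = syntheticTimestamps(1_000_000, 1000);
        for (int caching : new int[] { 10, 100 }) {
            System.out.println("caching=" + caching + " examined="
                + cellsExaminedForOneBatch(ts, 0L, 1L, caching));
        }
    }
}
```

In this model the work per batch grows in proportion to the caching value divided by the fraction of cells that match, which is one way to see why a smaller caching value may keep each RPC under the timeout when the time range matches few cells.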