Posted to user@hbase.apache.org by Stuti Awasthi <st...@hcl.com> on 2011/11/16 08:41:03 UTC
Facing Issues with RowCounter
Hi,
I tried to use the MapReduce RowCounter job to count the rows of a table for a specific column family, but it does not display the correct result.
Command (only the table name as argument): Hbase/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter Keyword
Output:
11/11/16 13:04:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
11/11/16 13:04:32 INFO mapred.JobClient: map 100% reduce 0%
11/11/16 13:04:32 INFO mapred.JobClient: Job complete: job_local_0001
11/11/16 13:04:32 INFO mapred.JobClient: Counters: 6
11/11/16 13:04:32 INFO mapred.JobClient:   org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
11/11/16 13:04:32 INFO mapred.JobClient:     ROWS=7
11/11/16 13:04:32 INFO mapred.JobClient:   FileSystemCounters
11/11/16 13:04:32 INFO mapred.JobClient:     FILE_BYTES_READ=2373099
11/11/16 13:04:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2411923
11/11/16 13:04:32 INFO mapred.JobClient:   Map-Reduce Framework
11/11/16 13:04:32 INFO mapred.JobClient:     Map input records=7
11/11/16 13:04:32 INFO mapred.JobClient:     Spilled Records=0
11/11/16 13:04:32 INFO mapred.JobClient:     Map output records=0
Command (table name and column family as arguments): Hbase/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter Keyword Set
Output:
11/11/16 13:05:33 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
11/11/16 13:05:34 INFO mapred.JobClient: map 100% reduce 0%
11/11/16 13:05:34 INFO mapred.JobClient: Job complete: job_local_0001
11/11/16 13:05:34 INFO mapred.JobClient: Counters: 5
11/11/16 13:05:34 INFO mapred.JobClient:   FileSystemCounters
11/11/16 13:05:34 INFO mapred.JobClient:     FILE_BYTES_READ=2373107
11/11/16 13:05:34 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=2411939
11/11/16 13:05:34 INFO mapred.JobClient:   Map-Reduce Framework
11/11/16 13:05:34 INFO mapred.JobClient:     Map input records=7
11/11/16 13:05:34 INFO mapred.JobClient:     Spilled Records=0
11/11/16 13:05:34 INFO mapred.JobClient:     Map output records=0
Output of the table describe command:
TABLE => {{NAME => 'Keyword', FAMILIES => [{NAME => 'Info', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'Set', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
Am I running it the wrong way, or is this a bug?
Regards,
Stuti Awasthi
HCL Comnet Systems and Services Ltd
F-8/9 Basement, Sec-3,Noida.
________________________________
::DISCLAIMER::
-----------------------------------------------------------------------------------------------------------------------
The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
It shall not attach any liability on the originator or HCL or its affiliates. Any views or opinions presented in
this email are solely those of the author and may not necessarily reflect the opinions of HCL or its affiliates.
Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of
this message without the prior written consent of the author of this e-mail is strictly prohibited. If you have
received this email in error please delete it and notify the sender immediately. Before opening any mail and
attachments please check them for viruses and defect.
-----------------------------------------------------------------------------------------------------------------------
RE: Facing Issues with RowCounter
Posted by Stuti Awasthi <st...@hcl.com>.
Ok.
Thanks for the update. I'll check the patch; otherwise I can write my own MR job for the row count.
Cheers
Stuti
-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Friday, November 18, 2011 3:37 AM
To: user@hbase.apache.org
Subject: Re: Facing Issues with RowCounter
Ah! Took me a moment to figure it out, it's:
https://issues.apache.org/jira/browse/HBASE-4295 "rowcounter does not return the correct number of rows in certain circumstances"
What made me think about it is that your counters do say that rows were taken into input, but none counted because the values are empty.
That was the problem in 4295.
The patch is currently only in the tip of the 0.90 branch, so unless you patch it yourself you'll have to wait for 0.90.5 (which may or may not get released, depends if someone wants to do it).
J-D
Re: Facing Issues with RowCounter
Posted by Jean-Daniel Cryans <jd...@apache.org>.
Awesome! Thanks for the feedback!
J-D
On Thu, Nov 17, 2011 at 11:07 PM, Stuti Awasthi <st...@hcl.com> wrote:
> Hi JD,
> I have applied the patch and tested it; it's working fine now. :) Thanks
RE: Facing Issues with RowCounter
Posted by Stuti Awasthi <st...@hcl.com>.
Hi JD,
I have applied the patch and tested it; it's working fine now. :) Thanks
-----Original Message-----
From: Stuti Awasthi
Sent: Friday, November 18, 2011 11:27 AM
To: user@hbase.apache.org
Subject: RE: Facing Issues with RowCounter
Ok.
Thanks for the update. I'll check the patch; otherwise I can write my own MR job for the row count.
Cheers
Stuti
Re: Facing Issues with RowCounter
Posted by Jean-Daniel Cryans <jd...@apache.org>.
Ah! Took me a moment to figure it out, it's:
https://issues.apache.org/jira/browse/HBASE-4295 "rowcounter does not
return the correct number of rows in certain circumstances"
What made me think about it is that your counters do say that rows
were taken into input, but none counted because the values are empty.
That was the problem in 4295.
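
To make that symptom concrete, here is a minimal Python sketch. It is a simulation and not the HBase source: the two counting functions and their names are hypothetical, and the rule "count a row only if some cell has a non-empty value" is an assumption based on the description of HBASE-4295 given here. The data mirrors the 'Set' scan posted in this thread, where every cell value is empty.

```python
# Illustrative simulation only; this is NOT the HBase RowCounter source.
# Each row key maps to its (column, value) cells, mirroring the 'Set'
# family scan from this thread, in which every cell value is empty.
rows = {
    "Apache": [("Set:Fuse", b""), ("Set:Hadoop", b""), ("Set:Hive", b""),
               ("Set:MySql", b""), ("Set:PHP", b"")],
    "Fuse": [("Set:Apache", b""), ("Set:Hdfs", b"")],
    "Hadoop": [("Set:Apache", b""), ("Set:Hive", b"")],
    "Hdfs": [("Set:Fuse", b"")],
    "Hive": [("Set:Apache", b""), ("Set:Hadoop", b"")],
    "MySql": [("Set:Apache", b""), ("Set:PHP", b"")],
    "PHP": [("Set:Apache", b""), ("Set:MySql", b"")],
}


def count_rows_requiring_value(table):
    """Hypothetical buggy rule: a row counts only if a cell has a non-empty value."""
    return sum(
        1 for cells in table.values()
        if any(len(value) > 0 for _, value in cells)
    )


def count_rows_with_any_cell(table):
    """Hypothetical fixed rule: a row counts as soon as it has at least one cell."""
    return sum(1 for cells in table.values() if cells)


map_input_records = len(rows)              # rows handed to the mapper
buggy = count_rows_requiring_value(rows)   # rows counted under the assumed buggy rule
fixed = count_rows_with_any_cell(rows)     # rows counted once empty values are accepted

print(map_input_records, buggy, fixed)
```

With this data the value-checking counter reports 0 rows even though 7 records are read, which is consistent with "Map input records=7" and no ROWS counter in the second job output.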
The patch is currently only in the tip of the 0.90 branch, so unless
you patch it yourself you'll have to wait for 0.90.5 (which may or may
not get released, depends if someone wants to do it).
J-D
On Wed, Nov 16, 2011 at 9:27 PM, Stuti Awasthi <st...@hcl.com> wrote:
> Hi JD,
>
> Table 'Keyword' contains the 'Set' column family with 7 rows. Here is the output of the scan:
>
> hbase(main):001:0> scan 'Keyword',{COLUMNS=>['Set']}
> ROW COLUMN+CELL
> Apache column=Set:Fuse, timestamp=1321506922206, value=
> Apache column=Set:Hadoop, timestamp=1321506922206, value=
> Apache column=Set:Hive, timestamp=1321506922206, value=
> Apache column=Set:MySql, timestamp=1321506922206, value=
> Apache column=Set:PHP, timestamp=1321506922206, value=
> Fuse column=Set:Apache, timestamp=1321506922206, value=
> Fuse column=Set:Hdfs, timestamp=1321506922209, value=
> Hadoop column=Set:Apache, timestamp=1321506922209, value=
> Hadoop column=Set:Hive, timestamp=1321506922212, value=
> Hdfs column=Set:Fuse, timestamp=1321506922212, value=
> Hive column=Set:Apache, timestamp=1321506922212, value=
> Hive column=Set:Hadoop, timestamp=1321506922214, value=
> MySql column=Set:Apache, timestamp=1321506922214, value=
> MySql column=Set:PHP, timestamp=1321506922216, value=
> PHP column=Set:Apache, timestamp=1321506922216, value=
> PHP column=Set:MySql, timestamp=1321506922218, value=
> 7 row(s) in 0.4120 seconds
>
> This row count is not reflected in the RowCounter MR job output.
RE: Facing Issues with RowCounter
Posted by Stuti Awasthi <st...@hcl.com>.
Hi JD,
Table 'Keyword' contains the 'Set' column family with 7 rows. Here is the output of the scan:
hbase(main):001:0> scan 'Keyword',{COLUMNS=>['Set']}
ROW COLUMN+CELL
Apache column=Set:Fuse, timestamp=1321506922206, value=
Apache column=Set:Hadoop, timestamp=1321506922206, value=
Apache column=Set:Hive, timestamp=1321506922206, value=
Apache column=Set:MySql, timestamp=1321506922206, value=
Apache column=Set:PHP, timestamp=1321506922206, value=
Fuse column=Set:Apache, timestamp=1321506922206, value=
Fuse column=Set:Hdfs, timestamp=1321506922209, value=
Hadoop column=Set:Apache, timestamp=1321506922209, value=
Hadoop column=Set:Hive, timestamp=1321506922212, value=
Hdfs column=Set:Fuse, timestamp=1321506922212, value=
Hive column=Set:Apache, timestamp=1321506922212, value=
Hive column=Set:Hadoop, timestamp=1321506922214, value=
MySql column=Set:Apache, timestamp=1321506922214, value=
MySql column=Set:PHP, timestamp=1321506922216, value=
PHP column=Set:Apache, timestamp=1321506922216, value=
PHP column=Set:MySql, timestamp=1321506922218, value=
7 row(s) in 0.4120 seconds
This row count is not reflected in the RowCounter MR job output.
-----Original Message-----
From: jdcryans@gmail.com [mailto:jdcryans@gmail.com] On Behalf Of Jean-Daniel Cryans
Sent: Wednesday, November 16, 2011 11:09 PM
To: user@hbase.apache.org
Subject: Re: Facing Issues with RowCounter
What I can decipher from those outputs is that you have a total of 7 rows, and none of them have data in the "Set" column family. Is that the case or not? Without more info from you, it's hard to tell.
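
One way to read the two job outputs side by side is sketched below in Python. This is only an illustration: `parse_counters` is a hypothetical helper, and the two log excerpts are shortened copies of the JobClient lines from the original post.

```python
import re


def parse_counters(log_text):
    """Collect every NAME=value pair printed on a mapred.JobClient log line."""
    counters = {}
    for line in log_text.splitlines():
        match = re.search(r"mapred\.JobClient:\s*([^=]+)=(\d+)\s*$", line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters


# Shortened excerpt of the run with only the table name as argument.
table_only_log = """\
11/11/16 13:04:32 INFO mapred.JobClient: Counters: 6
11/11/16 13:04:32 INFO mapred.JobClient: ROWS=7
11/11/16 13:04:32 INFO mapred.JobClient: Map input records=7
"""

# Shortened excerpt of the run with the table name and the 'Set' family.
with_family_log = """\
11/11/16 13:05:34 INFO mapred.JobClient: Counters: 5
11/11/16 13:05:34 INFO mapred.JobClient: Map input records=7
11/11/16 13:05:34 INFO mapred.JobClient: Map output records=0
"""

table_only = parse_counters(table_only_log)
with_family = parse_counters(with_family_log)

# A missing ROWS counter is treated as zero counted rows.
rows_table_only = table_only.get("ROWS", 0)
rows_with_family = with_family.get("ROWS", 0)

print(rows_table_only, rows_with_family, with_family["Map input records"])
```

In both runs the mapper reads 7 input records, but the ROWS counter only appears in the run without a column family.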
J-D
Re: Facing Issues with RowCounter
Posted by Jean-Daniel Cryans <jd...@apache.org>.
What I can decipher from those outputs is that you have a total of 7
rows, and none of them have data in the "Set" column family. Is that the
case? Without more info from you, it's hard to tell.
J-D
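The comparison behind that reading can be sketched in a few lines of Python. This is only an illustration, not part of the thread: the log lines are copied from the two job outputs quoted in this thread, parse_counters is a made-up helper, and treating a counter that is absent from the log as zero is an assumption about how the job client prints counters.

```python
import re

# Counter lines from the run with only the table name as argument.
RUN_TABLE_ONLY = """\
11/11/16 13:04:32 INFO mapred.JobClient: Counters: 6
11/11/16 13:04:32 INFO mapred.JobClient: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
11/11/16 13:04:32 INFO mapred.JobClient: ROWS=7
11/11/16 13:04:32 INFO mapred.JobClient: FileSystemCounters
11/11/16 13:04:32 INFO mapred.JobClient: FILE_BYTES_READ=2373099
11/11/16 13:04:32 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2411923
11/11/16 13:04:32 INFO mapred.JobClient: Map-Reduce Framework
11/11/16 13:04:32 INFO mapred.JobClient: Map input records=7
11/11/16 13:04:32 INFO mapred.JobClient: Spilled Records=0
11/11/16 13:04:32 INFO mapred.JobClient: Map output records=0
"""

# Counter lines from the run with table name and the 'Set' column family.
RUN_WITH_FAMILY = """\
11/11/16 13:05:34 INFO mapred.JobClient: Counters: 5
11/11/16 13:05:34 INFO mapred.JobClient: FileSystemCounters
11/11/16 13:05:34 INFO mapred.JobClient: FILE_BYTES_READ=2373107
11/11/16 13:05:34 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2411939
11/11/16 13:05:34 INFO mapred.JobClient: Map-Reduce Framework
11/11/16 13:05:34 INFO mapred.JobClient: Map input records=7
11/11/16 13:05:34 INFO mapred.JobClient: Spilled Records=0
11/11/16 13:05:34 INFO mapred.JobClient: Map output records=0
"""

# A counter line ends in NAME=<integer>; group headers have no '='.
COUNTER = re.compile(r"JobClient:\s+(?P<name>[^=]+)=(?P<value>\d+)\s*$")


def parse_counters(log_text):
    """Return {counter name: integer value} for every NAME=value log line."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER.search(line)
        if match is not None:
            counters[match["name"].strip()] = int(match["value"])
    return counters


for label, log in (("table only", RUN_TABLE_ONLY), ("table + Set", RUN_WITH_FAMILY)):
    counters = parse_counters(log)
    # Assumption: a counter missing from the log was never incremented.
    rows_counted = counters.get("ROWS", 0)
    print(f"{label}: map input records={counters['Map input records']}, ROWS={rows_counted}")
```

Both runs show 7 map input records, but only the first run prints a ROWS counter, which is the discrepancy being asked about.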
On Tue, Nov 15, 2011 at 11:41 PM, Stuti Awasthi <st...@hcl.com> wrote:
> Hi,
> I tried to use the MR RowCounter to count the rows of a table for a specific column family, but it is not displaying the correct result.
>
> Command (Only Table Name as argument ): Hbase/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter Keyword
> Output :
> 11/11/16 13:04:31 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
> 11/11/16 13:04:32 INFO mapred.JobClient: map 100% reduce 0%
> 11/11/16 13:04:32 INFO mapred.JobClient: Job complete: job_local_0001
> 11/11/16 13:04:32 INFO mapred.JobClient: Counters: 6
> 11/11/16 13:04:32 INFO mapred.JobClient: org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters
> 11/11/16 13:04:32 INFO mapred.JobClient: ROWS=7
> 11/11/16 13:04:32 INFO mapred.JobClient: FileSystemCounters
> 11/11/16 13:04:32 INFO mapred.JobClient: FILE_BYTES_READ=2373099
> 11/11/16 13:04:32 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2411923
> 11/11/16 13:04:32 INFO mapred.JobClient: Map-Reduce Framework
> 11/11/16 13:04:32 INFO mapred.JobClient: Map input records=7
> 11/11/16 13:04:32 INFO mapred.JobClient: Spilled Records=0
> 11/11/16 13:04:32 INFO mapred.JobClient: Map output records=0
>
> Command (TableName, ColumnFamily): Hbase/hbase-0.90.3/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter Keyword Set
>
> Output :
> 11/11/16 13:05:33 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
> 11/11/16 13:05:34 INFO mapred.JobClient: map 100% reduce 0%
> 11/11/16 13:05:34 INFO mapred.JobClient: Job complete: job_local_0001
> 11/11/16 13:05:34 INFO mapred.JobClient: Counters: 5
> 11/11/16 13:05:34 INFO mapred.JobClient: FileSystemCounters
> 11/11/16 13:05:34 INFO mapred.JobClient: FILE_BYTES_READ=2373107
> 11/11/16 13:05:34 INFO mapred.JobClient: FILE_BYTES_WRITTEN=2411939
> 11/11/16 13:05:34 INFO mapred.JobClient: Map-Reduce Framework
> 11/11/16 13:05:34 INFO mapred.JobClient: Map input records=7
> 11/11/16 13:05:34 INFO mapred.JobClient: Spilled Records=0
> 11/11/16 13:05:34 INFO mapred.JobClient: Map output records=0
>
> Table Describe command Output is :
> TABLE => {{NAME => 'Keyword', FAMILIES => [{NAME => 'Info', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'Set', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
>
> Am I running it the wrong way, or is this a bug?
>
> Regards,
> Stuti Awasthi
> HCL Comnet Systems and Services Ltd
> F-8/9 Basement, Sec-3,Noida.
>
>