Posted to user@hbase.apache.org by Timo Schaepe <ti...@timoschaepe.de> on 2013/12/13 10:22:31 UTC

Problems with hbase.hregion.max.filesize

Hello,

While loading data into our cluster, I noticed some strange behavior of some regions that I don't understand.

Scenario:
We convert data from a MySQL database to HBase. The data is inserted with a Put into the specific HBase table. The row key is a timestamp. I know about the problem with timestamp keys, but for our requirements it works quite well. The problem is that there are some regions which just keep growing.

For example, see the table in the picture [1]. At first, all data was distributed across regions and nodes. Now the data is written into only one region, which keeps growing, and I can see no splitting at all. Currently the size of the big region is nearly 60 GB.

The HBase version is 0.94.11. I cannot understand why the splitting is not happening. In hbase-site.xml I limit hbase.hregion.max.filesize to 2 GB, and HBase accepted this value.

<property>
	<!--Loaded from hbase-site.xml-->
	<name>hbase.hregion.max.filesize</name>
	<value>2147483648</value>
</property>

First mystery: Hannibal shows me a split size of 10 GB (see screenshot).
Second mystery: HBase is not splitting some regions at either 2 GB or 10 GB.

Any ideas? Could the timestamp row key be causing this problem?

Thanks,

	Timo

[1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
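On the timestamp row key question raised above: whether or not it explains the missed splits, monotonically increasing keys do concentrate all writes on one region, and a common mitigation is salting the row key. A minimal sketch, not from this thread — the bucket count and key layout are hypothetical and would need tuning to the cluster:

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical bucket count; tune to the number of region servers

def salted_key(timestamp_ms: int) -> bytes:
    """Prefix a timestamp row key with a deterministic salt so that
    consecutive timestamps land in different regions instead of one
    hot region."""
    ts = str(timestamp_ms).encode("ascii")
    # Hash the key itself so the same timestamp always gets the same
    # salt, which keeps point reads by timestamp possible.
    bucket = int(hashlib.md5(ts).hexdigest(), 16) % NUM_BUCKETS
    return b"%02d-%s" % (bucket, ts)

# 100 consecutive timestamps scatter across many buckets:
keys = [salted_key(1386851456415 + i) for i in range(100)]
```

The trade-off is that a scan over a time range then has to fan out into one scan per bucket.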

Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Hey,

sorry for the delayed answer, I had a flight to San Francisco and was fighting jet lag. I am here on vacation; maybe I can attend some interesting talks about HBase/Hadoop :).

On 14.12.2013 at 13:14, lars hofhansl <la...@apache.org> wrote:

> Did you observe anything interesting with such a large Java heap?

Not really. We have used such a big heap from the beginning, so I don't have much experience with smaller heaps. But so far, the large heap works well.

> You said you have 3G for the memstore, most of the rest is for the block cache I assume.

Not exactly. The memstore is 2.4 GB and the rest is block cache. The relevant values in hbase-site.xml are:

hfile.block.cache.size = 0.77
hbase.regionserver.global.memstore.upperLimit  = 0.03
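As a sanity check, applying those two fractions to the 80 GB region server heap mentioned in this thread reproduces the 2.4 GB memstore figure:

```python
heap_gb = 80  # region server heap size, from this thread

# hfile.block.cache.size: fraction of the heap reserved for the block cache
block_cache_gb = heap_gb * 0.77
# hbase.regionserver.global.memstore.upperLimit: fraction for all memstores
memstore_gb = heap_gb * 0.03

print(f"block cache ~= {block_cache_gb:.1f} GB, memstore ~= {memstore_gb:.1f} GB")
# prints: block cache ~= 61.6 GB, memstore ~= 2.4 GB
```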

> Any long GC pauses, or other strange behavior?

Sometimes we have long GC pauses. For example, yesterday we had a 213-second GC pause on one region server, and it went down. But those long pauses are very, very rare. For GC we use the default setting in hbase-env.sh, I think:
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"

I cannot see any other strange behavior, apart from the problem with hbase.hregion.max.filesize and HBase not splitting automatically at 2 GB.
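For reference, the hbase.regionserver.fileSplitTimeout workaround suggested elsewhere in this thread would be a temporary override in hbase-site.xml. A sketch with an illustrative value — the default is 30000 ms (30 seconds), and it should be restored once the oversized regions have split:

```xml
<property>
	<name>hbase.regionserver.fileSplitTimeout</name>
	<!-- Illustrative value only: 300 s instead of the 30 s default. -->
	<value>300000</value>
</property>
```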

Thanks,

	Timo

> 
> Thanks.
> 
> -- Lars
> 
> 
> 
> ________________________________
> From: Timo Schaepe <ti...@timoschaepe.de>
> To: user@hbase.apache.org 
> Sent: Saturday, December 14, 2013 5:27 AM
> Subject: Re: Problems with hbase.hregion.max.filesize
> 
> 
> Sorry, I forgot our hardware configuration…
> 
> 1 NameNode/SecondaryNameNode/HBase master
> 31 Datanodes/Regionserver
> 
> All of them with
> 2x XEON E5-2640 2.5 GHz
> 128 GB RAM 
> /dev/sda 90 GB 
> /dev/sdb 1.1 TB 
> /dev/sdc 1.1 TB
> 
> where sda is an SSD for the system, and sdb and sdc are disks for HDFS/HBase
> 
> Heap size per region server: 80 GB
> 
> bye,
> 
>     Timo
> 
> 
> 
> On 14.12.2013 at 14:21, Timo Schaepe <ti...@timoschaepe.de> wrote:
> 
>> Hey,
>> 
>> @JM: Thanks for the hint about hbase.regionserver.fileSplitTimeout. At the moment (the import is currently running), and after I split the specific regions manually, we no longer have growing regions.
>> 
>> hbase hbck says everything is fine:
>> 0 inconsistencies detected.
>> Status: OK
>> 
>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
>> The relevant table name is data_1091.
>> 
>> Thanks for your time.
>> 
>>     Timo
>> 
>> On 13.12.2013 at 20:18, Ted Yu <yu...@gmail.com> wrote:
>> 
>>> Timo:
>>> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we can
>>> see what happened ?
>>> 
>>> Thanks
>>> 
>>> 
>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
>>> jean-marc@spaggiari.org> wrote:
>>> 
>>>> Try to increase hbase.regionserver.fileSplitTimeout but put it back to its
>>>> default value after.
>>>> 
>>>> Default value is 30 seconds. I think it's not normal for a split to take
>>>> more than that.
>>>> 
>>>> What is your hardware configuration?
>>>> 
>>>> Have you run hbck to see if everything is correct?
>>>> 
>>>> JM
>>>> 
>>>> 
>>>> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
>>>> 
>>>>> Hello again,
>>>>> 
>>>>> digging through the logs of the specific region server shows me this:
>>>>> 
>>>>> 2013-12-12 13:54:20,194 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.; Took too long to split the files and create the references, aborting split
>>>>> 
>>>>> This message appears two times, so it seems that HBase tried to split the
>>>>> region, but it failed. I don't know why. What does HBase do if a region
>>>>> split fails? Does it retry the split? I didn't find any new attempts in
>>>>> the log. I have now split the big regions manually, and this works. It
>>>>> also seems that HBase splits the new regions again to bring them down to
>>>>> the given limit.
>>>>> 
>>>>> But it is also a mystery to me why Hannibal shows a split size of 10 GB
>>>>> when I set 2 GB in hbase-site.xml…
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>>        Timo


Re: Problems with hbase.hregion.max.filesize

Posted by lars hofhansl <la...@apache.org>.
Did you observe anything interesting with such a large Java heap?
You said you have 3G for the memstore, most of the rest is for the block cache I assume.
Any long GC pauses, or other strange behavior?

Thanks.

-- Lars




Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Hey Azuryy Yu,

yep, checked the GC log, nothing there.

I think there is no special JVM configuration:

export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M -Xloggc:/home/hadoop/logs/gc-hbase.log $HBASE_GC_OPTS"

Thanks,

	Timo

On 14.12.2013 at 15:45, Azuryy Yu <az...@gmail.com> wrote:

> Such a large Java heap! Did you check the GC log? How did you configure
> the JVM options?


Re: Problems with hbase.hregion.max.filesize

Posted by Azuryy Yu <az...@gmail.com>.
Such a large Java heap! Did you check the GC log? How did you configure the
JVM options?

Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Sorry, I forgot our hardware configuration…

1 NameNode/SecondaryNameNode/HBase master
31 Datanodes/Regionserver

All of them with
2x XEON E5-2640 2.5 GHz
128 GB RAM 
/dev/sda 90 GB 
/dev/sdb 1.1 TB 
/dev/sdc 1.1 TB

where sda is an SSD for the system, and sdb and sdc are disks for HDFS/HBase

Heap size per region server: 80 GB

bye,

	Timo




Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Hey Enis,

thanks for the hint. I checked the logs, and all flushes just before the splitting were successful. All compactions also work fine.

I made another interesting observation: when I disable a table and then enable it again, HBase starts to split the big regions automatically.

bye,

	Timo


On 19.12.2013 at 18:33, Enis Söztutar <en...@apache.org> wrote:

> If the split takes too long (longer than 30 secs), I would say you may have
> too many store files in the region. A split has to write two tiny reference
> files per store file. The other possibility is that the region has to be
> closed before the split, so it has to do a flush first. If it cannot
> complete the flush in time, it might cancel the split as well. Did you
> check that? Are your compactions working as intended?
> 
> Enis
> 
> 
> On Wed, Dec 18, 2013 at 10:06 AM, Timo Schaepe <ti...@timoschaepe.de> wrote:
> 
>> @Ted Yu:
>> Yep, nevertheless thanks a lot!
>> 
>> 
>>> On 18.12.2013 at 10:03, Ted Yu <yu...@gmail.com> wrote:
>> 
>>> Timo:
>>> I went through the namenode log and didn't find much of a clue.
>>> 
>>> Cheers
>>> 
>>> 
>>> On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <ti...@timoschaepe.de>
>> wrote:
>>> 
>>>> Hey Ted Yu,
>>>> 
>>>> I have been digging through the namenode log, and so far I've found nothing
>>>> special. No exceptions, FATAL or ERROR messages, nor any other peculiarities.
>>>> I only see a lot of messages like this:
>>>> 
>>>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>>>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
>>>> 
>>>> But maybe that is normal. If you want to have a look, you can find the log
>>>> snippet at
>>>> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
>>>> 
>>>> Thanks,
>>>> 
>>>>       Timo
>>>> 
>>>> 
>>>> 
>>>> On 14.12.2013 at 09:12, Ted Yu <yu...@gmail.com> wrote:
>>>> 
>>>>> Timo:
>>>>> Other than two occurrences of 'Took too long to split the files'
>>>>> @ 13:54:20,194 and 13:55:10,533, I don't find much of a clue in the posted log.
>>>>> 
>>>>> If you have time, mind checking the namenode log for the 1-minute interval
>>>>> leading up to 13:54:20,194 and 13:55:10,533, respectively?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>>   Timo
>>>>>>>>>> 
>>>>>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Problems with hbase.hregion.max.filesize

Posted by Enis Söztutar <en...@apache.org>.
If the split takes too long (longer than 30 secs), I would say you may have
too many store files in the region. A split has to write two tiny reference
files per store file. The other possibility is that the region has to be
closed before the split, so it has to do a flush first. If it cannot complete
the flush in time, it might cancel the split as well. Did you check that? Are
your compactions working as intended?
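The scaling Enis describes can be sketched with some quick arithmetic. The per-file latency below is a made-up illustration value, not an HBase internal; only the "two reference files per store file" rule comes from the thread:

```python
# Rough illustration of why many store files can blow the 30 s split timeout.
# A split writes two small reference files per store file, so the work grows
# linearly with the store file count. Latency numbers are hypothetical.

def split_reference_files(store_files: int) -> int:
    """Number of reference files a split must create (two per store file)."""
    return 2 * store_files

def estimated_split_seconds(store_files: int, secs_per_file: float = 0.5) -> float:
    """Very rough estimate: each reference file costs some HDFS round-trip time."""
    return split_reference_files(store_files) * secs_per_file

FILE_SPLIT_TIMEOUT = 30  # default hbase.regionserver.fileSplitTimeout, in seconds

# A well-compacted region with few store files splits comfortably...
assert estimated_split_seconds(10) < FILE_SPLIT_TIMEOUT
# ...while a region with many uncompacted store files may exceed the timeout.
assert estimated_split_seconds(100) > FILE_SPLIT_TIMEOUT
```

This is why the compaction question matters: fewer, larger store files mean fewer reference files to create inside the timeout window.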

Enis


On Wed, Dec 18, 2013 at 10:06 AM, Timo Schaepe <ti...@timoschaepe.de> wrote:

> @Ted Yu:
> Yep, nevertheless thanks a lot!
>
>
> Am 18.12.2013 um 10:03 schrieb Ted Yu <yu...@gmail.com>:
>
> > Timo:
> > I went through namenode log and didn't find much clue.
> >
> > Cheers
> >
> >
> > On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <ti...@timoschaepe.de>
> wrote:
> >
> >> Hey Ted Yu,
> >>
> >> I had digging the name node log and so far I've found nothing special.
> No
> >> Exception, FATAL or ERROR message nor anything other peculiarities.
> >> Only I see a lot of messages like this:
> >>
> >> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange:
> Removing
> >> lease on
> >>
> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
> >> from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de
> >> ,60020,1386712527761_1295065721_26
> >> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR*
> >> completeFile:
> >>
> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
> >> is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de
> >> ,60020,1386712527761_1295065721_26
> >>
> >> But maybe that is normal. If you wanna have a look, you can find the log
> >> snippet at
> >>
> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
> >>
> >> Thanks,
> >>
> >>        Timo
> >>
> >>
> >>
> >> Am 14.12.2013 um 09:12 schrieb Ted Yu <yu...@gmail.com>:
> >>
> >>> Timo:
> >>> Other than two occurrences of 'Took too long to split the files'
> >>> @ 13:54:20,194 and 13:55:10,533, I don't find much clue from the posted
> >> log.
> >>>
> >>> If you have time, mind checking namenode log for 1 minute interval
> >> leading
> >>> up to 13:54:20,194 and 13:55:10,533, respectively ?
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <ti...@timoschaepe.de>
> >> wrote:
> >>>
> >>>> Hey,
> >>>>
> >>>> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At
> >> the
> >>>> moment (the import is actually working) and after I splittet the
> >> specific
> >>>> regions manually, we do not have growing regions anymore.
> >>>>
> >>>> hbase hbck says, all things are going fine.
> >>>> 0 inconsistencies detected.
> >>>> Status: OK
> >>>>
> >>>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
> >>>> The relevant tablename ist data_1091.
> >>>>
> >>>> Thanks for your time.
> >>>>
> >>>>       Timo
> >>>>
> >>>> Am 13.12.2013 um 20:18 schrieb Ted Yu <yu...@gmail.com>:
> >>>>
> >>>>> Timo:
> >>>>> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that
> we
> >>>> can
> >>>>> see what happened ?
> >>>>>
> >>>>> Thanks
> >>>>>
> >>>>>
> >>>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> >>>>> jean-marc@spaggiari.org> wrote:
> >>>>>
> >>>>>> Try to increase hbase.regionserver.fileSplitTimeout but put it back
> to
> >>>> its
> >>>>>> default value after.
> >>>>>>
> >>>>>> Default value is 30 seconds. I think it's not normal for a split to
> >> take
> >>>>>> more than that.
> >>>>>>
> >>>>>> What is your hardware configuration?
> >>>>>>
> >>>>>> Have you run hbck to see if everything is correct?
> >>>>>>
> >>>>>> JM
> >>>>>>
> >>>>>>
> >>>>>> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
> >>>>>>
> >>>>>>> Hello again,
> >>>>>>>
> >>>>>>> digging in the logs of the specific regionserver shows me that:
> >>>>>>>
> >>>>>>> 2013-12-12 13:54:20,194 INFO
> >>>>>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
> >>>>>> rollback/cleanup
> >>>>>>> of failed split of
> >>>>>>>
> >>>>>>
> >>>>
> >>
> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
> >>>>>>> Took too long to split the files and create the references,
> aborting
> >>>>>> split
> >>>>>>>
> >>>>>>> This message appears two time, so it seems, that HBase tried to
> split
> >>>> the
> >>>>>>> region but it failed. I don't know why. How is the behaviour of
> >> HBase,
> >>>>>> if a
> >>>>>>> region split fails? Are there more tries to split this region
> again?
> >> I
> >>>>>>> didn't find any new tries in the log. Now I split the big regions
> >>>>>> manually
> >>>>>>> and this works. And also it seems, that HBase split the new regions
> >>>> again
> >>>>>>> to crunch they down to the given limit.
> >>>>>>>
> >>>>>>> But also it is a mystery for me, why the split size in Hannibal
> shows
> >>>> me
> >>>>>>> 10 GB and in base-site.xml I put 2 GB…
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>>      Timo
> >>>>>>>
> >>>>>>>
> >>>>>>> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
> >>>>>>>
> >>>>>>>> Hello,
> >>>>>>>>
> >>>>>>>> during the loading of data in our cluster I noticed some strange
> >>>>>>> behavior of some regions, that I don't understand.
> >>>>>>>>
> >>>>>>>> Scenario:
> >>>>>>>> We convert data from a mysql database to HBase. The data is
> inserted
> >>>>>>> with a put to the specific HBase table. The row key is a
> timestamp. I
> >>>>>> know
> >>>>>>> the problem with timestamp keys, but in our requirement it works
> >> quiet
> >>>>>>> well. The problem is now, that there are some regions, which are
> >>>> growing
> >>>>>>> and growing.
> >>>>>>>>
> >>>>>>>> For example the table on the picture [1]. First, all data was
> >>>>>>> distributed over regions and node. And now, the data is written
> into
> >>>> only
> >>>>>>> one region, which is growing and I can see no splitting at all.
> >>>> Actually
> >>>>>>> the size of the big region is nearly 60 GB.
> >>>>>>>>
> >>>>>>>> HBase version is 0.94.11. I cannot understand, why the splitting
> is
> >>>> not
> >>>>>>> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize
> >> to
> >>>> 2
> >>>>>> GB
> >>>>>>> and HBase accepted this value.
> >>>>>>>>
> >>>>>>>> <property>
> >>>>>>>>    <!--Loaded from hbase-site.xml-->
> >>>>>>>>    <name>hbase.hregion.max.filesize</name>
> >>>>>>>>    <value>2147483648</value>
> >>>>>>>> </property>
> >>>>>>>>
> >>>>>>>> First mystery: Hannibal shows me the split size is 10 GB (see
> >>>>>>> screenshot).
> >>>>>>>> Second mystery: HBase is not splitting some regions neither at 2
> GB
> >>>> nor
> >>>>>>> 10 GB.
> >>>>>>>>
> >>>>>>>> Any ideas? Could be the timestamp rowkey cause this problem?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>>    Timo
> >>>>>>>>
> >>>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>

Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
@Ted Yu:
Yep, nevertheless thanks a lot!


Am 18.12.2013 um 10:03 schrieb Ted Yu <yu...@gmail.com>:

> Timo:
> I went through namenode log and didn't find much clue.
> 
> Cheers
> 
> 
> On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <ti...@timoschaepe.de> wrote:
> 
>> Hey Ted Yu,
>> 
>> I had digging the name node log and so far I've found nothing special. No
>> Exception, FATAL or ERROR message nor anything other peculiarities.
>> Only I see a lot of messages like this:
>> 
>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing
>> lease on
>> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
>> from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de
>> ,60020,1386712527761_1295065721_26
>> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR*
>> completeFile:
>> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
>> is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de
>> ,60020,1386712527761_1295065721_26
>> 
>> But maybe that is normal. If you wanna have a look, you can find the log
>> snippet at
>> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
>> 
>> Thanks,
>> 
>>        Timo
>> 
>> 
>> 
>> Am 14.12.2013 um 09:12 schrieb Ted Yu <yu...@gmail.com>:
>> 
>>> Timo:
>>> Other than two occurrences of 'Took too long to split the files'
>>> @ 13:54:20,194 and 13:55:10,533, I don't find much clue from the posted
>> log.
>>> 
>>> If you have time, mind checking namenode log for 1 minute interval
>> leading
>>> up to 13:54:20,194 and 13:55:10,533, respectively ?
>>> 
>>> Thanks
>>> 
>>> 
>>> On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <ti...@timoschaepe.de>
>> wrote:
>>> 
>>>> Hey,
>>>> 
>>>> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At
>> the
>>>> moment (the import is actually working) and after I splittet the
>> specific
>>>> regions manually, we do not have growing regions anymore.
>>>> 
>>>> hbase hbck says, all things are going fine.
>>>> 0 inconsistencies detected.
>>>> Status: OK
>>>> 
>>>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
>>>> The relevant tablename ist data_1091.
>>>> 
>>>> Thanks for your time.
>>>> 
>>>>       Timo
>>>> 
>>>> Am 13.12.2013 um 20:18 schrieb Ted Yu <yu...@gmail.com>:
>>>> 
>>>>> Timo:
>>>>> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we
>>>> can
>>>>> see what happened ?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> 
>>>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
>>>>> jean-marc@spaggiari.org> wrote:
>>>>> 
>>>>>> Try to increase hbase.regionserver.fileSplitTimeout but put it back to
>>>> its
>>>>>> default value after.
>>>>>> 
>>>>>> Default value is 30 seconds. I think it's not normal for a split to
>> take
>>>>>> more than that.
>>>>>> 
>>>>>> What is your hardware configuration?
>>>>>> 
>>>>>> Have you run hbck to see if everything is correct?
>>>>>> 
>>>>>> JM
>>>>>> 
>>>>>> 
>>>>>> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
>>>>>> 
>>>>>>> Hello again,
>>>>>>> 
>>>>>>> digging in the logs of the specific regionserver shows me that:
>>>>>>> 
>>>>>>> 2013-12-12 13:54:20,194 INFO
>>>>>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
>>>>>> rollback/cleanup
>>>>>>> of failed split of
>>>>>>> 
>>>>>> 
>>>> 
>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
>>>>>>> Took too long to split the files and create the references, aborting
>>>>>> split
>>>>>>> 
>>>>>>> This message appears two time, so it seems, that HBase tried to split
>>>> the
>>>>>>> region but it failed. I don't know why. How is the behaviour of
>> HBase,
>>>>>> if a
>>>>>>> region split fails? Are there more tries to split this region again?
>> I
>>>>>>> didn't find any new tries in the log. Now I split the big regions
>>>>>> manually
>>>>>>> and this works. And also it seems, that HBase split the new regions
>>>> again
>>>>>>> to crunch they down to the given limit.
>>>>>>> 
>>>>>>> But also it is a mystery for me, why the split size in Hannibal shows
>>>> me
>>>>>>> 10 GB and in base-site.xml I put 2 GB…
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>>      Timo
>>>>>>> 
>>>>>>> 
>>>>>>> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
>>>>>>> 
>>>>>>>> Hello,
>>>>>>>> 
>>>>>>>> during the loading of data in our cluster I noticed some strange
>>>>>>> behavior of some regions, that I don't understand.
>>>>>>>> 
>>>>>>>> Scenario:
>>>>>>>> We convert data from a mysql database to HBase. The data is inserted
>>>>>>> with a put to the specific HBase table. The row key is a timestamp. I
>>>>>> know
>>>>>>> the problem with timestamp keys, but in our requirement it works
>> quiet
>>>>>>> well. The problem is now, that there are some regions, which are
>>>> growing
>>>>>>> and growing.
>>>>>>>> 
>>>>>>>> For example the table on the picture [1]. First, all data was
>>>>>>> distributed over regions and node. And now, the data is written into
>>>> only
>>>>>>> one region, which is growing and I can see no splitting at all.
>>>> Actually
>>>>>>> the size of the big region is nearly 60 GB.
>>>>>>>> 
>>>>>>>> HBase version is 0.94.11. I cannot understand, why the splitting is
>>>> not
>>>>>>> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize
>> to
>>>> 2
>>>>>> GB
>>>>>>> and HBase accepted this value.
>>>>>>>> 
>>>>>>>> <property>
>>>>>>>>    <!--Loaded from hbase-site.xml-->
>>>>>>>>    <name>hbase.hregion.max.filesize</name>
>>>>>>>>    <value>2147483648</value>
>>>>>>>> </property>
>>>>>>>> 
>>>>>>>> First mystery: Hannibal shows me the split size is 10 GB (see
>>>>>>> screenshot).
>>>>>>>> Second mystery: HBase is not splitting some regions neither at 2 GB
>>>> nor
>>>>>>> 10 GB.
>>>>>>>> 
>>>>>>>> Any ideas? Could be the timestamp rowkey cause this problem?
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>>    Timo
>>>>>>>> 
>>>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 


Re: Problems with hbase.hregion.max.filesize

Posted by Ted Yu <yu...@gmail.com>.
Timo:
I went through the namenode log and didn't find much of a clue.

Cheers


On Tue, Dec 17, 2013 at 9:37 PM, Timo Schaepe <ti...@timoschaepe.de> wrote:

> Hey Ted Yu,
>
> I had digging the name node log and so far I've found nothing special. No
> Exception, FATAL or ERROR message nor anything other peculiarities.
> Only I see a lot of messages like this:
>
> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing
> lease on
>  /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
> from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de
> ,60020,1386712527761_1295065721_26
> 2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR*
> completeFile:
> /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7
> is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de
> ,60020,1386712527761_1295065721_26
>
> But maybe that is normal. If you wanna have a look, you can find the log
> snippet at
> https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip
>
> Thanks,
>
>         Timo
>
>
>
> Am 14.12.2013 um 09:12 schrieb Ted Yu <yu...@gmail.com>:
>
> > Timo:
> > Other than two occurrences of 'Took too long to split the files'
> > @ 13:54:20,194 and 13:55:10,533, I don't find much clue from the posted
> log.
> >
> > If you have time, mind checking namenode log for 1 minute interval
> leading
> > up to 13:54:20,194 and 13:55:10,533, respectively ?
> >
> > Thanks
> >
> >
> > On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <ti...@timoschaepe.de>
> wrote:
> >
> >> Hey,
> >>
> >> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At
> the
> >> moment (the import is actually working) and after I splittet the
> specific
> >> regions manually, we do not have growing regions anymore.
> >>
> >> hbase hbck says, all things are going fine.
> >> 0 inconsistencies detected.
> >> Status: OK
> >>
> >> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
> >> The relevant tablename ist data_1091.
> >>
> >> Thanks for your time.
> >>
> >>        Timo
> >>
> >> Am 13.12.2013 um 20:18 schrieb Ted Yu <yu...@gmail.com>:
> >>
> >>> Timo:
> >>> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we
> >> can
> >>> see what happened ?
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> >>> jean-marc@spaggiari.org> wrote:
> >>>
> >>>> Try to increase hbase.regionserver.fileSplitTimeout but put it back to
> >> its
> >>>> default value after.
> >>>>
> >>>> Default value is 30 seconds. I think it's not normal for a split to
> take
> >>>> more than that.
> >>>>
> >>>> What is your hardware configuration?
> >>>>
> >>>> Have you run hbck to see if everything is correct?
> >>>>
> >>>> JM
> >>>>
> >>>>
> >>>> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
> >>>>
> >>>>> Hello again,
> >>>>>
> >>>>> digging in the logs of the specific regionserver shows me that:
> >>>>>
> >>>>> 2013-12-12 13:54:20,194 INFO
> >>>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
> >>>> rollback/cleanup
> >>>>> of failed split of
> >>>>>
> >>>>
> >>
> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
> >>>>> Took too long to split the files and create the references, aborting
> >>>> split
> >>>>>
> >>>>> This message appears two time, so it seems, that HBase tried to split
> >> the
> >>>>> region but it failed. I don't know why. How is the behaviour of
> HBase,
> >>>> if a
> >>>>> region split fails? Are there more tries to split this region again?
> I
> >>>>> didn't find any new tries in the log. Now I split the big regions
> >>>> manually
> >>>>> and this works. And also it seems, that HBase split the new regions
> >> again
> >>>>> to crunch they down to the given limit.
> >>>>>
> >>>>> But also it is a mystery for me, why the split size in Hannibal shows
> >> me
> >>>>> 10 GB and in base-site.xml I put 2 GB…
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>>       Timo
> >>>>>
> >>>>>
> >>>>> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
> >>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> during the loading of data in our cluster I noticed some strange
> >>>>> behavior of some regions, that I don't understand.
> >>>>>>
> >>>>>> Scenario:
> >>>>>> We convert data from a mysql database to HBase. The data is inserted
> >>>>> with a put to the specific HBase table. The row key is a timestamp. I
> >>>> know
> >>>>> the problem with timestamp keys, but in our requirement it works
> quiet
> >>>>> well. The problem is now, that there are some regions, which are
> >> growing
> >>>>> and growing.
> >>>>>>
> >>>>>> For example the table on the picture [1]. First, all data was
> >>>>> distributed over regions and node. And now, the data is written into
> >> only
> >>>>> one region, which is growing and I can see no splitting at all.
> >> Actually
> >>>>> the size of the big region is nearly 60 GB.
> >>>>>>
> >>>>>> HBase version is 0.94.11. I cannot understand, why the splitting is
> >> not
> >>>>> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize
> to
> >> 2
> >>>> GB
> >>>>> and HBase accepted this value.
> >>>>>>
> >>>>>> <property>
> >>>>>>     <!--Loaded from hbase-site.xml-->
> >>>>>>     <name>hbase.hregion.max.filesize</name>
> >>>>>>     <value>2147483648</value>
> >>>>>> </property>
> >>>>>>
> >>>>>> First mystery: Hannibal shows me the split size is 10 GB (see
> >>>>> screenshot).
> >>>>>> Second mystery: HBase is not splitting some regions neither at 2 GB
> >> nor
> >>>>> 10 GB.
> >>>>>>
> >>>>>> Any ideas? Could be the timestamp rowkey cause this problem?
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>>     Timo
> >>>>>>
> >>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
> >>>>>
> >>>>>
> >>>>
> >>
> >>
>
>

Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Hey Ted Yu,

I have been digging through the name node log and so far I've found nothing special: no Exception, FATAL or ERROR messages, nor any other peculiarities.
I only see a lot of messages like this:

2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 from client DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26
2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /hbase/Sessions_1091/d04cadb1b2252dafc476c138e9651ca7/.splits/9717de41277e207c24359a18dae72cd3/l/58ab2c11ca9b4b4994ce54bac0bb4c68.d04cadb1b2252dafc476c138e9651ca7 is closed by DFSClient_hb_rs_baur-hbase7.baur.boreus.de,60020,1386712527761_1295065721_26

But maybe that is normal. If you want to have a look, you can find the log snippet at https://www.dropbox.com/s/8sls714knn4yqp3/hadoop-hadoop-namenode-baur-hbase1.log.2013-12-12.snip

Thanks,

	Timo



Am 14.12.2013 um 09:12 schrieb Ted Yu <yu...@gmail.com>:

> Timo:
> Other than two occurrences of 'Took too long to split the files'
> @ 13:54:20,194 and 13:55:10,533, I don't find much clue from the posted log.
> 
> If you have time, mind checking namenode log for 1 minute interval leading
> up to 13:54:20,194 and 13:55:10,533, respectively ?
> 
> Thanks
> 
> 
> On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <ti...@timoschaepe.de> wrote:
> 
>> Hey,
>> 
>> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At the
>> moment (the import is actually working) and after I splittet the specific
>> regions manually, we do not have growing regions anymore.
>> 
>> hbase hbck says, all things are going fine.
>> 0 inconsistencies detected.
>> Status: OK
>> 
>> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
>> The relevant tablename ist data_1091.
>> 
>> Thanks for your time.
>> 
>>        Timo
>> 
>> Am 13.12.2013 um 20:18 schrieb Ted Yu <yu...@gmail.com>:
>> 
>>> Timo:
>>> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we
>> can
>>> see what happened ?
>>> 
>>> Thanks
>>> 
>>> 
>>> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
>>> jean-marc@spaggiari.org> wrote:
>>> 
>>>> Try to increase hbase.regionserver.fileSplitTimeout but put it back to
>> its
>>>> default value after.
>>>> 
>>>> Default value is 30 seconds. I think it's not normal for a split to take
>>>> more than that.
>>>> 
>>>> What is your hardware configuration?
>>>> 
>>>> Have you run hbck to see if everything is correct?
>>>> 
>>>> JM
>>>> 
>>>> 
>>>> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
>>>> 
>>>>> Hello again,
>>>>> 
>>>>> digging in the logs of the specific regionserver shows me that:
>>>>> 
>>>>> 2013-12-12 13:54:20,194 INFO
>>>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
>>>> rollback/cleanup
>>>>> of failed split of
>>>>> 
>>>> 
>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
>>>>> Took too long to split the files and create the references, aborting
>>>> split
>>>>> 
>>>>> This message appears two time, so it seems, that HBase tried to split
>> the
>>>>> region but it failed. I don't know why. How is the behaviour of HBase,
>>>> if a
>>>>> region split fails? Are there more tries to split this region again? I
>>>>> didn't find any new tries in the log. Now I split the big regions
>>>> manually
>>>>> and this works. And also it seems, that HBase split the new regions
>> again
>>>>> to crunch they down to the given limit.
>>>>> 
>>>>> But also it is a mystery for me, why the split size in Hannibal shows
>> me
>>>>> 10 GB and in base-site.xml I put 2 GB…
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>>       Timo
>>>>> 
>>>>> 
>>>>> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
>>>>> 
>>>>>> Hello,
>>>>>> 
>>>>>> during the loading of data in our cluster I noticed some strange
>>>>> behavior of some regions, that I don't understand.
>>>>>> 
>>>>>> Scenario:
>>>>>> We convert data from a mysql database to HBase. The data is inserted
>>>>> with a put to the specific HBase table. The row key is a timestamp. I
>>>> know
>>>>> the problem with timestamp keys, but in our requirement it works quiet
>>>>> well. The problem is now, that there are some regions, which are
>> growing
>>>>> and growing.
>>>>>> 
>>>>>> For example the table on the picture [1]. First, all data was
>>>>> distributed over regions and node. And now, the data is written into
>> only
>>>>> one region, which is growing and I can see no splitting at all.
>> Actually
>>>>> the size of the big region is nearly 60 GB.
>>>>>> 
>>>>>> HBase version is 0.94.11. I cannot understand, why the splitting is
>> not
>>>>> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize to
>> 2
>>>> GB
>>>>> and HBase accepted this value.
>>>>>> 
>>>>>> <property>
>>>>>>     <!--Loaded from hbase-site.xml-->
>>>>>>     <name>hbase.hregion.max.filesize</name>
>>>>>>     <value>2147483648</value>
>>>>>> </property>
>>>>>> 
>>>>>> First mystery: Hannibal shows me the split size is 10 GB (see
>>>>> screenshot).
>>>>>> Second mystery: HBase is not splitting some regions neither at 2 GB
>> nor
>>>>> 10 GB.
>>>>>> 
>>>>>> Any ideas? Could be the timestamp rowkey cause this problem?
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>>     Timo
>>>>>> 
>>>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>>>>> 
>>>>> 
>>>> 
>> 
>> 


Re: Problems with hbase.hregion.max.filesize

Posted by Ted Yu <yu...@gmail.com>.
Timo:
Other than two occurrences of 'Took too long to split the files'
@ 13:54:20,194 and 13:55:10,533, I don't find many clues in the posted log.

If you have time, would you mind checking the namenode log for the 1 minute
interval leading up to 13:54:20,194 and 13:55:10,533, respectively?

Thanks
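Pulling out just the window Ted asks for is easy to script. A sketch, assuming the standard Hadoop log timestamp format seen in the lines quoted in this thread:

```python
from datetime import datetime, timedelta

LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # e.g. "2013-12-12 13:54:20,194"

def lines_in_window(lines, end_ts: str, window_secs: int = 60):
    """Return log lines whose timestamp falls within the `window_secs`
    seconds leading up to (and including) `end_ts`."""
    end = datetime.strptime(end_ts, LOG_TS_FORMAT)
    start = end - timedelta(seconds=window_secs)
    selected = []
    for line in lines:
        try:
            ts = datetime.strptime(line[:23], LOG_TS_FORMAT)
        except ValueError:
            continue  # not a timestamped line (e.g. a stack trace continuation)
        if start <= ts <= end:
            selected.append(line)
    return selected

log = [
    "2013-12-12 13:52:59,000 INFO too early, outside the window",
    "2013-12-12 13:53:22,541 INFO org.apache.hadoop.hdfs.StateChange: Removing lease",
    "2013-12-12 13:54:20,194 INFO SplitRequest: Running rollback/cleanup of failed split",
]
assert len(lines_in_window(log, "2013-12-12 13:54:20,194")) == 2
```

Run once per timestamp of interest (13:54:20,194 and 13:55:10,533) against the namenode log file's lines.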


On Sat, Dec 14, 2013 at 5:21 AM, Timo Schaepe <ti...@timoschaepe.de> wrote:

> Hey,
>
> @JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At the
> moment (the import is actually working) and after I splittet the specific
> regions manually, we do not have growing regions anymore.
>
> hbase hbck says, all things are going fine.
> 0 inconsistencies detected.
> Status: OK
>
> @Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
> The relevant tablename ist data_1091.
>
> Thanks for your time.
>
>         Timo
>
> Am 13.12.2013 um 20:18 schrieb Ted Yu <yu...@gmail.com>:
>
> > Timo:
> > Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we
> can
> > see what happened ?
> >
> > Thanks
> >
> >
> > On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> > jean-marc@spaggiari.org> wrote:
> >
> >> Try to increase hbase.regionserver.fileSplitTimeout but put it back to
> its
> >> default value after.
> >>
> >> Default value is 30 seconds. I think it's not normal for a split to take
> >> more than that.
> >>
> >> What is your hardware configuration?
> >>
> >> Have you run hbck to see if everything is correct?
> >>
> >> JM
> >>
> >>
> >> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
> >>
> >>> Hello again,
> >>>
> >>> digging in the logs of the specific regionserver shows me that:
> >>>
> >>> 2013-12-12 13:54:20,194 INFO
> >>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
> >> rollback/cleanup
> >>> of failed split of
> >>>
> >>
> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
> >>> Took too long to split the files and create the references, aborting
> >> split
> >>>
> >>> This message appears two time, so it seems, that HBase tried to split
> the
> >>> region but it failed. I don't know why. How is the behaviour of HBase,
> >> if a
> >>> region split fails? Are there more tries to split this region again? I
> >>> didn't find any new tries in the log. Now I split the big regions
> >> manually
> >>> and this works. And also it seems, that HBase split the new regions
> again
> >>> to crunch they down to the given limit.
> >>>
> >>> But also it is a mystery for me, why the split size in Hannibal shows
> me
> >>> 10 GB and in base-site.xml I put 2 GB…
> >>>
> >>> Thanks,
> >>>
> >>>        Timo
> >>>
> >>>
> >>> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
> >>>
> >>>> Hello,
> >>>>
> >>>> during the loading of data in our cluster I noticed some strange
> >>> behavior of some regions, that I don't understand.
> >>>>
> >>>> Scenario:
> >>>> We convert data from a mysql database to HBase. The data is inserted
> >>> with a put to the specific HBase table. The row key is a timestamp. I
> >> know
> >>> the problem with timestamp keys, but in our requirement it works quiet
> >>> well. The problem is now, that there are some regions, which are
> growing
> >>> and growing.
> >>>>
> >>>> For example the table on the picture [1]. First, all data was
> >>> distributed over regions and node. And now, the data is written into
> only
> >>> one region, which is growing and I can see no splitting at all.
> Actually
> >>> the size of the big region is nearly 60 GB.
> >>>>
> >>>> HBase version is 0.94.11. I cannot understand, why the splitting is
> not
> >>> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize to
> 2
> >> GB
> >>> and HBase accepted this value.
> >>>>
> >>>> <property>
> >>>>      <!--Loaded from hbase-site.xml-->
> >>>>      <name>hbase.hregion.max.filesize</name>
> >>>>      <value>2147483648</value>
> >>>> </property>
> >>>>
> >>>> First mystery: Hannibal shows me the split size is 10 GB (see
> >>> screenshot).
> >>>> Second mystery: HBase is not splitting some regions neither at 2 GB
> nor
> >>> 10 GB.
> >>>>
> >>>> Any ideas? Could be the timestamp rowkey cause this problem?
> >>>>
> >>>> Thanks,
> >>>>
> >>>>      Timo
> >>>>
> >>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
> >>>
> >>>
> >>
>
>

Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Hey,

@JM: Thanks for the hint with hbase.regionserver.fileSplitTimeout. At the moment (the import is currently running), and after I split the specific regions manually, we do not have growing regions anymore.
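When splitting manually, a split point can be supplied explicitly instead of letting HBase pick one. A rough sketch of computing a byte-wise midpoint between two row keys, similar in spirit to what HBase's Bytes.split does (purely illustrative, not HBase's actual code):

```python
def midpoint_key(start: bytes, end: bytes, width: int = 8) -> bytes:
    """Byte-wise midpoint of two row keys, right-padded to a fixed width.
    Illustrative only; HBase normally picks its own split point from
    store file metadata."""
    a = int.from_bytes(start.ljust(width, b"\x00"), "big")
    b = int.from_bytes(end.ljust(width, b"\x00"), "big")
    return ((a + b) // 2).to_bytes(width, "big")

# For timestamp row keys the midpoint is just the middle timestamp:
lo = (1386851456).to_bytes(8, "big")
hi = (1386951456).to_bytes(8, "big")
mid = midpoint_key(lo, hi)
assert lo < mid < hi
assert int.from_bytes(mid, "big") == (1386851456 + 1386951456) // 2
```

The resulting key would then be passed to the shell's split command (`split '<table>', '<key>'`); the exact key encoding depends on how the timestamps were written.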

hbase hbck says everything is fine:
0 inconsistencies detected.
Status: OK

@Ted Yu: Sure, have a look here: http://pastebin.com/2ANFVZEU
The relevant table name is data_1091.

Thanks for your time.

	Timo

Am 13.12.2013 um 20:18 schrieb Ted Yu <yu...@gmail.com>:

> Timo:
> Can you pastebin regionserver log around 2013-12-12 13:54:20 so that we can
> see what happened ?
> 
> Thanks
> 
> 
> On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
> 
>> Try to increase hbase.regionserver.fileSplitTimeout but put it back to its
>> default value after.
>> 
>> Default value is 30 seconds. I think it's not normal for a split to take
>> more than that.
>> 
>> What is your hardware configuration?
>> 
>> Have you run hbck to see if everything is correct?
>> 
>> JM
>> 
>> 
>> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
>> 
>>> Hello again,
>>> 
>>> digging in the logs of the specific regionserver shows me that:
>>> 
>>> 2013-12-12 13:54:20,194 INFO
>>> org.apache.hadoop.hbase.regionserver.SplitRequest: Running
>> rollback/cleanup
>>> of failed split of
>>> 
>> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
>>> Took too long to split the files and create the references, aborting
>> split
>>> 
>>> This message appears two time, so it seems, that HBase tried to split the
>>> region but it failed. I don't know why. How is the behaviour of HBase,
>> if a
>>> region split fails? Are there more tries to split this region again? I
>>> didn't find any new tries in the log. Now I split the big regions
>> manually
>>> and this works. And also it seems, that HBase split the new regions again
>>> to crunch they down to the given limit.
>>> 
>>> But also it is a mystery for me, why the split size in Hannibal shows me
>>> 10 GB and in base-site.xml I put 2 GB…
>>> 
>>> Thanks,
>>> 
>>>        Timo
>>> 
>>> 
>>> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
>>> 
>>>> Hello,
>>>> 
>>>> during the loading of data in our cluster I noticed some strange
>>> behavior of some regions, that I don't understand.
>>>> 
>>>> Scenario:
>>>> We convert data from a mysql database to HBase. The data is inserted
>>> with a put to the specific HBase table. The row key is a timestamp. I
>> know
>>> the problem with timestamp keys, but in our requirement it works quiet
>>> well. The problem is now, that there are some regions, which are growing
>>> and growing.
>>>> 
>>>> For example the table on the picture [1]. First, all data was
>>> distributed over regions and node. And now, the data is written into only
>>> one region, which is growing and I can see no splitting at all. Actually
>>> the size of the big region is nearly 60 GB.
>>>> 
>>>> HBase version is 0.94.11. I cannot understand, why the splitting is not
>>> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize to 2
>> GB
>>> and HBase accepted this value.
>>>> 
>>>> <property>
>>>>      <!--Loaded from hbase-site.xml-->
>>>>      <name>hbase.hregion.max.filesize</name>
>>>>      <value>2147483648</value>
>>>> </property>
>>>> 
>>>> First mystery: Hannibal shows me the split size is 10 GB (see
>>> screenshot).
>>>> Second mystery: HBase is not splitting some regions neither at 2 GB nor
>>> 10 GB.
>>>> 
>>>> Any ideas? Could be the timestamp rowkey cause this problem?
>>>> 
>>>> Thanks,
>>>> 
>>>>      Timo
>>>> 
>>>> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>>> 
>>> 
>> 


Re: Problems with hbase.hregion.max.filesize

Posted by Ted Yu <yu...@gmail.com>.
Timo:
Can you pastebin the regionserver log around 2013-12-12 13:54:20 so that we can
see what happened?

Thanks


On Fri, Dec 13, 2013 at 11:02 AM, Jean-Marc Spaggiari <
jean-marc@spaggiari.org> wrote:

> Try to increase hbase.regionserver.fileSplitTimeout but put it back to its
> default value after.
>
> Default value is 30 seconds. I think it's not normal for a split to take
> more than that.
>
> What is your hardware configuration?
>
> Have you run hbck to see if everything is correct?
>
> JM
>
>
> 2013/12/13 Timo Schaepe <ti...@timoschaepe.de>
>
> > Hello again,
> >
> > digging in the logs of the specific regionserver shows me that:
> >
> > 2013-12-12 13:54:20,194 INFO
> > org.apache.hadoop.hbase.regionserver.SplitRequest: Running
> rollback/cleanup
> > of failed split of
> >
> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
> > Took too long to split the files and create the references, aborting
> split
> >
> > This message appears two time, so it seems, that HBase tried to split the
> > region but it failed. I don't know why. How is the behaviour of HBase,
> if a
> > region split fails? Are there more tries to split this region again? I
> > didn't find any new tries in the log. Now I split the big regions
> manually
> > and this works. And also it seems, that HBase split the new regions again
> > to crunch they down to the given limit.
> >
> > But also it is a mystery for me, why the split size in Hannibal shows me
> > 10 GB and in base-site.xml I put 2 GB…
> >
> > Thanks,
> >
> >         Timo
> >
> >
> > Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
> >
> > > Hello,
> > >
> > > during the loading of data in our cluster I noticed some strange
> > behavior of some regions, that I don't understand.
> > >
> > > Scenario:
> > > We convert data from a mysql database to HBase. The data is inserted
> > with a put to the specific HBase table. The row key is a timestamp. I
> know
> > the problem with timestamp keys, but in our requirement it works quiet
> > well. The problem is now, that there are some regions, which are growing
> > and growing.
> > >
> > > For example the table on the picture [1]. First, all data was
> > distributed over regions and node. And now, the data is written into only
> > one region, which is growing and I can see no splitting at all. Actually
> > the size of the big region is nearly 60 GB.
> > >
> > > HBase version is 0.94.11. I cannot understand, why the splitting is not
> > happening. In hbase-site.xml I limit the hbase.hregion.max.filesize to 2
> GB
> > and HBase accepted this value.
> > >
> > > <property>
> > >       <!--Loaded from hbase-site.xml-->
> > >       <name>hbase.hregion.max.filesize</name>
> > >       <value>2147483648</value>
> > > </property>
> > >
> > > First mystery: Hannibal shows me the split size is 10 GB (see
> > screenshot).
> > > Second mystery: HBase is not splitting some regions neither at 2 GB nor
> > 10 GB.
> > >
> > > Any ideas? Could be the timestamp rowkey cause this problem?
> > >
> > > Thanks,
> > >
> > >       Timo
> > >
> > > [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
> >
> >
>

Re: Problems with hbase.hregion.max.filesize

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Try to increase hbase.regionserver.fileSplitTimeout but put it back to its
default value after.

Default value is 30 seconds. I think it's not normal for a split to take
more than that.
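As a sketch, the temporary override could look like this in hbase-site.xml (the 600000 ms value is only an illustration; revert to the 30000 ms default once the backlog of oversized regions has been split):

```xml
<property>
  <!-- Temporary bump while the oversized regions are being split. -->
  <name>hbase.regionserver.fileSplitTimeout</name>
  <value>600000</value>
</property>
```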

What is your hardware configuration?

Have you run hbck to see if everything is correct?

JM


2013/12/13 Timo Schaepe <ti...@timoschaepe.de>

> Hello again,
>
> digging in the logs of the specific regionserver shows me that:
>
> 2013-12-12 13:54:20,194 INFO
> org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup
> of failed split of
> data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.;
> Took too long to split the files and create the references, aborting split
>
> This message appears two time, so it seems, that HBase tried to split the
> region but it failed. I don't know why. How is the behaviour of HBase, if a
> region split fails? Are there more tries to split this region again? I
> didn't find any new tries in the log. Now I split the big regions manually
> and this works. And also it seems, that HBase split the new regions again
> to crunch they down to the given limit.
>
> But also it is a mystery for me, why the split size in Hannibal shows me
> 10 GB and in base-site.xml I put 2 GB…
>
> Thanks,
>
>         Timo
>
>
> Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:
>
> > Hello,
> >
> > during the loading of data in our cluster I noticed some strange
> behavior of some regions, that I don't understand.
> >
> > Scenario:
> > We convert data from a mysql database to HBase. The data is inserted
> with a put to the specific HBase table. The row key is a timestamp. I know
> the problem with timestamp keys, but in our requirement it works quiet
> well. The problem is now, that there are some regions, which are growing
> and growing.
> >
> > For example the table on the picture [1]. First, all data was
> distributed over regions and node. And now, the data is written into only
> one region, which is growing and I can see no splitting at all. Actually
> the size of the big region is nearly 60 GB.
> >
> > HBase version is 0.94.11. I cannot understand, why the splitting is not
> happening. In hbase-site.xml I limit the hbase.hregion.max.filesize to 2 GB
> and HBase accepted this value.
> >
> > <property>
> >       <!--Loaded from hbase-site.xml-->
> >       <name>hbase.hregion.max.filesize</name>
> >       <value>2147483648</value>
> > </property>
> >
> > First mystery: Hannibal shows me the split size is 10 GB (see
> screenshot).
> > Second mystery: HBase is not splitting some regions neither at 2 GB nor
> 10 GB.
> >
> > Any ideas? Could be the timestamp rowkey cause this problem?
> >
> > Thanks,
> >
> >       Timo
> >
> > [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
>
>

Re: Problems with hbase.hregion.max.filesize

Posted by Timo Schaepe <ti...@timoschaepe.de>.
Hello again,

Digging into the logs of the affected regionserver shows the following:

2013-12-12 13:54:20,194 INFO org.apache.hadoop.hbase.regionserver.SplitRequest: Running rollback/cleanup of failed split of data,OR\x83\xCF\x02\x82\xAE\xF3U,1386851456415.d04cadb1b2252dafc476c138e9651ca7.; Took too long to split the files and create the references, aborting split

This message appears twice, so it seems that HBase tried to split the region but failed; I don't know why. How does HBase behave when a region split fails? Does it retry the split? I didn't find any retries in the log. I have now split the big regions manually, and that works. It also seems that HBase then splits the new regions again, crunching them down to the configured limit.

It also remains a mystery to me why Hannibal shows a split size of 10 GB when I set 2 GB in hbase-site.xml…
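For what it's worth, the effective split threshold in 0.94 is not simply hbase.hregion.max.filesize: as I read the source, the default IncreasingToUpperBoundRegionSplitPolicy caps the threshold at min(maxFileSize, flushSize * R^2), where R is the number of regions of the table on that regionserver. A small sketch of that formula (the 10 GB figure is the 0.94 default for hbase.hregion.max.filesize, which would also explain Hannibal's display if a server never picked up the 2 GB override):

```python
# Sketch of the 0.94 IncreasingToUpperBoundRegionSplitPolicy threshold
# (assumption: formula taken from a reading of the 0.94 source).

def split_size(max_file_size, flush_size, region_count):
    """Effective split threshold in bytes for one table on one regionserver."""
    if region_count == 0:
        return max_file_size
    return min(max_file_size, flush_size * region_count * region_count)

MB = 1024 * 1024
GB = 1024 * MB

# With a 2 GB max file size and the default 128 MB memstore flush size,
# the threshold ramps up and caps at 2 GB once a server hosts 4 regions:
print(split_size(2 * GB, 128 * MB, 1))   # 128 MB
print(split_size(2 * GB, 128 * MB, 4))   # 2 GB

# If a regionserver silently falls back to the 0.94 default max file size
# of 10 GB (e.g. a stale hbase-site.xml), the cap becomes 10 GB instead:
print(split_size(10 * GB, 128 * MB, 16))  # 10 GB
```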

Thanks,

	Timo


Am 13.12.2013 um 10:22 schrieb Timo Schaepe <ti...@timoschaepe.de>:

> Hello,
> 
> during the loading of data in our cluster I noticed some strange behavior of some regions, that I don't understand. 
> 
> Scenario:
> We convert data from a mysql database to HBase. The data is inserted with a put to the specific HBase table. The row key is a timestamp. I know the problem with timestamp keys, but in our requirement it works quiet well. The problem is now, that there are some regions, which are growing and growing.
> 
> For example the table on the picture [1]. First, all data was distributed over regions and node. And now, the data is written into only one region, which is growing and I can see no splitting at all. Actually the size of the big region is nearly 60 GB.
> 
> HBase version is 0.94.11. I cannot understand, why the splitting is not happening. In hbase-site.xml I limit the hbase.hregion.max.filesize to 2 GB and HBase accepted this value.
> 
> <property>
> 	<!--Loaded from hbase-site.xml-->
> 	<name>hbase.hregion.max.filesize</name>
> 	<value>2147483648</value>
> </property>
> 
> First mystery: Hannibal shows me the split size is 10 GB (see screenshot).
> Second mystery: HBase is not splitting some regions neither at 2 GB nor 10 GB.
> 
> Any ideas? Could be the timestamp rowkey cause this problem?
> 
> Thanks,
> 
> 	Timo
> 
> [1] https://www.dropbox.com/s/lm286xkcpglnj1t/big_region.png
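On the timestamp-rowkey question raised in the quoted message: a monotonically increasing key concentrates all writes on the last region, which is what makes a failed split so visible. A common mitigation is to prefix the key with a small salt, spreading sequential writes across N buckets at the cost of issuing N scans to read a time range back. A minimal sketch (helper name and bucket count are illustrative, not from the thread):

```python
import hashlib

SALT_BUCKETS = 8  # roughly one bucket per region you want kept "hot"

def salted_key(timestamp_ms: int) -> bytes:
    """Prefix a big-endian timestamp with a one-byte, hash-derived salt."""
    ts = timestamp_ms.to_bytes(8, "big")
    bucket = hashlib.md5(ts).digest()[0] % SALT_BUCKETS
    return bytes([bucket]) + ts

# Bursts of near-consecutive timestamps are now spread pseudo-randomly
# over the bucket prefixes instead of all landing in one region:
keys = [salted_key(t) for t in range(1386851456000, 1386851456004)]
print([k[0] for k in keys])
```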