Posted to user@hbase.apache.org by "Taylor, Ronald C" <ro...@pnl.gov> on 2009/04/09 21:00:32 UTC

RE: setting xciever number limit

 
Hi Lars,

I looked up the Troubleshooting entry, as you suggested. It says
"dfs.datanode.max.xcievers (sic)". Um ... isn't the "(sic)" supposed to
indicate that the *wrong* spelling is being used in the previous phrase?
Or is the "(sic)" being used here to signify that yes, we know this is
an unexpected spelling, but use it anyway?
Ron

-----Original Message-----
From: Lars George [mailto:lars@worldlingo.com] 
Sent: Thursday, April 09, 2009 1:09 AM
To: hbase-user@hadoop.apache.org
Cc: Taylor, Ronald C
Subject: Re: Still need help with data upload into HBase

Hi Ron,

The syntax is like this (sic):

    <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>4096</value>
    </property>

and it is documented on the HBase wiki here: 
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

Regards,
Lars
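
(A practical note: the datanode only reads this property at startup, so
HDFS has to be restarted after editing hadoop-site.xml. With the stock
scripts, run from the Hadoop install directory, that is roughly:

    bin/stop-dfs.sh
    bin/start-dfs.sh

stopping HBase first if it is running.)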


Taylor, Ronald C wrote:
>  
> Hi Ryan,
>
> Thanks for the suggestion on checking whether the number of file
> handles allowed actually gets increased after I make the change to
> /etc/security/limits.conf.
>
> Turns out it was not. I had to work with one of our sysadmins to get
> the new 32K number-of-handles setting actually applied on my Red Hat
> box.
>
> With that, and with one other change which I'll get to in a moment, I
> finally was able to read in all the rows that I wanted, instead of the
> program breaking before finishing. Checked the table by scanning it -
> looks OK. So - it looks like things are working as they should.
>
> Thank you very much for the help.
>
> Now as to the other parameter that needed changing: I found that the
> xceivers (xcievers?) limit was not being bumped up - I was crashing on
> that. I went to add what Ryan suggested in hadoop-site.xml, i.e.,
>
> <property>
> <name>dfs.datanode.max.xcievers</name>
> <value>2047</value>
> </property>
>
> and discovered that I did not know whether to use
> "dfs.datanode.max.xcievers" or "dfs.datanode.max.xceivers", where the
> "i" and "e" switch. I was getting error msgs in the log files with
>
>   "xceiverCount 257 exceeds the limit of concurrent xcievers 256"
>
> with BOTH spelling variants employed within the same error msg. Very
> confusing. So I added property entries for both spellings in the
> hadoop-site.xml file. Figured one of them would take effect. That
> appears to work fine. But I would like to get the correct spelling. Did
> a Google search and the spelling keeps popping up both ways, so I
> remain confused.
>
> I think the Hbase getting started documentation could use some 
> enhancement on file handle settings, xceiver (xciever?) settings, and 
> datanode handler count settings.
>  Ron
>
> ___________________________________________
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group Pacific Northwest 
> National Laboratory
> 902 Battelle Boulevard
> P.O. Box 999, MSIN K7-90
> Richland, WA  99352 USA
> Office:  509-372-6568
> Email: ronald.taylor@pnl.gov
> www.pnl.gov
>
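For reference, the both-spellings workaround described above amounts to
two entries in hadoop-site.xml; setting both is harmless, though per the
DataXceiverServer code quoted later in this thread only the "xcievers"
spelling is actually read in 0.19:

    <property>
        <name>dfs.datanode.max.xcievers</name>
        <value>2047</value>
    </property>

    <property>
        <name>dfs.datanode.max.xceivers</name>
        <value>2047</value>
    </property>
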
> -----Original Message-----
> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: Monday, April 06, 2009 6:47 PM
> To: Taylor, Ronald C
> Cc: hbase-user@hadoop.apache.org
> Subject: Re: Still need help with data upload into HBase
>
> I ran into a problem on ubuntu where /etc/security/limits.conf wasn't
> being honored due to a missing line in /etc/pam.d/common-session:
> "session required        pam_limits.so"
>
> This prevented the ulimits from being applied.
>
> can you sudo to the hadoop/hbase user and verify with ulimit -a ?
>
>
>
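For reference, a minimal sketch of the file-descriptor setup being
discussed; the "hadoop" user name and the 32768 limit are only examples
for whatever account actually runs the Hadoop/HBase daemons:

    # /etc/security/limits.conf
    hadoop  soft  nofile  32768
    hadoop  hard  nofile  32768

    # /etc/pam.d/common-session (Debian/Ubuntu; Red Hat keeps the
    # pam_limits line in other files under /etc/pam.d)
    session required        pam_limits.so

After logging in again as that user, "ulimit -n" (or "ulimit -a" for
everything) should report the new limit.
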
> On Mon, Apr 6, 2009 at 5:07 PM, Taylor, Ronald C
> <ro...@pnl.gov> wrote:
>
>>  Hello Ryan and the list,
>>
>> Well, I am still stuck. In addition to making the changes recommended
>> by Ryan to my hadoop-site.xml file (see below), I also added a line
>> for HBase to /etc/security/limits.conf and had the fs.file-max hugely
>> increased, to hopefully handle any file handle limit problem. Still no
>> luck with my upload program. It fails about where it did before,
>> around the loading of the 160,000th row into the one table that I
>> create in Hbase. Didn't get the "too many files open" msg, but did get
>> "handleConnectionFailure" in the same place in the upload.
>>
>> I then tried a complete reinstall of Hbase and Hadoop, upgrading from
>> 0.19.0 to 0.19.1. Used the same config parameters as before, and reran
>> the program. It fails again, at about the same number of rows uploaded
>> - and I'm back to getting "too many files open" as what I think is the
>> principal error msg.
>>
>> So - does anybody have any suggestions? I am running a
>> "pseudo-distributed" installation of Hadoop on one Red Hat Linux
>> machine with about ~3Gb of RAM. Are there any known problems with bulk
>> uploads when running "pseudo-distributed" on a single box, rather than
>> a true cluster? Is there anything else I can try?
>>  Ron
>>
>>
>> ___________________________________________
>> Ronald Taylor, Ph.D.
>> Computational Biology & Bioinformatics Group Pacific Northwest 
>> National Laboratory
>> 902 Battelle Boulevard
>> P.O. Box 999, MSIN K7-90
>> Richland, WA  99352 USA
>> Office:  509-372-6568
>> Email: ronald.taylor@pnl.gov
>> www.pnl.gov
>>
>>
>>  ------------------------------
>> *From:* Ryan Rawson [mailto:ryanobjc@gmail.com]
>> *Sent:* Friday, April 03, 2009 5:56 PM
>> *To:* Taylor, Ronald C
>> *Subject:* Re: FW: Still need help with data upload into HBase
>>
>> Welcome to hbase :-)
>>
>> This is pretty much how it goes for nearly every new user.
>>
>> We might want to review our docs...
>>
>> On Fri, Apr 3, 2009 at 5:54 PM, Taylor, Ronald C <ro...@pnl.gov> wrote:
>>
>>> Thanks. I'll make those settings, too, in addition to bumping up the
>>> file handle limit, and give it another go.
>>> Ron
>>>
>>> -----Original Message-----
>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>> Sent: Friday, April 03, 2009 5:48 PM
>>> To: hbase-user@hadoop.apache.org
>>>  Subject: Re: Still need help with data upload into HBase
>>>
>>> Hey,
>>>
>>> File handle - yes... there was a FAQ and/or getting started which 
>>> talks about upping lots of limits.
>>>
>>> I have these set in my hadoop-site.xml (that is read by datanode):
>>> <property>
>>> <name>dfs.datanode.max.xcievers</name>
>>> <value>2047</value>
>>> </property>
>>>
>>> <property>
>>> <name>dfs.datanode.handler.count</name>
>>> <value>10</value>
>>> </property>
>>>
>>> I should probably set the datanode.handler.count higher.
>>>
>>> Don't forget to toss a reasonable amount of ram at hdfs... not sure 
>>> what that is exactly, but -Xmx1000m wouldn't hurt.
>>>
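For reference, the daemon heap sizes are normally set in the env scripts
rather than by editing the launch command; a minimal sketch (values are
in MB, and 1000 is already the usual default):

    # conf/hadoop-env.sh  (Hadoop daemons, including the datanode)
    export HADOOP_HEAPSIZE=1000

    # conf/hbase-env.sh  (HBase daemons)
    export HBASE_HEAPSIZE=1000
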
>>> On Fri, Apr 3, 2009 at 5:44 PM, Taylor, Ronald C
>>> <ro...@pnl.gov> wrote:
>>>
>>>> Hi Ryan,
>>>>
>>>> Thanks for the info. Re checking the Hadoop datanode log file: I
>>>> just did so, and found a "too many open files" error. Checking the
>>>> Hbase FAQ, I see that I should drastically bump up the file handle
>>>> limit. So I will give that a try.
>>>>
>>>> Question: what does the xciver variable do? My hadoop-site.xml file
>>>> does not contain any entry for such a var. (Nothing reported in the
>>>> datanode log file either with the word "xciver".)
>>>>
>>>> Re using the local file system: well, as soon as I get a nice data
>>>> set loaded in, I'm starting a demo project manipulating it for our
>>>> Env Molecular Sciences Lab (EMSL), a DOE Nat User Facility. And I'm
>>>> supposed to be doing the manipulating using MapReduce programs, to
>>>> show the usefulness of such an approach. So I need Hadoop and the
>>>> HDFS. And so I would prefer to keep using Hbase on top of Hadoop,
>>>> rather than the local Linux file system. Hopefully the "small HDFS
>>>> clusters" issues you mention are survivable. Eventually, some of
>>>> this programming might wind up on Chinook, our 160 Teraflop
>>>> supercomputer cluster, but that's a ways down the road. I'm starting
>>>> on my Linux desktop.
>>>>
>>>> I'll try bumping up the file handle limit, restart Hadoop and Hbase,
>>>> and see what happens.
>>>> Ron
>>>>
>>>> ___________________________________________
>>>> Ronald Taylor, Ph.D.
>>>> Computational Biology & Bioinformatics Group Pacific Northwest 
>>>> National Laboratory
>>>> 902 Battelle Boulevard
>>>> P.O. Box 999, MSIN K7-90
>>>> Richland, WA  99352 USA
>>>> Office:  509-372-6568
>>>> Email: ronald.taylor@pnl.gov
>>>> www.pnl.gov
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>> Sent: Friday, April 03, 2009 5:08 PM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Still need help with data upload into HBase
>>>>
>>>> Hey,
>>>>
>>>> Can you check the datanode logs?  You might be running into the
>>>> dreaded xciver limit :-(
>>>>
>>>> try upping the xciver in hadoop-site.xml... i run at 2048.
>>>>
>>>> -ryan
>>>>
>>>> -----Original Message-----
>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>> Sent: Friday, April 03, 2009 5:13 PM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Still need help with data upload into HBase
>>>>
>>>> Non replicated yet is probably what you think - HDFS hasn't placed
>>>> blocks on more nodes yet.  This could be due to the pseudo
>>>> distributed nature of your set-up.  I'm not familiar with that
>>>> configuration, so I can't really say more.
>>>>
>>>> If you only have 1 machine, you might as well just go with local
>>>> files.  The HDFS gets you distributed replication, but until you
>>>> have many machines, it won't buy you anything and only cause
>>>> problems, since small HDFS clusters are known to have issues.
>>>>
>>>> Good luck (again!)
>>>> -ryan
>>>>
>>>> On Fri, Apr 3, 2009 at 5:07 PM, Ryan Rawson <ry...@gmail.com> wrote:
>>>>
>>>>> Hey,
>>>>>
>>>>> Can you check the datanode logs?  You might be running into the
>>>>> dreaded xciver limit :-(
>>>>>
>>>>> try upping the xciver in hadoop-site.xml... i run at 2048.
>>>>>
>>>>> -ryan
>>>>>
>>>>>
>>>>> On Fri, Apr 3, 2009 at 4:35 PM, Taylor, Ronald C
>>>>> <ro...@pnl.gov> wrote:
>>>>>> Hello folks,
>>>>>>
>>>>>> I have just tried using Ryan's doCommit() method for my bulk upload
>>>>>> into one Hbase table. No luck. I still start to get errors around
>>>>>> row 160,000. On-screen, the program starts to generate error msgs
>>>>>> like so:
>>>>>>
>>>>>> ...
>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried
>>>>>> 8 time(s).
>>>>>> Apr 3, 2009 2:39:52 PM
>>>>>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>>>>>> handleConnectionFailure
>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried
>>>>>> 9 time(s).
>>>>>> Apr 3, 2009 2:39:57 PM
>>>>>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>>>>>> handleConnectionFailure
>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried
>>>>>> 0 time(s).
>>>>>> Apr 3, 2009 2:39:58 PM
>>>>>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection
>>>>>> handleConnectionFailure
>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried
>>>>>> 1 time(s).
>>>>>> ...
>>>>>> In regard to log file information, I have appended at bottom some
>>>>>> of the output from my hbase-<user>-master-<machine>.log file, at
>>>>>> the place where it looks to me like things might have started to go
>>>>>> wrong. Several questions:
>>>>>>
>>>>>> 1)  Is there any readily apparent cause for such a
>>>>>> HBaseClient$Connection handleConnectionFailure to occur in a Hbase
>>>>>> installation configured on a Linux desktop to work in the
>>>>>> pseudo-distributed operation mode? From my understanding, even
>>>>>> importing ~200,000 rows (each row being filled with info for ten
>>>>>> columns) is a minimal data set for Hbase, and upload should not be
>>>>>> failing like this.
>>>>>>
>>>>>> FYI - minimal changes were made to the Hbase default settings in
>>>>>> the Hbase ../conf/ config files when I installed Hbase 0.19.0. I
>>>>>> have one entry in hbase-env.sh, to set JAVA_HOME, and one property
>>>>>> entry in hbase-site.xml, to set the hbase.rootdir.
>>>>>>
>>>>>> 2) My Linux box has about 3 Gb of memory. I left the HADOOP_HEAP
>>>>>> and HBASE_HEAP sizes at their default values, which I understand
>>>>>> are 1000Mb each. Should I have changed either value?
>>>>>>
>>>>>> 3) I left the dfs.replication value at the default of "3" in the
>>>>>> hadoop-site.xml file, for my test of pseudo-distributed operation.
>>>>>> Should I have changed that to "1", for operation on my single
>>>>>> machine? Downsizing to "1" would appear to me to negate trying out
>>>>>> Hadoop in the pseudo-distributed operation mode, so I left the
>>>>>> value "as is", but did I get this wrong? (A sample hadoop-site.xml
>>>>>> entry for this appears just below, after the signature.)
>>>>>>
>>>>>> 4) In the log output below, you can see that Hbase starts to block
>>>>>> and then unblock updates to my one Hbase table (called the
>>>>>> "ppInteractionTable", for protein-protein interaction table). A
>>>>>> little later, a msg says that the ppInteractionTable has been
>>>>>> closed. At this point, my program has *not* issued a command to
>>>>>> close the table - that only happens at the end of the program.
>>>>>> So - why is this happening?
>>>>>>
>>>>>> Also, near the end of my log extract, I get a different error msg:
>>>>>> NotReplicatedYetException. I have no idea what that means.
>>>>>> Actually, I don't really have a grasp yet on what any of these
>>>>>> error msgs is supposed to tell us. So - once again, any help would
>>>>>> be much appreciated.
>>>>>>
>>>>>>  Ron
>>>>>>
>>>>>> ___________________________________________
>>>>>> Ronald Taylor, Ph.D.
>>>>>> Computational Biology & Bioinformatics Group Pacific Northwest 
>>>>>> National Laboratory
>>>>>> 902 Battelle Boulevard
>>>>>> P.O. Box 999, MSIN K7-90
>>>>>> Richland, WA  99352 USA
>>>>>> Office:  509-372-6568
>>>>>> Email: ronald.taylor@pnl.gov
>>>>>> www.pnl.gov
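For reference, the single-node setting asked about in question 3 above
is what the stock Hadoop pseudo-distributed quickstart itself uses;
whether it is the right choice here is exactly the question being asked,
but the hadoop-site.xml entry would look like this:

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>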
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Taylor, Ronald C
>>>>>> Sent: Tuesday, March 31, 2009 5:48 PM
>>>>>> To: 'hbase-user@hadoop.apache.org'
>>>>>> Cc: Taylor, Ronald C
>>>>>> Subject: Novice Hbase user needs help with data upload - gets a 
>>>>>> RetriesExhaustedException, followed by 
>>>>>> NoServerForRegionException
>>>>>>
>>>>>>
>>>>>> Hello folks,
>>>>>>
>>>>>> This is my first msg to the list - I just joined today, and I am
>>>>>> a novice Hadoop/HBase programmer. I have a question:
>>>>>>
>>>>>> I have written a Java program to create an HBase table and then
>>>>>> enter a number of rows into the table. The only way I have found
>>>>>> so far to do this is to enter each row one-by-one, creating a new
>>>>>> BatchUpdate updateObj for each row, doing about ten
>>>>>> updateObj.put()'s to add the column data, and then doing a
>>>>>> tableObj.commit(updateObj). There's probably a more efficient way
>>>>>> (happy to hear, if so!), but this is what I'm starting with.
>>>>>>
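A minimal sketch of the row-at-a-time pattern described above, written
from memory against the 0.19-era client API (HTable / BatchUpdate); the
table name comes from this thread, while the column family, row keys and
values are made-up examples, so treat it as illustrative rather than
exact:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.BatchUpdate;

    public class PpInteractionLoader {
        public static void main(String[] args) throws Exception {
            HBaseConfiguration conf = new HBaseConfiguration();
            // "ppInteractionTable" is the table named in this thread;
            // the "data:" column family below is a made-up example.
            HTable table = new HTable(conf, "ppInteractionTable");
            for (int row = 0; row < 300000; row++) {
                // one BatchUpdate per row, keyed by the row name
                BatchUpdate update = new BatchUpdate("row-" + row);
                for (int col = 0; col < 10; col++) {
                    // old-style "family:qualifier" column, value as bytes
                    update.put("data:col" + col,
                               ("value-" + col).getBytes());
                }
                // one commit (one round trip) per row, as described above
                table.commit(update);
            }
        }
    }

Later client releases added a client-side write buffer
(HTable.setAutoFlush(false) plus setWriteBufferSize) that batches these
commits into fewer round trips; whether a given 0.19.x release already
has it is worth checking in its HTable javadoc.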
>>>>>> When I do this on input that creates 3000 rows, the program works
>>>>>> fine. When I try this on input that would create 300,000 rows
>>>>>> (still relatively small for an HBase table, I would think), the
>>>>>> program terminates around row 160,000 or so, generating first a
>>>>>> RetriesExhaustedException, followed by a
>>>>>> NoServerForRegionException. The HBase server crashes, and I have
>>>>>> to restart it. The Hadoop server appears to remain OK and does not
>>>>>> need restarting.
>>>>>>
>>>>>> Can anybody give me any guidance? I presume that I might need to
>>>>>> adjust some setting for larger input in the HBase and/or Hadoop
>>>>>> config files. At present, I am using default settings. I have
>>>>>> installed Hadoop 0.19.0 and HBase 0.19.0 in the "pseudo" cluster
>>>>>> mode on a single machine, my Red Hat Linux desktop, which has 3 Gb
>>>>>> RAM.
>>>>>>
>>>>>> Any help / suggestions would be much appreciated.
>>>>>>
>>>>>>  Cheers,
>>>>>>   Ron Taylor
>
>   

Re: setting xciever number limit

Posted by Ryan Rawson <ry...@gmail.com>.
Sic is Latin for 'so, thus'.  Meaning: the word looks odd or wrong, but
it is quoted exactly as it is in the original.

Of course, an xciever is something that transmits and receives, but
apparently "i before e, except after c" didn't apply this time.

On Thu, Apr 9, 2009 at 1:22 PM, Lars George <la...@worldlingo.com> wrote:

> Hi Ron,
>
> I am adding it to indicate the misspelling. Usually these keywords are of
> course in the hadoop-defaults.xml, but that seems not to be the case with
> this rather new one. Maybe the 0.19.1 and newer is all good, my 0.19.0 does
> not have it though.
>
> And to clarify the spelling, here the code from Hadoop's
> DataXceiverServer.java:
>
>   this.maxXceiverCount = conf.getInt("dfs.datanode.max.xcievers",
>       MAX_XCEIVER_COUNT);
>
> HTH,
>
> Lars

RE: setting xciever number limit

Posted by "Taylor, Ronald C" <ro...@pnl.gov>.
 
Hi Lars,

Thanks for the clarification. Yep, like you, I couldn't find anything in
hadoop-defaults.xml (or in any other config file). I hope there will be
default entries for all parameters in the next release, so that I'll
have a reference in one place as to what makes up the full parameter
universe (haven't found any such listing yet outside the config files
themselves).

 Ron

-----Original Message-----
From: Lars George [mailto:lars@worldlingo.com] 
Sent: Thursday, April 09, 2009 1:23 PM
To: hbase-user@hadoop.apache.org
Subject: Re: setting xciever number limit

Hi Ron,

I am adding it to indicate the misspelling. Usually these keywords are
of course in the hadoop-defaults.xml, but that seems not to be the case
with this rather new one. Maybe the 0.19.1 and newer is all good, my
0.19.0 does not have it though.

And to clarify the spelling, here the code from Hadoop's
DataXceiverServer.java:

    this.maxXceiverCount = conf.getInt("dfs.datanode.max.xcievers",
        MAX_XCEIVER_COUNT);

HTH,
Lars



Re: setting xciever number limit

Posted by Lars George <la...@worldlingo.com>.
Hi Ron,

I am adding it to indicate the misspelling. Usually these keys are, of 
course, in hadoop-default.xml, but that seems not to be the case with 
this rather new one. Maybe 0.19.1 and newer are all good; my 0.19.0 
does not have it, though.

And to clarify the spelling, here is the code from Hadoop's 
DataXceiverServer.java:

    this.maxXceiverCount = conf.getInt("dfs.datanode.max.xcievers",
        MAX_XCEIVER_COUNT);
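
If you want to double-check which spelling your running build actually 
honors, a small sketch like the one below reads both keys back through 
Hadoop's Configuration, which loads hadoop-default.xml and 
hadoop-site.xml from the classpath. The class name and the fallback 
values are just for illustration (256 mirrors the built-in limit seen 
in the "exceeds the limit of concurrent xcievers 256" log message); 
this is not anything shipped with Hadoop or HBase:

    import org.apache.hadoop.conf.Configuration;

    // Hypothetical one-off check, not part of Hadoop or HBase.
    public class XcieverSpellingCheck {
        public static void main(String[] args) {
            // Loads hadoop-default.xml and hadoop-site.xml from the classpath.
            Configuration conf = new Configuration();
            // The key the DataNode actually reads (note the "ie" spelling);
            // 256 stands in for the built-in default as a fallback here.
            System.out.println("dfs.datanode.max.xcievers = "
                + conf.getInt("dfs.datanode.max.xcievers", 256));
            // The "expected" spelling, printed only for comparison; -1 means "not set".
            System.out.println("dfs.datanode.max.xceivers = "
                + conf.getInt("dfs.datanode.max.xceivers", -1));
        }
    }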

HTH,
Lars


Taylor, Ronald C wrote:
>  
> Hi Lars,
>
> I looked up the Troubleshooting entry, as you suggested. It says
> "dfs.datanode.max.xcievers (sic)". Um ... isn't the "(sic)" supposed to
> indicate that the *wrong* spelling is being used in the previous phrase?
> Or is the "(sic)" being used here to signify that yes, we know this is
> an unexpected spelling, but use it anyway?
> Ron
>
> -----Original Message-----
> From: Lars George [mailto:lars@worldlingo.com] 
> Sent: Thursday, April 09, 2009 1:09 AM
> To: hbase-user@hadoop.apache.org
> Cc: Taylor, Ronald C
> Subject: Re: Still need help with data upload into HBase
>
> Hi Ron,
>
> The syntax is like this (sic):
>
>     <property>
>         <name>dfs.datanode.max.xcievers</name>
>         <value>4096</value>
>     </property>
>
> and it is documented on the HBase wiki here: 
> http://wiki.apache.org/hadoop/Hbase/Troubleshooting
>
> Regards,
> Lars
>
>
> Taylor, Ronald C wrote:
>   
>>  
>> Hi Ryan,
>>
>> Thanks for the suggestion on checking whether the number of file 
>> handles allowed actually gets increased after I make the change the 
>> /etc/security/limits.conf.
>>
>> Turns out it was not. I had to check with one of our sysadmins so that
>> the new 32K number of handles setting actually gets used on my Red Hat
>> box.
>>
>> With that, and with one other change which I'll get to in a moment, I 
>> finally was able to read in all the rows that I wanted, instead of the
>> program breaking before finishing. Checked the table by scanning it - 
>> looks OK. So - it looks like things are working as they should.
>>
>> Thank you very much for the help.
>>
>> Now as to the other parameter that needed changing: I found that the 
>> xceivers (xcievers?) limit was not being bumped up - I was crashing on
>> that. I went to add what Ryan suggested in hadoop-site.xml, i.e.,
>>
>> <property>
>> <name>dfs.datanode.max.xcievers</name>
>> <value>2047</value>
>> </property>
>>
>> and discovered that I did not know whether to use 
>> "dfs.datanode.max.xcievers" or "dfs.datanode.max.xceivers", where the 
>> "i" and "e" switch. I was getting error msgs in the log files with
>>
>>   "xceiverCount 257 exceeds the limit of concurrent xcievers 256"
>>
>>  with BOTH spelling variants employed within the same error msg. Very 
>> confusing. So I added property entries for both spellings in the 
>> hadoop-site.xml file. Figured one of them would take effect. That 
>> appears to work fine. But I would like get the correct spelling. Did a
>> Google search and the spelling keeps popping up both ways, so I remain
>> confused.
>>
>> I think the Hbase getting started documentation could use some 
>> enhancement on file handle settings, xceiver (xciever?) settings, and 
>> datanode handler count settings.
>>  Ron
>>
>> ___________________________________________
>> Ronald Taylor, Ph.D.
>> Computational Biology & Bioinformatics Group Pacific Northwest 
>> National Laboratory
>> 902 Battelle Boulevard
>> P.O. Box 999, MSIN K7-90
>> Richland, WA  99352 USA
>> Office:  509-372-6568
>> Email: ronald.taylor@pnl.gov
>> www.pnl.gov
>>
>> -----Original Message-----
>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>> Sent: Monday, April 06, 2009 6:47 PM
>> To: Taylor, Ronald C
>> Cc: hbase-user@hadoop.apache.org
>> Subject: Re: Still need help with data upload into HBase
>>
>> I ran into a problem on ubuntu where /etc/security/limits.conf wasnt 
>> being honored due to a missing line in /etc/pam.d/common-session:
>> "session required        pam_limits.so"
>>
>> this prevented the ulimits from being run.
>>
>> can you sudo to the hadoop/hbase user and verify with ulimit -a ?
>>
>>
>>
>> On Mon, Apr 6, 2009 at 5:07 PM, Taylor, Ronald C
>> <ro...@pnl.gov>wrote:
>>
>>>  Hello Ryan and the list,
>>>
>>> Well, I am still stuck. In addition to making the changes recommended
>>> by Ryan to my hadoop-site.xml file (see below), I also added a line
>>> for HBase to /etc/security/limits.conf and had the fs.file-max hugely
>>> increased, to hopefully handle any file handle limit problem. Still no
>>> luck with my upload program. It fails about where it did before,
>>> around the loading of the 160,000th row into the one table that I
>>> create in Hbase. Didn't get the "too many file open" msg, but did get
>>> "handleConnectionFailure" in the same place in the upload.
>>>
>>> I then tried a complete reinstall of Hbase and Hadoop, upgrading from
>>> 0.19.0 to 0.19.1. Used the same config parameters as before, and reran
>>> the program. It fails again, at about the same number of rows uploaded
>>> - and I'm back to getting "too many files open" as what I think is the
>>> principal error msg.
>>>
>>> So - does anybody have any suggestions? I am running a
>>> "pseudo-distributed" installation of Hadoop on one Red Hat Linux
>>> machine with about ~3Gb of RAM. Are there any known problems with bulk
>>> uploads when running "pseudo-distributed" on a single box, rather than
>>> a true cluster? Is there anything else I can try?
>>>  Ron
>>>
>>>
>>> ___________________________________________
>>> Ronald Taylor, Ph.D.
>>> Computational Biology & Bioinformatics Group
>>> Pacific Northwest National Laboratory
>>> 902 Battelle Boulevard
>>> P.O. Box 999, MSIN K7-90
>>> Richland, WA  99352 USA
>>> Office:  509-372-6568
>>> Email: ronald.taylor@pnl.gov
>>> www.pnl.gov
>>>
>>>
>>>  ------------------------------
>>> *From:* Ryan Rawson [mailto:ryanobjc@gmail.com]
>>> *Sent:* Friday, April 03, 2009 5:56 PM
>>> *To:* Taylor, Ronald C
>>> *Subject:* Re: FW: Still need help with data upload into HBase
>>>
>>> Welcome to hbase :-)
>>>
>>> This is pretty much how it goes for nearly every new user.
>>>
>>> We might want to review our docs...
>>>
>>> On Fri, Apr 3, 2009 at 5:54 PM, Taylor, Ronald C <ro...@pnl.gov> wrote:
>>>
>>>> Thanks. I'll make those settings, too, in addition to bumping up the
>>>> file handle limit, and give it another go.
>>>> Ron
>>>>
>>>> -----Original Message-----
>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>> Sent: Friday, April 03, 2009 5:48 PM
>>>> To: hbase-user@hadoop.apache.org
>>>> Subject: Re: Still need help with data upload into HBase
>>>>
>>>> Hey,
>>>>
>>>> File handle - yes... there was a FAQ and/or getting started which
>>>> talks about upping lots of limits.
>>>>
>>>> I have these set in my hadoop-site.xml (that is read by datanode):
>>>> <property>
>>>> <name>dfs.datanode.max.xcievers</name>
>>>> <value>2047</value>
>>>> </property>
>>>>
>>>> <property>
>>>> <name>dfs.datanode.handler.count</name>
>>>> <value>10</value>
>>>> </property>
>>>>
>>>> I should probably set the datanode.handler.count higher.
>>>>
>>>> Don't forget to toss a reasonable amount of ram at hdfs... not sure
>>>> what that is exactly, but -Xmx1000m wouldn't hurt.
>>>>
>>>> On Fri, Apr 3, 2009 at 5:44 PM, Taylor, Ronald C <ro...@pnl.gov> wrote:
>>>>
>>>>> Hi Ryan,
>>>>>
>>>>> Thanks for the info. Re checking the Hadoop datanode log file: I
>>>>> just did so, and found a "too many open files" error. Checking the
>>>>> Hbase FAQ, I see that I should drastically bump up the file handle
>>>>> limit. So I will give that a try.
>>>>>
>>>>> Question: what does the xciver variable do? My hadoop-site.xml file
>>>>> does not contain any entry for such a var. (Nothing reported in the
>>>>> datalog file either with the word "xciver".)
>>>>>
>>>>> Re using the local file system: well, as soon as I get a nice data
>>>>> set loaded in, I'm starting a demo project manipulating it for our
>>>>> Env Molecular Sciences Lab (EMSL), a DOE Nat User Facility. And I'm
>>>>> supposed to be doing the manipulating using MapReduce programs, to
>>>>> show the usefulness of such an approach. So I need Hadoop and the
>>>>> HDFS. And so I would prefer to keep using Hbase on top of Hadoop,
>>>>> rather than the local Linux file system. Hopefully the "small HDFS
>>>>> clusters" issues you mention are survivable. Eventually, some of
>>>>> this programming might wind up on Chinook, our 160 Teraflop
>>>>> supercomputer cluster, but that's a ways down the road. I'm starting
>>>>> on my Linux desktop.
>>>>>
>>>>> I'll try bumping up the file handle limit, restart Hadoop and Hbase,
>>>>> and see what happens.
>>>>> Ron
>>>>>
>>>>> ___________________________________________
>>>>> Ronald Taylor, Ph.D.
>>>>> Computational Biology & Bioinformatics Group
>>>>> Pacific Northwest National Laboratory
>>>>> 902 Battelle Boulevard
>>>>> P.O. Box 999, MSIN K7-90
>>>>> Richland, WA  99352 USA
>>>>> Office:  509-372-6568
>>>>> Email: ronald.taylor@pnl.gov
>>>>> www.pnl.gov
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>>> Sent: Friday, April 03, 2009 5:08 PM
>>>>> To: hbase-user@hadoop.apache.org
>>>>> Subject: Re: Still need help with data upload into HBase
>>>>>
>>>>> Hey,
>>>>>
>>>>> Can you check the datanode logs?  You might be running into the
>>>>> dreaded xciver limit :-(
>>>>>
>>>>> try upping the xciver in hadoop-site.xml... i run at 2048.
>>>>>
>>>>> -ryan
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
>>>>> Sent: Friday, April 03, 2009 5:13 PM
>>>>> To: hbase-user@hadoop.apache.org
>>>>> Subject: Re: Still need help with data upload into HBase
>>>>>
>>>>> Non replicated yet is probably what you think - HDFS hasn't placed
>>>>> blocks on more nodes yet.  This could be due to the pseudo
>>>>> distributed nature of your set-up.  I'm not familiar with that
>>>>> configuration, so I can't really say more.
>>>>>
>>>>> If you only have 1 machine, you might as well just go with local
>>>>> files. The HDFS gets you distributed replication, but until you have
>>>>> many machines, it won't buy you anything and only cause problems,
>>>>> since small HDFS clusters are known to have issues.
>>>>>
>>>>> Good luck (again!)
>>>>> -ryan
>>>>>
>>>>> On Fri, Apr 3, 2009 at 5:07 PM, Ryan Rawson <ry...@gmail.com> wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> Can you check the datanode logs?  You might be running into the
>>>>>> dreaded xciver limit :-(
>>>>>>
>>>>>> try upping the xciver in hadoop-site.xml... i run at 2048.
>>>>>>
>>>>>> -ryan
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 3, 2009 at 4:35 PM, Taylor, Ronald C <ro...@pnl.gov> wrote:
>>>>>>
>>>>>>> Hello folks,
>>>>>>>
>>>>>>> I have just tried using Ryan's doCommit() method for my bulk upload
>>>>>>> into one Hbase table. No luck. I still start to get errors around row
>>>>>>> 160,000. On-screen, the program starts to generate error msgs like so:
>>>>>>> ...
>>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 8 time(s).
>>>>>>> Apr 3, 2009 2:39:52 PM
>>>>>>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
>>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 9 time(s).
>>>>>>> Apr 3, 2009 2:39:57 PM
>>>>>>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
>>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 0 time(s).
>>>>>>> Apr 3, 2009 2:39:58 PM
>>>>>>> org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
>>>>>>> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 1 time(s).
>>>>>>> ...
>>>>>>> In regard to log file information, I have appended at bottom some of
>>>>>>> the output from my hbase-<user>-master-<machine>.log file, at the
>>>>>>> place where it looks to me like things might have started to go
>>>>>>> wrong. Several questions:
>>>>>>>
>>>>>>> 1) Is there any readily apparent cause for such an
>>>>>>> HBaseClient$Connection handleConnectionFailure to occur in a Hbase
>>>>>>> installation configured on a Linux desktop to work in the
>>>>>>> pseudo-distributed operation mode? From my understanding, even
>>>>>>> importing ~200,000 rows (each row being filled with info for ten
>>>>>>> columns) is a minimal data set for Hbase, and upload should not be
>>>>>>> failing like this.
>>>>>>>
>>>>>>> FYI - minimal changes were made to the Hbase default settings in the
>>>>>>> Hbase ../conf/ config files when I installed Hbase 0.19.0. I have one
>>>>>>> entry in hbase-env.sh, to set JAVA_HOME, and one property entry in
>>>>>>> hbase-site.xml, to set the hbase.rootdir.
>>>>>>>
>>>>>>> 2) My Linux box has about 3 Gb of memory. I left the HADOOP_HEAP and
>>>>>>> HBASE_HEAP sizes at their default values, which I understand are
>>>>>>> 1000Mb each. Should I have changed either value?
>>>>>>>
>>>>>>> 3) I left the dfs.replication value at the default of "3" in the
>>>>>>> hadoop-site.xml file, for my test of pseudo-distributed operation.
>>>>>>> Should I have changed that to "1", for operation on my single machine?
>>>>>>> Downsizing to "1" would appear to me to negate trying out Hadoop in
>>>>>>> the pseudo-distributed operation mode, so I left the value "as is",
>>>>>>> but did I get this wrong?
>>>>>>>
>>>>>>> 4) In the log output below, you can see that Hbase starts to block and
>>>>>>> then unblock updates to my one Hbase table (called the
>>>>>>> "ppInteractionTable", for protein-protein interaction table). A little
>>>>>>> later, a msg says that the ppInteractionTable has been closed. At this
>>>>>>> point, my program has *not* issued a command to close the table - that
>>>>>>> only happens at the end of the program. So - why is this happening?
>>>>>>> Also, near the end of my log extract, I get a different error msg:
>>>>>>> NotReplicatedYetException. I have no idea what that means. Actually, I
>>>>>>> don't really have a grasp yet on what any of these error msgs is
>>>>>>> supposed to tell us. So - once again, any help would be much
>>>>>>> appreciated.
>>>>>>>
>>>>>>>  Ron
>>>>>>>
>>>>>>> ___________________________________________
>>>>>>> Ronald Taylor, Ph.D.
>>>>>>> Computational Biology & Bioinformatics Group
>>>>>>> Pacific Northwest National Laboratory
>>>>>>> 902 Battelle Boulevard
>>>>>>> P.O. Box 999, MSIN K7-90
>>>>>>> Richland, WA  99352 USA
>>>>>>> Office:  509-372-6568
>>>>>>> Email: ronald.taylor@pnl.gov
>>>>>>> www.pnl.gov
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Taylor, Ronald C
>>>>>>> Sent: Tuesday, March 31, 2009 5:48 PM
>>>>>>> To: 'hbase-user@hadoop.apache.org'
>>>>>>> Cc: Taylor, Ronald C
>>>>>>> Subject: Novice Hbase user needs help with data upload - gets a
>>>>>>> RetriesExhaustedException, followed by NoServerForRegionException
>>>>>>>
>>>>>>>
>>>>>>> Hello folks,
>>>>>>>
>>>>>>> This is my first msg to the list - I just joined today, and I am a
>>>>>>> novice Hadoop/HBase programmer. I have a question:
>>>>>>>
>>>>>>> I have written a Java program to create an HBase table and then enter
>>>>>>> a number of rows into the table. The only way I have found so far to
>>>>>>> do this is to enter each row one-by-one, creating a new BatchUpdate
>>>>>>> updateObj for each row, doing about ten updateObj.put()'s to add the
>>>>>>> column data, and then doing a tableObj.commit(updateObj). There's
>>>>>>> probably a more efficient way (happy to hear, if so!), but this is
>>>>>>> what I'm starting with.
>>>>>>>
>>>>>>> When I do this on input that creates 3000 rows, the program works
>>>>>>> fine. When I try this on input that would create 300,000 rows (still
>>>>>>> relatively small for an HBase table, I would think), the program
>>>>>>> terminates around row 160,000 or so, generating first a
>>>>>>> RetriesExhaustedException, followed by a NoServerForRegionException.
>>>>>>> The HBase server crashes, and I have to restart it. The Hadoop server
>>>>>>> appears to remain OK and does not need restarting.
>>>>>>>
>>>>>>> Can anybody give me any guidance? I presume that I might need to
>>>>>>> adjust some setting for larger input in the HBase and/or Hadoop config
>>>>>>> files. At present, I am using default settings. I have installed
>>>>>>> Hadoop 0.19.0 and HBase 0.19.0 in the "pseudo" cluster mode on a
>>>>>>> single machine, my Red Hat Linux desktop, which has 3 Gb RAM.
>>>>>>>
>>>>>>> Any help / suggestions would be much appreciated.
>>>>>>>
>>>>>>>  Cheers,
>>>>>>>   Ron Taylor
>
>