Posted to common-user@hadoop.apache.org by Marc Harris <mh...@jumptap.com> on 2008/01/22 18:08:00 UTC

Restrictions on the string that can be used for the hadoop key

Are there any restrictions on the string that can be used as the key for
an hbase row? I ask because I am using strings of the form:
    <URL><space><alphanumerics><comma><numerics>

and I frequently get problems that seem to start with the following log
message (in my regionserver log file):

2008-01-18 17:24:21,512 FATAL org.apache.hadoop.hbase.HRegionServer: Set stop flag in regionserver/0:0:0:0:0:0:0:0:60020.splitOrCompactChecker
java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal character in scheme name at index 7: hregion_pagefetch,http://someurl.com/fo/bar%20abcd1,5538225121025076292

This region server then appears to shut down, and restarting everything
(hbase and all hadoop processes) still fails with that same error. I end
up having to re-format the entire hadoop directory.

Can anyone shed some light on what may be happening? It looks to me
like something is adding the prefix "hregion_" to the beginning of my
key, and something else is interpreting the whole thing as a URL and
getting very confused.

Thanks.
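
The exception text in the log can be reproduced with java.net.URI alone, which is consistent with the guess that the prefixed key ends up in a URI parser somewhere. The thread does not show the actual HBase or DFS code path, so this is only a minimal sketch of the parsing failure itself. The class name RegionNameUriDemo and the helper parseFailure are made up for illustration; the input string is the one from the log.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RegionNameUriDemo {
    // The string reported in the region server log.
    static final String REGION_NAME =
        "hregion_pagefetch,http://someurl.com/fo/bar%20abcd1,5538225121025076292";

    // Returns the parse failure for the input, or null if it parses as a URI.
    // (Hypothetical helper, not part of HBase or Hadoop.)
    static URISyntaxException parseFailure(String s) {
        try {
            new URI(s);
            return null;
        } catch (URISyntaxException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        URISyntaxException e = parseFailure(REGION_NAME);
        if (e != null) {
            // Reason and index are the same two pieces that appear in the log message.
            System.out.println(e.getReason() + " at index " + e.getIndex());
            System.out.println("offending character: " + REGION_NAME.charAt(e.getIndex()));
        }
    }
}
```

Because no "/", "?" or "#" comes before the first ":", the parser treats everything before that colon ("hregion_pagefetch,http") as the scheme, and the "_" at index 7 is not a legal scheme character.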

Re: Restrictions on the string that can be used for the hadoop key

Posted by Mike Forrest <mf...@trailfire.com>.
HADOOP-2056 seems to address this.



Re: Restrictions on the string that can be used for the hadoop key

Posted by Bryan Duxbury <br...@rapleaf.com>.
Marc,

If there isn't an issue for this in the bug tracker, please enter  
one. If you can, look in your region server logs for where this  
exception occurs and post the entire stack trace. This might be  
something having to do with the way files are named in DFS.

-Bryan



Re: Restrictions on the string that can be used for the hadoop key

Posted by stack <st...@duboce.net>.
Lads, you both must be running 0.15.x hbase?  This has been fixed in 
TRUNK (HADOOP-2079).
St.Ack



Re: Restrictions on the string that can be used for the hadoop key

Posted by Marc Harris <mh...@jumptap.com>.
Do you know if there is a bug in the hbase bug tracking system about
this?
a) I don't see this restriction specified in any documentation.
b) It's quite bad that a client error should take the server down.
c) It's really bad that it seems to corrupt the data.

- Marc



Re: Restrictions on the string that can be used for the hadoop key

Posted by Mike Forrest <mf...@trailfire.com>.
I ran into a similar error, and it ended up being due to the colon (":")
character in the URL. Encoding the URL or just stripping the colon
character(s) fixed it.
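
One way to do the encoding on the client side is sketched below, using the standard java.net.URLEncoder so that the key no longer contains ":" or "/". The class name RowKeyEncoding, the helpers encodeUrl and rowKey, and the sample values are assumptions for illustration; the key layout follows the <URL><space><alphanumerics><comma><numerics> form from the original post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class RowKeyEncoding {
    // Percent-encodes the URL part so the resulting key has no ':' or '/'.
    static String encodeUrl(String url) {
        try {
            return URLEncoder.encode(url, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform.
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Hypothetical key layout: <encoded URL><space><alphanumerics><comma><numerics>
    static String rowKey(String url, String tag, long number) {
        return encodeUrl(url) + " " + tag + "," + number;
    }

    public static void main(String[] args) {
        System.out.println(rowKey("http://someurl.com/fo/bar", "abcd1", 42L));
    }
}
```

Encoded keys sort differently from the raw URLs and have to be decoded (for example with java.net.URLDecoder) when read back, so this is a workaround on affected versions rather than a fix.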
