Posted to common-user@hadoop.apache.org by Marc Harris <mh...@jumptap.com> on 2008/01/22 18:08:00 UTC
Restrictions on the string that can be used for the hadoop key
Are there any restrictions on the string that can be used as the key for
an hbase row? I ask because I am using strings of the form:
<URL><space><alphanumerics><comma><numerics>
and I frequently get problems that seem to start with the following log
message (in my regionserver log file):
2008-01-18 17:24:21,512 FATAL org.apache.hadoop.hbase.HRegionServer: Set
stop flag in regionserver/0:0:0:0:0:0:0:0:60020.splitOrCompactChecker
java.lang.IllegalArgumentException: java.net.URISyntaxException: Illegal
character in scheme name at index 7:
hregion_pagefetch,http://someurl.com/fo/bar%20abcd1,5538225121025076292
This region server then appears to shut down, and restarting everything
(hbase and all hadoop processes) still fails with that same error. I end
up having to re-format the entire hadoop directory.
Can anyone shed some light on what may be happening? It looks to me
like something is adding the prefix "hregion_" to the beginning of my
key, and something else is interpreting the whole thing as a URL and
getting very confused.
Thanks.
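The log message above is what you would get if the whole region name were handed to a URI parser. Here is a minimal sketch of that, assuming the failure comes from parsing the name as a URI. The class and method names are hypothetical, and this uses only java.net.URI, not the actual HBase or DFS code path:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class, not part of HBase.
class RegionNameUriDemo {
    // Returns the parser's complaint, or null if the string parses as a URI.
    static String uriError(String s) {
        try {
            new URI(s);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason() + " at index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        // The name from the log: everything before the first ':' is taken
        // as the scheme, and '_' (index 7) is not a legal scheme character.
        String name =
            "hregion_pagefetch,http://someurl.com/fo/bar%20abcd1,5538225121025076292";
        System.out.println(uriError(name));
        // prints: Illegal character in scheme name at index 7

        // With no ':' in the name there is no scheme to validate.
        System.out.println(uriError("hregion_pagefetch,someurl.com,1"));
        // prints: null
    }
}
```

So the "hregion_" prefix only becomes a problem because the ":" from "http:" makes everything before it look like a URI scheme.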
Re: Restrictions on the string that can be used for the hadoop key
Posted by Mike Forrest <mf...@trailfire.com>.
HADOOP-2056 seems to address this.
Bryan Duxbury wrote:
> Marc,
>
> If there isn't an issue for this in the bug tracker, please enter one.
> If you can, look in your region server logs for where this exception
> occurs and post the entire stack trace. This might be something having
> to do with the way files are named in DFS.
>
> -Bryan
Re: Restrictions on the string that can be used for the hadoop key
Posted by Bryan Duxbury <br...@rapleaf.com>.
Marc,
If there isn't an issue for this in the bug tracker, please enter
one. If you can, look in your region server logs for where this
exception occurs and post the entire stack trace. This might be
something having to do with the way files are named in DFS.
-Bryan
On Jan 22, 2008, at 9:28 AM, Marc Harris wrote:
> Do you know if there is a bug in the hbase bug tracking system about
> this?
> a) I don't see this restriction specified in any documentation.
> b) It's quite bad that a client error should take the server down
> c) It's really bad that it seems to corrupt the data.
>
> - Marc
Re: Restrictions on the string that can be used for the hadoop key
Posted by stack <st...@duboce.net>.
Lads, you both must be running 0.15.x hbase? This has been fixed in
TRUNK (HADOOP-2079).
St.Ack
Marc Harris wrote:
> Do you know if there is a bug in the hbase bug tracking system about
> this?
> a) I don't see this restriction specified in any documentation.
> b) It's quite bad that a client error should take the server down
> c) It's really bad that it seems to corrupt the data.
>
> - Marc
Re: Restrictions on the string that can be used for the hadoop key
Posted by Marc Harris <mh...@jumptap.com>.
Do you know if there is a bug in the hbase bug tracking system about
this?
a) I don't see this restriction specified in any documentation.
b) It's quite bad that a client error should take the server down
c) It's really bad that it seems to corrupt the data.
- Marc
On Tue, 2008-01-22 at 09:23 -0800, Mike Forrest wrote:
> I ran into a similar error, and it ended up being due to the colon (":")
> character in the URL. Encoding the URL or just stripping the colon
> character(s) fixed it.
Re: Restrictions on the string that can be used for the hadoop key
Posted by Mike Forrest <mf...@trailfire.com>.
I ran into a similar error, and it ended up being due to the colon (":")
character in the URL. Encoding the URL or just stripping the colon
character(s) fixed it.
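As a rough illustration of the encoding workaround, here is a sketch that percent-encodes the key before it is used as a row key and decodes it on the way back out. The class name, method names, and sample key are made up for the example; only java.net.URLEncoder and java.net.URLDecoder are real APIs, and nothing here touches the HBase client:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper, not part of HBase.
class RowKeyEncoding {
    // Percent-encode the raw key so it contains no ':' or '/' characters.
    static String encodeKey(String rawKey) {
        try {
            return URLEncoder.encode(rawKey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Reverse of encodeKey, for reading keys back.
    static String decodeKey(String storedKey) {
        try {
            return URLDecoder.decode(storedKey, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample key of the form <URL><space><alphanumerics><comma><numerics>.
        String raw = "http://someurl.com/fo/bar abcd1,42";
        String stored = encodeKey(raw);
        System.out.println(stored);
        System.out.println(decodeKey(stored).equals(raw));
    }
}
```

One thing to keep in mind with this approach: encoded keys do not sort in the same order as the raw ones, so any scan start or stop key would need the same encoding applied.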