Posted to dev@hive.apache.org by Ho Kenneth - kennho <Ke...@acxiom.com> on 2013/01/10 03:32:11 UTC

unicode character as delimiter

Hi all,

I have an input file that uses a Unicode character, þ (thorn), as its field delimiter.

For example:

col1þcol2þcol3

  þ (lowercase thorn, U+00FE) has the UTF-8 encoding (hex) 0xC3 0xBE (c3be)
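
A quick way to double-check those byte values is the minimal Python sketch below. Python is an assumption here (the thread itself only uses HiveQL), and the helper name `hex_bytes` is hypothetical. It shows that 0xC3 0xBE is the UTF-8 form of lowercase þ, that uppercase Þ encodes differently, and that the same character is a single byte in Latin-1.

```python
# Inspect how the thorn character is represented in different encodings.
lower_thorn = "\u00fe"  # þ, LATIN SMALL LETTER THORN
upper_thorn = "\u00de"  # Þ, LATIN CAPITAL LETTER THORN


def hex_bytes(text, encoding):
    """Return the encoded bytes of `text` as a space-separated hex string."""
    return " ".join("0x{:02X}".format(b) for b in text.encode(encoding))


utf8_lower = hex_bytes(lower_thorn, "utf-8")      # "0xC3 0xBE" (two bytes)
utf8_upper = hex_bytes(upper_thorn, "utf-8")      # "0xC3 0x9E" (two bytes)
latin1_lower = hex_bytes(lower_thorn, "latin-1")  # "0xFE" (one byte)

print(utf8_lower, "|", utf8_upper, "|", latin1_lower)
```

So whether the delimiter is one byte or two depends on how the input file is encoded, which is worth confirming before choosing a delimiter setting.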

I have tried the following, but with no luck:
create table test(col1 string, col2 string, col3 string) row format delimited fields terminated by '\c3be';

I'd appreciate your help! Thanks in advance.

--ken



***************************************************************************
The information contained in this communication is confidential, is
intended only for the use of the recipient named above, and may be legally
privileged.

If the reader of this message is not the intended recipient, you are
hereby notified that any dissemination, distribution or copying of this
communication is strictly prohibited.

If you have received this communication in error, please resend this
communication to the sender and delete the original message or any copy
of it from your computer system.

Thank You.
****************************************************************************

Re: unicode character as delimiter

Posted by Ho Kenneth - kennho <Ke...@acxiom.com>.
I'd appreciate it if someone could help out with this issue. Tons of thanks!  :)

I have tried many different combinations but am still not able to get it to
work.

Q: How do we parse the delimiter "þ"?
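
One possible workaround, sketched under assumptions rather than offered as the answer from this thread: rewrite the file before loading it so that every þ becomes Ctrl-A (\001), which is Hive's default field delimiter. The helper names `replace_delimiter` and `convert_file` and any file paths are hypothetical. The sketch assumes the input is UTF-8 and that no field already contains \001.

```python
# Preprocess a thorn-delimited text file into a Ctrl-A delimited one.
THORN = "\u00fe"   # þ, the delimiter used in the input file
CTRL_A = "\x01"    # ^A, Hive's default field delimiter


def replace_delimiter(line, old=THORN, new=CTRL_A):
    """Return `line` with every occurrence of `old` replaced by `new`."""
    return line.replace(old, new)


def convert_file(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path, swapping the delimiter on every line."""
    with open(src_path, encoding=encoding, newline="") as src, \
            open(dst_path, "w", encoding=encoding, newline="") as dst:
        for line in src:
            dst.write(replace_delimiter(line))


converted = replace_delimiter("col1\u00fecol2\u00fecol3")
fields = converted.split(CTRL_A)
print(fields)
```

With the file converted this way, the table would not need a custom delimiter clause at all, at the cost of an extra pass over the data.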



On 1/10/13 8:08 AM, "Ho Kenneth - kennho" <Ke...@acxiom.com> wrote:

>Thanks for the quick response.
>
>I tried '\376', but it is still not working  :(
>
>
>
>On 1/10/13 6:23 AM, "Dean Wampler" <de...@thinkbiganalytics.com>
>wrote:
>
>>You have to use the octal representation, e.g., ^A is \001.


Re: unicode character as delimiter

Posted by Ho Kenneth - kennho <Ke...@acxiom.com>.
Thanks for the quick response.

I tried '\376', but it is still not working  :(
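
A possible explanation, assuming the file really is UTF-8 as described earlier: octal 376 is the single byte 0xFE, which is how Latin-1 stores þ, while UTF-8 stores it as the two bytes 0xC3 0xBE. The Python sketch below only illustrates that byte-level mismatch; it makes no claim about how Hive itself interprets the escape.

```python
# Compare the single byte 0o376 (0xFE) with the actual bytes in the file.
sample = "col1\u00fecol2\u00fecol3"   # col1þcol2þcol3

single_byte = bytes([0o376])          # b'\xfe', what '\376' denotes as one byte
utf8_line = sample.encode("utf-8")    # þ becomes b'\xc3\xbe'
latin1_line = sample.encode("latin-1")  # þ becomes b'\xfe'

found_in_utf8 = single_byte in utf8_line      # False: 0xFE never occurs
found_in_latin1 = single_byte in latin1_line  # True

# Splitting only works when the delimiter bytes match the file's encoding.
utf8_fields = utf8_line.split("\u00fe".encode("utf-8"))
latin1_fields = latin1_line.split(single_byte)

print(found_in_utf8, found_in_latin1, utf8_fields, latin1_fields)
```

If this is what is happening, a one-byte delimiter setting cannot match a UTF-8 file, and either the data or the parsing approach has to change.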



On 1/10/13 6:23 AM, "Dean Wampler" <de...@thinkbiganalytics.com>
wrote:

>You have to use the octal representation, e.g., ^A is \001.


Re: unicode character as delimiter

Posted by Dean Wampler <de...@thinkbiganalytics.com>.
You have to use the octal representation, e.g., ^A is \001.
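
To make the octal notation concrete, here is a small Python sketch that formats a byte value as a three-digit octal escape. The function name `octal_escape` is hypothetical, and the sketch only covers single-byte delimiters (values 0 to 255).

```python
def octal_escape(byte_value):
    """Format one byte value (0-255) as a backslash plus three octal digits."""
    if not 0 <= byte_value <= 0xFF:
        raise ValueError("expected a single byte value between 0 and 255")
    return "\\{:03o}".format(byte_value)


ctrl_a = octal_escape(0x01)       # ^A
tab = octal_escape(ord("\t"))     # horizontal tab
high_byte = octal_escape(0xFE)    # Latin-1 byte for þ

print(ctrl_a, tab, high_byte)     # prints: \001 \011 \376
```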



-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330