Posted to user@hive.apache.org by Raj Hadoop <ha...@yahoo.com> on 2013/07/08 22:52:05 UTC
Special characters in web log file causing issues
Hi,
The log file that I am trying to load through Hive has some special characters.
The field is shown below, and the special characters ¿¿ are also shown.
Shockwave Flash;Chrome Remote Desktop Viewer;Native Client;Chrome PDF Viewer;Adobe Acrobat;Microsoft Office 2010;Motive Plug-
in;Motive Management Plug-in;Google Update;Java(TM) Platform SE 7 U21;McAfee SiteAdvisor;McAfee Virtual Technician;Windows Live¿¿ Photo Gallery;McAfee SecurityCenter;Silverlig
The above is causing the record to be terminated early, with the rest loaded as another line. How can I avoid this type of issue and load the data properly? Any suggestions, please.
Thanks,
Raj
Re: Special characters in web log file causing issues
Posted by Nitin Pawar <ni...@gmail.com>.
Yes Raj,
that's a Unix command (tr).
On Tue, Jul 9, 2013 at 6:48 AM, Hadoop Raj <ha...@yahoo.com> wrote:
> Hi Sanjay,
>
> Is that a Unix trap command or something else? Please let me know.
--
Nitin Pawar
Re: Special characters in web log file causing issues
Posted by Hadoop Raj <ha...@yahoo.com>.
Hi Sanjay,
Is that a Unix trap command or something else? Please let me know.
On Jul 8, 2013, at 7:46 PM, Sanjay Subramanian <Sa...@wizecommerce.com> wrote:
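> You may have to remove the non-printable characters first, save an intermediate file, and then load that file into Hive:
>
> tr -cd '[:print:]\r\n\t' < input_file > intermediate_file
>
> (input_file and intermediate_file are placeholder names.) Or, if you have the strings utility, it outputs only printable characters.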
Re: Special characters in web log file causing issues
Posted by Sanjay Subramanian <Sa...@wizecommerce.com>.
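You may have to remove the non-printable characters first, save an intermediate file, and then load that file into Hive:
tr -cd '[:print:]\r\n\t' < input_file > intermediate_file
(input_file and intermediate_file are placeholder names.) Or, if you have the strings utility, it outputs only printable characters.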
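A minimal sketch of the cleanup step described above, written as two small shell helpers. The function names (count_stripped, clean_log), the file names, and the table name in the usage comment are all hypothetical and not from the thread. Setting LC_ALL=C is an assumption that the log should be treated as raw bytes, so that [:print:] means printable ASCII only.

```shell
# Hypothetical helper: count the bytes that the cleanup would remove,
# i.e. everything that is NOT printable ASCII, CR, LF, or tab.
count_stripped() {
    LC_ALL=C tr -d '[:print:]\r\n\t' < "$1" | wc -c | tr -d ' '
}

# Hypothetical helper: write a cleaned copy of the log ($1) to an
# intermediate file ($2), keeping only printable ASCII, CR, LF, and tab.
clean_log() {
    LC_ALL=C tr -cd '[:print:]\r\n\t' < "$1" > "$2"
}

# Usage (file and table names are placeholders):
#   count_stripped weblog_raw.txt
#   clean_log weblog_raw.txt weblog_clean.txt
#   hive -e "LOAD DATA LOCAL INPATH 'weblog_clean.txt' INTO TABLE weblogs;"
```

Running count_stripped first shows how many bytes the cleanup will drop, which is a quick check that the special characters are in fact what tr considers non-printable. As for the strings alternative: strings typically prints each run of printable characters on its own line, so it would probably split a record at exactly the bytes that cause the trouble, whereas tr leaves the line in one piece.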
CONFIDENTIALITY NOTICE
======================
This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.