Posted to user@hadoop.apache.org by Manish <ma...@rocketmail.com> on 2012/09/30 14:07:30 UTC

RE: zip file or tar file consumption

> I am getting the error below when loading a zip file:
> 
> 
> Driver returned: 9.  Errors: Hive history file=/tmp/hue/hive_job_log_hue_201209300434_1768401171.txt
> Loading data to table default.pageview_zip
> Failed with exception Error moving: hdfs://localhost:54310/user/manish/input/zip/11sep12.zip into: /user/manish/input/zip
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
> 
> My load statement is: LOAD DATA INPATH '/user/manish/input/11sep12.zip' OVERWRITE INTO TABLE `pageview_zip`
> 
> Table definition: 
> CREATE external TABLE pageview_zip
> (
> C_0 STRING,
> C_1 STRING,
> C_7 MAP<STRING,STRING>,
> C_8 STRING,
> C_13 MAP<STRING,STRING>,
> C_21 STRING
> )
> COMMENT 'Page View'
> ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' COLLECTION ITEMS TERMINATED BY ';' MAP KEYS TERMINATED BY '=' 
> STORED AS TEXTFILE LOCATION '/user/manish/input/zip'
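To make the delimiters in the table definition above concrete, here is a minimal Python sketch of how one text line would be split under them: fields on a space, collection items on ';', and map keys on '='. The sample row and the `parse_row` name are made up for illustration; this is not Hive's own parsing code, and it assumes a simplified line with exactly six columns.

```python
def parse_row(line):
    """Split one text line the way the DDL's delimiters describe (simplified)."""
    fields = line.rstrip("\n").split(" ")   # FIELDS TERMINATED BY ' '

    def parse_map(raw):
        # COLLECTION ITEMS TERMINATED BY ';', MAP KEYS TERMINATED BY '='
        result = {}
        for item in raw.split(";"):
            if not item:
                continue
            key, _, value = item.partition("=")
            result[key] = value
        return result

    return {
        "C_0": fields[0],
        "C_1": fields[1],
        "C_7": parse_map(fields[2]),
        "C_8": fields[3],
        "C_13": parse_map(fields[4]),
        "C_21": fields[5],
    }

# Hypothetical sample row, not taken from the real page view data.
row = parse_row("2012-09-11 home ref=google;lang=en GET ua=firefox;os=linux 200\n")
print(row["C_7"])
```

A line only splits this way once it is plain text, which is why the compressed archive itself cannot be read as rows.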
> 
> Thank You,
> Manish
> 
> 
> 
> On Thu, 2012-09-27 at 11:11 +0000, Savant, Keshav wrote: 
> 
> > True Manish.
> > 
> >  
> > 
> > Keshav C Savant 
> > 
> > 
> >  
> > 
> > From: Manish.Bhoge [mailto:Manish.Bhoge@target.com] 
> > Sent: Thursday, September 27, 2012 4:26 PM
> > To: user@hive.apache.org; manishbhoge@rocketmail.com
> > Subject: RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > Thanks, Savant. I believe this will also hold for .zip files.
> > 
> >  
> > 
> > Thank You,
> > 
> > Manish.
> > 
> >  
> > 
> > From: Savant, Keshav [mailto:Keshav.C.Savant@fisglobal.com] 
> > Sent: Thursday, September 27, 2012 10:19 AM
> > To: user@hive.apache.org; manishbhoge@rocketmail.com
> > Subject: RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > Manish, the table created for zipped text files should be defined
> > as a sequence file, for example:
> > 
> >  
> > 
> > CREATE TABLE my_table_zip(col1 STRING,col2 STRING) ROW FORMAT
> > DELIMITED FIELDS TERMINATED BY ',' stored as sequencefile;
> > 
> >  
> > 
> > After this, you can use the regular load command to load these
> > files, for example:
> > 
> >  
> > 
> > load data local inpath 'path-to-csv-file.gz' into table
> > my_table_zip;
> > 
> >  
> > 
> > hope this helps
> > 
> >  
> > 
> > Keshav C Savant 
> > 
> > 
> >  
> > 
> > From: Manish Bhoge [mailto:manishbhoge@rocketmail.com] 
> > Sent: Wednesday, September 26, 2012 9:43 PM
> > To: user@hive.apache.org
> > Subject: Re: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > Hi Richin,
> > 
> > Thanks! Yes, this is what I wanted to understand: how to load a zip
> > file into a Hive table. I'll try this option now.
> > 
> > Thank You,
> > Manish. 
> > 
> > Sent from my BlackBerry, pls excuse typo
> > 
> > 
> > 
> >                                   
> > ____________________________________________________________________
> > 
> > From:<ri...@nokia.com> 
> > 
> > 
> > Date:Wed, 26 Sep 2012 14:51:39 +0000
> > 
> > 
> > To:<us...@hive.apache.org>
> > 
> > 
> > ReplyTo:user@hive.apache.org 
> > 
> > 
> > Subject:RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > 
> > You are right Chuck. I thought his question was how to use zip files
> > or any compressed files in Hive tables.
> > 
> >  
> > 
> > Yeah, seems like you can’t do that; see:
> > http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%3CCAENxBwxkF--3PzCkpz1HX21=Gb9YVASr2JL0U3yUL2tfGu010Q@mail.gmail.com%3E
> > 
> > But you can always compress your files in gzip format and they
> > should be good to go.
> > 
> >  
> > 
> > Richin
> > 
> >  
> > 
> > From: ext Connell, Chuck [mailto:Chuck.Connell@nuance.com] 
> > Sent: Wednesday, September 26, 2012 10:44 AM
> > To: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > But TEXTFILE in Hive always has newline as the record delimiter. How
> > could this possibly work with a zip/tar file that can contain ASCII
> > 10 characters at random locations, and certainly does not have ASCII
> > 10 at the end of each data record?
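The point above can be checked with a small Python sketch. It builds a tiny zip archive in memory (the member name `part1.txt` and the three records are made up), then compares what a reader that splits on newlines would get from the raw archive bytes with what it gets after unpacking. It only illustrates the mismatch and does not reproduce how Hive reads files.

```python
import io
import zipfile

records = [b"a,1", b"b,2", b"c,3"]
text = b"\n".join(records) + b"\n"

# Build a small zip archive in memory holding one text file.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("part1.txt", text)
raw = buf.getvalue()

# What a newline-splitting reader would see if handed the archive bytes.
naive_lines = raw.split(b"\n")

# What it sees once the archive has actually been unpacked.
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    unpacked_lines = zf.read("part1.txt").splitlines()

print(raw[:4])                     # archive starts with the zip header signature
print(naive_lines == records)      # the raw bytes do not split into the records
print(unpacked_lines == records)   # the unpacked text does
```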
> > 
> >  
> > 
> > Chuck Connell
> > 
> > Nuance R&D Data Team
> > 
> > Burlington, MA
> > 
> >  
> > 
> >  
> > 
> > 
> > From:richin.jain@nokia.com [mailto:richin.jain@nokia.com] 
> > Sent: Wednesday, September 26, 2012 10:14 AM
> > To: user@hive.apache.org; manishbhoge@rocketmail.com
> > Subject: RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > Hi Manish,
> > 
> >  
> > 
> > If your zip file is at /home/manish/zipfile, you can just point your
> > external table to that location, like this:
> > 
> > CREATE EXTERNAL TABLE manish_test (field1 string, field2 string) ROW
> > FORMAT DELIMITED FIELDS TERMINATED BY <your_column_delimiter> STORED
> > AS TEXTFILE LOCATION '/home/manish/zipfile';
> > 
> >  
> > 
> > OR
> > 
> >  
> > 
> > If you already have an external table pointing to a certain
> > location, you can load this zip file into your table with:
> > 
> > LOAD DATA INPATH '/home/manish/zipfile' INTO TABLE manish_test;
> > 
> >  
> > 
> > Hope this helps.
> > 
> >  
> > 
> > Richin
> > 
> >  
> > 
> > From: ext Manish Bhoge [mailto:manishbhoge@rocketmail.com] 
> > Sent: Wednesday, September 26, 2012 9:13 AM
> > To: user@hive.apache.org
> > Subject: Re: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > Hi Savant,
> > 
> > Got it. But I still need to understand how to load a zip file. Can I
> > use a zip file directly in an external table? Could you please help
> > me with the load statement?
> > 
> > Sent from my BlackBerry, pls excuse typo
> > 
> > 
> > 
> >                                   
> > ____________________________________________________________________
> > 
> > From:"Savant, Keshav" <Ke...@fisglobal.com>
> > 
> > 
> > Date:Wed, 26 Sep 2012 12:25:38 +0000
> > 
> > 
> > To:user@hive.apache.org<us...@hive.apache.org>
> > 
> > 
> > ReplyTo:user@hive.apache.org
> > 
> > 
> > Cc:Manish.Bhoge@target.com<Ma...@target.com>;
> > Chuck.Connell@nuance.com<Ch...@nuance.com>
> > 
> > 
> > Subject:RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > 
> > Another solution would be
> > 
> >  
> > 
> > Using a shell script, do the following:
> > 
> > 1.      unzip the txt files,
> > 
> > 2.      merge those 50 (or N) text files, one by one, into
> > one text file,
> > 
> > 3.      then zip/tar that bigger text file,
> > 
> > 4.      then upload that big zip/tar file into Hive.
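The steps above could be sketched roughly as follows. This is a Python sketch rather than the shell script described, and for the final compression step it writes gzip instead of zip/tar, following the gzip suggestion made elsewhere in this thread. The function name `zip_to_gzip`, the file names, and the sample contents are all hypothetical.

```python
import gzip
import os
import tempfile
import zipfile

def zip_to_gzip(zip_path, gz_path):
    """Concatenate every .txt member of zip_path into one gzip-compressed text file."""
    with zipfile.ZipFile(zip_path) as zf, gzip.open(gz_path, "wb") as out:
        for name in sorted(zf.namelist()):
            if not name.endswith(".txt"):
                continue
            data = zf.read(name)   # reads the whole member into memory
            out.write(data)
            # Make sure the last record of each member ends with a newline
            # so records from adjacent files do not run together.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")

# Small self-contained demo with made-up file names and contents.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "pages.zip")
gz_path = os.path.join(workdir, "pages.txt.gz")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("a.txt", "r1 x\nr2 y\n")
    zf.writestr("b.txt", "r3 z")   # no trailing newline on purpose

zip_to_gzip(zip_path, gz_path)
with gzip.open(gz_path, "rt") as fh:
    merged = fh.read()
print(merged)
```

The resulting .gz file is the kind of input the load command quoted earlier in the thread expects; whether it loads cleanly still depends on the table's storage format.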
> > 
> >  
> > 
> > Keshav C Savant 
> > 
> > 
> >  
> > 
> > From: Connell, Chuck [mailto:Chuck.Connell@nuance.com] 
> > Sent: Wednesday, September 26, 2012 4:04 PM
> > To: user@hive.apache.org
> > Subject: RE: zip file or tar file cosumption
> > 
> > 
> >  
> > 
> > This could be a problem. Hive uses newline as the record separator.
> > A ZIP file will certainly contain newline characters. So I doubt this
> > is possible.
> > 
> > BUT, I would like to hear from anyone who has solved the "newline is
> > always a record separator" problem, because we ran into it for
> > another type of compressed file.
> > 
> > Chuck
> > 
> > 
> >                                   
> > ____________________________________________________________________
> > 
> > From: Manish.Bhoge [Manish.Bhoge@target.com]
> > Sent: Wednesday, September 26, 2012 3:17 AM
> > To: user@hive.apache.org
> > Subject: zip file or tar file cosumption
> > 
> > 
> > Hivers,
> > 
> >  
> > 
> > I want to understand whether it is possible to use zip/tar files
> > directly in Hive. All the files have a similar schema (structure).
> > Say 50 *.txt files are zipped into a single zip file: can we load
> > data directly from this zip file, or do we need to unzip it first?
> > 
> >  
> > 
> > Thanks & Regards
> > 
> > Manish Bhoge | Technical Architect | TargetDW/BI | +919379850010 (M)
> > Ext: 5691 | VOIP: 22165 | “Excellence is not a skill, It is an
> > attitude.”
> > 
> >  
> > 
> > 
> > _____________
> > The information contained in this message is proprietary and/or
> > confidential. If you are not the intended recipient, please: (i)
> > delete the message and all copies; (ii) do not disclose, distribute
> > or use the message in any manner; and (iii) notify the sender
> > immediately. In addition, please be aware that any message addressed
> > to our domain is subject to archiving and review by persons other
> > than the intended recipient. Thank you.
> > 
> > 
> > 
> 
> 
>