Posted to user@sqoop.apache.org by Artem Ervits <ar...@nyp.org> on 2012/11/15 20:17:34 UTC

Append to file

Hello community,


1.       I'd like to know what people do when they need to run a daily Sqoop import, incremental or not, and append the data to an existing file. One possible solution I can think of is Hadoop archive (HAR) files. Does Sqoop support appending to an existing file, or is HAR the way to go in my scenario?

2.       My other question: when I run sqoop import with --as-avrodatafile in conjunction with --compress, the result is returned with an .avro extension. Without compression I can cat the file; how can I view the file when it's compressed? I have three files with the .avro extension. I tried to run gunzip -d on them, but it won't recognize the codec.

Thank you.

Artem Ervits
Data Analyst
New York Presbyterian Hospital



--------------------

This electronic message is intended to be for the use only of the named recipient, and may contain information that is confidential or privileged.  If you are not the intended recipient, you are hereby notified that any disclosure, copying, distribution or use of the contents of this message is strictly prohibited.  If you have received this message in error or are not the named recipient, please notify us immediately by contacting the sender at the electronic mail address noted above, and delete and destroy all copies of this message.  Thank you.





Re: Append to file

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Artem,
thank you for your questions; let me comment on them:

1) To the best of my knowledge, HDFS does not support appending to existing files, so your daily imports will need to create new, additional files. However, since all other parts of the framework were designed to work with this, it should not be an issue. Would you mind sharing your use case? Why do you need to append to already existing files? Also, Sqoop does not support Hadoop archives (HAR) natively; however, you can do several imports and then convert them to HAR manually.
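
One way to picture "new additional files" for each daily run is to give every import its own dated target directory. The sketch below is only an illustration, not something from this thread: the JDBC URL, table name, base directory and the helper name daily_import_command are hypothetical. It relies only on the standard sqoop import arguments --connect, --table and --target-dir, and it builds the command line without running it.

```python
import datetime
import shlex


def daily_import_command(connect_url, table, base_dir, day):
    """Build the argument list for one day's import (hypothetical helper)."""
    # Each day gets its own directory, so no existing file is ever modified.
    target_dir = f"{base_dir}/{table}/{day.isoformat()}"
    return [
        "sqoop", "import",
        "--connect", connect_url,
        "--table", table,
        "--target-dir", target_dir,
    ]


# Hypothetical connection string, table and base directory.
cmd = daily_import_command(
    "jdbc:mysql://db.example.com/sales",
    "orders",
    "/data/imports",
    datetime.date(2012, 11, 15),
)

# Print the command for inspection instead of executing it.
print(shlex.join(cmd))
```

Downstream jobs can then take the parent directory (here /data/imports/orders) as input and pick up every day's files together.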

2) The --as-avrodatafile parameter creates Avro container files, which are not text files, even when you do not use the --compress argument. You therefore can't read them directly; you need to use the Avro libraries to get at the data. I recommend the official Avro pages for more information [1]. If you need simple text files, I recommend removing the --as-avrodatafile parameter.
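
To illustrate why gunzip does not recognize these files: in the Avro object container format, compression is applied to the data blocks inside the file, and the codec name is recorded in the file header rather than wrapping the whole file. The following is a minimal sketch using only the Python standard library that reads that header and reports the codec and schema. It is not a replacement for the Avro libraries, it does not decode any records, and the function names are made up for this example.

```python
import json


def read_long(stream):
    """Decode one Avro long: a zigzag-encoded variable-length integer."""
    shift = 0
    accum = 0
    while True:
        raw = stream.read(1)
        if not raw:
            raise EOFError("unexpected end of Avro header")
        byte = raw[0]
        accum |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    # Undo zigzag encoding to recover the signed value.
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Decode Avro bytes/string: a long length followed by that many bytes."""
    length = read_long(stream)
    return stream.read(length)


def read_avro_header(stream):
    """Return (codec, schema) from the header of an Avro container file."""
    if stream.read(4) != b"Obj\x01":
        raise ValueError("not an Avro container file")
    meta = {}
    # The metadata is an Avro map: blocks of key/value pairs, ended by a 0 count.
    while True:
        count = read_long(stream)
        if count == 0:
            break
        if count < 0:
            read_long(stream)  # block size in bytes; not needed here
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    # A missing avro.codec entry means the blocks are uncompressed ("null").
    codec = meta.get("avro.codec", b"null").decode("utf-8")
    schema = json.loads(meta["avro.schema"].decode("utf-8"))
    return codec, schema
```

To use it, open one of the imported files in binary mode, for example open("part-m-00000.avro", "rb") (a hypothetical file name), and pass the file object to read_avro_header. Reading the actual records still requires an Avro library.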

Jarcec

Links:
1: http://avro.apache.org/

On Thu, Nov 15, 2012 at 07:17:34PM +0000, Artem Ervits wrote:
> Hello community,
> 
> 
> 1.       I'd like to know what people do when they need to run a daily Sqoop import, incremental or not, and append the data to an existing file. One possible solution I can think of is Hadoop archive (HAR) files. Does Sqoop support appending to an existing file, or is HAR the way to go in my scenario?
> 
> 2.       My other question: when I run sqoop import with --as-avrodatafile in conjunction with --compress, the result is returned with an .avro extension. Without compression I can cat the file; how can I view the file when it's compressed? I have three files with the .avro extension. I tried to run gunzip -d on them, but it won't recognize the codec.
> 
> Thank you.
> 
> Artem Ervits
> Data Analyst
> New York Presbyterian Hospital
> 
> 
> 