Posted to user@hive.apache.org by Raj Hadoop <ha...@yahoo.com> on 2013/11/13 16:13:50 UTC
Compression for a HDFS text file - Hive External Partition Table
Hi,
1) My requirement is to load a file (a tar.gz archive that contains multiple tab-separated-values files; the main file holds huge data, about 10 GB per day) into an externally partitioned Hive table.
2) I have automated the process: I extract the tar.gz archive, take the main data file (the 10 GB text file), and load it into HDFS as a text file.
3) I want to compress the files. What is the procedure for that?
4) Do I need to use any utility to compress the main data file before loading it to HDFS? And do I also need to define a custom InputFormat for the HDFS file through a Java program?
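For context, a minimal sketch of the compression step described above, assuming plain gzip (file names and the partition path are hypothetical). Hive can read gzip-compressed text files in an external table's directory without a custom InputFormat, since the codec is detected from the .gz extension; the trade-off is that gzip files are not splittable, so each file is processed by a single mapper. The `hadoop fs` upload is shown commented out because it needs a running cluster:

```shell
# Hypothetical sample standing in for the extracted main data file.
printf 'row1\tval1\nrow2\tval2\n' > main_data.txt

# Compress with gzip, keeping the original for verification.
gzip -c main_data.txt > main_data.txt.gz

# Verify the compressed copy round-trips to the original content.
zcat main_data.txt.gz | diff - main_data.txt && echo "OK"

# Upload into the external table's partition directory (requires a cluster):
# hadoop fs -put main_data.txt.gz /warehouse/mytable/ds=2013-11-13/
```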
Regards,
Raj