Posted to dev@hive.apache.org by Ning Zhang <nz...@fb.com> on 2011/07/01 00:46:41 UTC

Re: Small file problem and GenMRFileSink1

If you are using Hive trunk and your table is stored in RCFile format, you can run:

alter table src_rc_merge_test concatenate;
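
For tables that are not stored as RCFile, or on a build that lacks CONCATENATE, a commonly used alternative is to rewrite the table onto itself with Hive's output merging turned on. As far as I understand, these settings are what trigger the merge stage that GenMRFileSink1 adds to a query plan, so no direct call into that class is needed. The following is only a sketch: src_small_files is a hypothetical table name, and the size values are illustrative, not recommendations.

```sql
-- Ask Hive to append a merge stage after map-only and map-reduce jobs.
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;

-- Approximate target size of each merged file, in bytes (illustrative value).
set hive.merge.size.per.task=256000000;
-- Merge only when the average output file size is below this, in bytes (illustrative value).
set hive.merge.smallfiles.avgsize=16000000;

-- Rewrite the table onto itself so the merge stage runs over its files.
-- src_small_files is a hypothetical table name.
insert overwrite table src_small_files
select * from src_small_files;
```

Because this overwrites the table's data, it is worth trying on a copy first.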


On Jun 30, 2011, at 9:53 AM, David Ginzburg wrote:

> 
> 
> Hi,
> I'm not sure whether this belongs on hive-dev or hive-user.
> I have a folder with many small files.
> I would like to reduce the number of files the way Hive merges its output.
> I tried to understand, from the source of org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1, how to use the API to submit a job
> that merges output files.
> I think I have identified
> private void createMergeJob(FileSinkOperator fsOp, GenMRProcContext ctx, String finalName)
> throws SemanticException
> as the entry point to the logic that performs the operation, but I could not find documentation on how to use it.
> 
> Is there an example that simulates the use of this API call?
> 