Posted to dev@hive.apache.org by Ning Zhang <nz...@fb.com> on 2011/07/01 00:46:41 UTC
Re: Small file problem and GenMRFileSink1
If you are using Hive trunk and your table is stored in RCFile format, you can run:
alter table src_rc_merge_test concatenate;
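For tables that are not in RCFile format, one possible alternative (a sketch only, not tested against your setup) is to rewrite the data with Hive's merge settings turned on. As far as I understand, this triggers the same merge step that GenMRFileSink1 plans at the end of a query, so there should be no need to call createMergeJob directly. The table names small_files_src and small_files_merged are hypothetical, and the size thresholds are illustrative values rather than recommendations.

```sql
-- Enable the post-query merge step for map-only and map-reduce jobs.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- Illustrative thresholds, in bytes: the target size of merged files, and the
-- average output file size below which the merge step is triggered.
SET hive.merge.size.per.task=256000000;
SET hive.merge.smallfiles.avgsize=16000000;

-- Hypothetical tables: small_files_merged is assumed to already exist
-- with the same schema as small_files_src.
INSERT OVERWRITE TABLE small_files_merged
SELECT * FROM small_files_src;
```

Whether the merge actually runs depends on the average size of the files the query writes, so it is worth checking the file count in the target directory afterwards.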
On Jun 30, 2011, at 9:53 AM, David Ginzburg wrote:
>
>
> Hi,
> I'm not sure whether this belongs on hive-dev or hive-user.
> I have a folder with many small files.
> I would like to reduce the number of files the way Hive merges its output.
> I tried to understand from the source of org.apache.hadoop.hive.ql.optimizer.GenMRFileSink1 how to leverage the API to submit a job
> that merges output files.
> I think I was able to identify:
> private void createMergeJob(FileSinkOperator fsOp, GenMRProcContext ctx, String finalName)
> throws SemanticException
> as the entry point to the logic that performs the operation, but I did not find documentation on how to use it.
>
> Is there an example that simulates the use of this API call?