Posted to user@hive.apache.org by Shouguo Li <th...@gmail.com> on 2011/11/09 00:19:57 UTC

Re: split into less files

I think that has to do with your configured block size. Check your
value for dfs.block.size in /hdfs-site.xml.
Just curious, though: why would the number of files matter for your use case?
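
For reference, a minimal sketch of how the property named above usually appears in the HDFS configuration file. The value is in bytes, and the 128 MB figure here is only an illustrative assumption, not a recommendation and not taken from this thread.

```xml
<!-- hdfs-site.xml fragment (sketch). The value is illustrative: 134217728 bytes = 128 MB. -->
<property>
  <name>dfs.block.size</name>
  <value>134217728</value>
</property>
```

If the property is absent from the file, the cluster is running with the Hadoop default for that version.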


On Fri, Oct 21, 2011 at 1:18 AM, Vikas Srivastava <
vikas.srivastava@one97.net> wrote:

> Hey All,
>
>
> I have a table with a single partition, and that partition holds around
> 100 files of 200 MB each. When I overwrite this into another table, it
> makes 100 files of 20 MB (compressed). What I want is for it to make
> only 1, 2, or 10 files of 100 MB or 200 MB.
>
>
> In other words, after the overwrite there should be fewer files than in
> the non-compressed table.
>
>
>
>
> --
> With Regards
> Vikas Srivastava
>
> DWH & Analytics Team
> Mob:+91 9560885900
> One97 | Let's get talking !
>
>

Re: split into less files

Posted by Matt Tucker <ma...@gmail.com>.
It sounds like you want to look at setting hive.merge.mapredfiles to true in your hive-site.xml.

Just be aware that it will likely add another map step to your jobs to consolidate the files.

Matt Tucker
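
A minimal sketch of what the suggestion above could look like in hive-site.xml. Only hive.merge.mapredfiles=true comes from this thread. The other three properties are related Hive merge settings added here as assumptions about what is relevant, and their values are illustrative, so check the defaults for your Hive version before relying on them.

```xml
<!-- hive-site.xml fragment (sketch, not a tested configuration) -->

<!-- From the thread: merge small output files at the end of a map-reduce job. -->
<property>
  <name>hive.merge.mapredfiles</name>
  <value>true</value>
</property>

<!-- Assumption: also merge at the end of map-only jobs, such as a plain INSERT OVERWRITE ... SELECT. -->
<property>
  <name>hive.merge.mapfiles</name>
  <value>true</value>
</property>

<!-- Assumption: target size of each merged file, in bytes (illustrative value, about 256 MB). -->
<property>
  <name>hive.merge.size.per.task</name>
  <value>256000000</value>
</property>

<!-- Assumption: the merge step runs only when the average output file size is below this
     threshold, in bytes (illustrative value, about 100 MB, so that 20 MB files qualify). -->
<property>
  <name>hive.merge.smallfiles.avgsize</name>
  <value>100000000</value>
</property>
```

The same properties can also be set for a single session with `set name=value;` before running the INSERT OVERWRITE, which avoids changing the behavior of every job on the cluster.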


