Posted to user@hive.apache.org by "Wu, James C." <Ja...@disney.com> on 2013/01/25 00:12:34 UTC

about hive limit optimization settings

Hi,

Does anyone know the meaning of these Hive settings? Their descriptions are not clear to me. If someone could give me an example of how they should be used, that would be great!

<property>
  <name>hive.limit.row.max.size</name>
  <value>100000</value>
  <description>When trying a smaller subset of data for simple LIMIT, how much size we need to guarantee
   each row to have at least.</description>
</property>

<property>
  <name>hive.limit.optimize.limit.file</name>
  <value>10</value>
  <description>When trying a smaller subset of data for simple LIMIT, maximum number of files we can
   sample.</description>
</property>

Regards,

James



Re: about hive limit optimization settings

Posted by Nitin Pawar <ni...@gmail.com>.
Hive has a data sampling feature that lets you read only a sample of a
table instead of the entire table.
I suppose these parameters apply to those queries.

You can read more at
https://cwiki.apache.org/Hive/languagemanual-sampling.html




-- 
Nitin Pawar

Re: about hive limit optimization settings

Posted by Abdelrhman Shettia <as...@hortonworks.com>.
Hi James. 

Suppose we have a table called a that is mapped to the directory /data/a in Hive, where n is the number of files under /data/a and s is the size of each row. Consider the query:

hive -e "select * from a limit 10"

For this query to return its result quickly, hive.limit.optimize.limit.file should be less than n (the number of files under /data/a); in this case it is 10.

hive.limit.row.max.size should be set to s (the row size), which may vary according to the actual data.
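
As a hedged illustration of how these settings might be used together: to my understanding, neither property has any effect unless hive.limit.optimize.enable (default false) is also set to true. That property is not mentioned in the original question and is included here as an assumption. With it enabled, Hive first tries to answer a simple LIMIT query from a subset of the input, sized at roughly (limit x hive.limit.row.max.size) bytes and drawn from at most hive.limit.optimize.limit.file files. The values below are simply the defaults quoted in the question, not tuning advice.

```xml
<!-- Sketch only: assumes the LIMIT optimization must be switched on explicitly. -->
<property>
  <name>hive.limit.optimize.enable</name>
  <value>true</value>
</property>

<!-- Assumed size in bytes of each row, used to estimate how much input to read. -->
<property>
  <name>hive.limit.row.max.size</name>
  <value>100000</value>
</property>

<!-- Maximum number of files to sample for a simple LIMIT query. -->
<property>
  <name>hive.limit.optimize.limit.file</name>
  <value>10</value>
</property>
```

With these values, the query "select * from a limit 10" would need an estimated 10 x 100000 = 1000000 bytes of input, taken from no more than 10 files under /data/a.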

Hope this helps. 





The hive.limit.row.max.size setting controls the size of each


Hortonworks, Inc.
Technical Support Engineer
Abdelrahman Shettia
ashettia@hortonworks.com
Office phone: (708) 689-9609

