Posted to user@hive.apache.org by shashwat shriparv <dw...@gmail.com> on 2012/10/29 16:20:53 UTC

Query is taking long time to process and return the result

I am trying to run Hive queries on a huge amount of data (almost half a
petabyte), and these queries run MapReduce jobs internally. It takes a very
long time to generate the result set (that is, for the MapReduce jobs to
complete). What optimization mechanisms for Hive and Hadoop can I use to make
these queries faster? One more important question: does the amount of disk
space available to MapReduce, or in the /tmp directory, matter for faster
MapReduce jobs?

-- 


∞
Shashwat Shriparv

Re: Query is taking long time to process and return the result

Posted by Dean Wampler <de...@thinkbiganalytics.com>.

It's impossible to answer such a vague, open-ended question without
specifics. What's the query, for example? How is the data organized (e.g.,
is it partitioned)? What are the cluster characteristics?

On Mon, Oct 29, 2012 at 10:20 AM, shashwat shriparv <
dwivedishashwat@gmail.com> wrote:

> I am trying to run hive query on huge amount of data(almost in half of
> petabyte), and these query running map reduce internally. it takes very
> long time to generate the data set(map reduce to complete) what
> optimization mechanism for hive and Hadoop i can use to make these query
> faster, one more important question i have does the amount of disk
> available for map reduce or in /tmp directory is important for faster map
> reduce?
>
> --
>
>
> ∞
> Shashwat Shriparv
>
>
>


-- 
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330