Posted to user@hive.apache.org by Charles Givre <cg...@apache.org> on 2020/02/11 01:54:37 UTC

Query Failures

Hello Everyone!
I recently joined a project that has a Hive/Impala installation, and we are
experiencing a significant number of query failures.  We are using an older
version of Hive, and unfortunately there's nothing I can do about that,
but I'm wondering how I can make Hive handle queries better to give our
users a better experience.

For example, I can execute a basic SELECT * query or SELECT <fields> query
without issues.

However, if I attempt to:
1.  Add filters
2.  Do a SELECT DISTINCT
3.  Perform basic aggregation

I get errors like this: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask.

Could someone point me to some good guides for querying Hive and/or
assisting my engineers in preventing these errors?
Thanks,

Re: Query Failures

Posted by David Mollitor <da...@gmail.com>.

https://community.cloudera.com/t5/Support-Questions/Map-and-Reduce-Error-Java-heap-space/td-p/45874

On Fri, Feb 14, 2020, 6:58 PM David Mollitor <da...@gmail.com> wrote:

> Hive has many optimizations.  One is that it reads the data directly from
> storage (HDFS) when the query is trivial.  For example:
>
> Select * from table limit 10;
>
> In natural language, this says "give me any ten rows (if available) from
> the table."  You don't need the overhead of launching a full MapReduce job
> for this; Hive can just read the rows from the file directly.
>
> Adding predicates to the query requires a MapReduce job to do the heavy
> lifting.  The error message you're getting is probably the result of a
> failed MapReduce job.  Nine times out of ten, the problem is that the
> mappers/reducers are not granted enough memory for their YARN containers.

Re: Query Failures

Posted by David Mollitor <da...@gmail.com>.

Hive has many optimizations.  One is that it reads the data directly from
storage (HDFS) when the query is trivial.  For example:

Select * from table limit 10;

In natural language, this says "give me any ten rows (if available) from
the table."  You don't need the overhead of launching a full MapReduce job
for this; Hive can just read the rows from the file directly.
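
One way to check which path a given query takes is to look at its plan.
The sketch below assumes a Hive version that has the
hive.fetch.task.conversion setting (older releases may lack it or default
to a more restrictive value), and it uses a hypothetical table web_logs
with a hypothetical column status; neither name comes from this thread.

```sql
-- Print the current fetch-conversion mode (none, minimal, or more,
-- depending on the Hive version and site configuration).
SET hive.fetch.task.conversion;

-- Trivial query: the plan is expected to contain only a fetch stage,
-- with no MapReduce stage.  web_logs is a placeholder table name.
EXPLAIN SELECT * FROM web_logs LIMIT 10;

-- Aggregation: with the MapReduce engine, the plan is expected to
-- include a Map Reduce stage.  status is a placeholder column name.
EXPLAIN SELECT status, COUNT(*) FROM web_logs GROUP BY status;
```

Comparing the two plans shows where the direct read stops and a full job
starts.  Depending on the version and this setting, some simple filters
may also be served by a fetch stage.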

Adding predicates to the query requires a MapReduce job to do the heavy
lifting.  The error message you're getting is probably the result of a
failed MapReduce job.  Nine times out of ten, the problem is that the
mappers/reducers are not granted enough memory for their YARN containers.
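
If memory does turn out to be the cause, a per-session experiment along
the following lines may help confirm it.  This is only a sketch: it
assumes MapReduce on YARN (MRv2) property names, the sizes are
illustrative rather than recommendations, the cluster's YARN maximum
allocation and any properties locked by the administrators still apply,
and web_logs and status are placeholder names.

```sql
-- Memory requested from YARN for each map and reduce container, in MB.
-- These values are illustrative only.
SET mapreduce.map.memory.mb=4096;
SET mapreduce.reduce.memory.mb=8192;

-- JVM heap inside each container.  It must be smaller than the container
-- itself; roughly 80% of the container size is a common rule of thumb.
SET mapreduce.map.java.opts=-Xmx3276m;
SET mapreduce.reduce.java.opts=-Xmx6553m;

-- Re-run a query that previously failed (placeholder table and column).
SELECT status, COUNT(*) FROM web_logs GROUP BY status;
```

The comments sit on their own lines so that nothing after a value is read
as part of that value.  If the query succeeds with the larger settings,
the same properties can be raised cluster-wide instead of per session.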

On Tue, Feb 11, 2020, 10:41 AM Pau Tallada <ta...@pic.es> wrote:

> Hi,
>
> Do you have more complete tracebacks?

Re: Query Failures

Posted by Pau Tallada <ta...@pic.es>.

Hi,

Do you have more complete tracebacks?

-- 
----------------------------------
Pau Tallada Crespí
Dep. d'Astrofísica i Cosmologia
Port d'Informació Científica (PIC)
Tel: +34 93 170 2729
----------------------------------