Posted to user@oozie.apache.org by Toby Evans <to...@gmail.com> on 2014/12/03 15:24:32 UTC

Oozie Hive action on AWS - unpredictable ip sources break the job

Hi there,

I've had a few days of unalloyed torture getting Hive jobs to run
via Oozie on a 5-machine AWS cluster. Even the simplest job that involves the
live metastore succeeds or fails unpredictably. The error message wasn't
too descriptive:

    Hive failed, error message[Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [1]]

After a lot of fun changing just about every imaginable setting, I studied
hivemetastore.log carefully (we have MySQL as the metastore) and realised
that every successful request came from 172.31.40.3. Unsuccessful requests
came from 172.31.40.2, 172.31.40.4 and 172.31.40.5. The Hive console app
makes requests without problems from 172.31.40.1.

This is getting somewhere after nearly a week of having no idea whatsoever what
was going on. The question now is: is there a config setting somewhere I need
to change to allow all requests from 172.31.40.1-5 in? Or could I funnel
Oozie requests solely through 172.31.40.1 or 172.31.40.3, without using 2/4/5?

Why would only 172.31.40.1 and 172.31.40.3 work? There must be some
process whereby jobs submitted to Oozie get handed over to Hive.
There are 5 machines in the cluster, which matches the pattern of IP
addresses, and it seems that Oozie jobs are
randomly allocated to a machine in the cluster, which then contacts Hive
and attempts to run the query. The problem seems to be that the requests
only work when they come from a specific machine.

All ideas and suggestions warmly received.

Many thanks,

Toby

Re: Oozie Hive action on AWS - unpredictable ip sources break the job

Posted by Mona Chitnis <ch...@yahoo-inc.com.INVALID>.
Hi Toby,
I'm not at all familiar with AWS and its configuration oddities, so it would help if you could share some details of your Oozie server deployment there. One thing worth checking is the 'hadoop.proxyuser.hosts' setting (in full, 'hadoop.proxyuser.<user>.hosts', where <user> is the account the Oozie server runs as) in the Hadoop configuration file 'core-site.xml'. When jobs are submitted by Oozie, the Oozie server's host must appear in this 'hosts' list so that Oozie can submit jobs as a proxy user on behalf of the end user. From your observations, it looks like only some of the cluster's IP addresses are listed there, not all of them.
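
As a rough sketch of what that setting might look like: the fragment below assumes the Oozie server runs as a user named "oozie" (an assumption; substitute whatever account is actually used) and lists the five addresses from your message. I can't say whether this is what is actually causing the failures, so treat it as something to compare against your existing core-site.xml rather than a confirmed fix.

```xml
<!-- core-site.xml (fragment). The user name "oozie" is an assumption. -->
<property>
  <!-- Hosts from which the "oozie" user may impersonate other users -->
  <name>hadoop.proxyuser.oozie.hosts</name>
  <value>172.31.40.1,172.31.40.2,172.31.40.3,172.31.40.4,172.31.40.5</value>
</property>
<property>
  <!-- Groups whose members the "oozie" user may impersonate; * means any -->
  <name>hadoop.proxyuser.oozie.groups</name>
  <value>*</value>
</property>
```

Changes to these properties normally only take effect once the affected Hadoop daemons have reloaded their configuration.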
--Mona