Posted to dev@atlas.apache.org by Russell Anderson <rg...@us.ibm.com> on 2016/12/26 15:29:06 UTC

Hive Hook - what is missing ?


Hi dev list,

I have built Atlas 0.7 rc2 and have the Hive Import and the Dashboard
working; in general, things appear to be working as expected.

However, I followed the instructions below and still cannot get the Hive
Hook to detect Hive table creations and add them to the metadata /
lineage.

Can anyone suggest how to debug what could be missing, given that I
followed the instructions below? Any and all help appreciated!

Regards,

Russ.

=============================================================================

Hive Hook


Hive supports listeners on Hive command execution through Hive hooks. Atlas
uses this to add, update, and remove entities, following the model defined in
org.apache.atlas.hive.model.HiveDataModelGenerator. The hook submits each
request to a thread pool executor so that it does not block command
execution. The worker thread sends the entities as messages to the
notification server, and the Atlas server reads these messages and registers
the entities. Follow these instructions in your Hive set-up to add the Hive
hook for Atlas:
      Set up the Atlas hook in hive-site.xml of your Hive configuration:
    <property>
      <name>hive.exec.post.hooks</name>
      <value>org.apache.atlas.hive.hook.HiveHook</value>
    </property>


    <property>
      <name>atlas.cluster.name</name>
      <value>primary</value>
    </property>


      Add 'export HIVE_AUX_JARS_PATH=<atlas package>/hook/hive' to
      hive-env.sh of your Hive configuration.

      Copy <atlas-conf>/atlas-application.properties to the Hive conf
      directory.
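
As a quick sanity check of the hive-site.xml step above, here is a minimal
Python sketch that parses a hive-site.xml document and reports whether the
two properties shown are present. The function names are hypothetical
helpers written for this note, not part of Atlas or Hive, and the sketch
assumes hive.exec.post.hooks holds a comma-separated list of class names.

```python
import xml.etree.ElementTree as ET

# Hook class name taken from the documentation excerpt above.
ATLAS_HOOK = "org.apache.atlas.hive.hook.HiveHook"


def hive_site_properties(xml_text):
    """Return {name: value} for every <property> in a hive-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def atlas_hook_problems(props):
    """List human-readable problems with the Atlas hook settings, if any."""
    problems = []
    # Assumption: the value is a comma-separated list of hook class names.
    hooks = [h.strip() for h in props.get("hive.exec.post.hooks", "").split(",")]
    if ATLAS_HOOK not in hooks:
        problems.append("hive.exec.post.hooks does not list " + ATLAS_HOOK)
    if not props.get("atlas.cluster.name"):
        problems.append("atlas.cluster.name is not set")
    return problems


sample = """<configuration>
  <property>
    <name>hive.exec.post.hooks</name>
    <value>org.apache.atlas.hive.hook.HiveHook</value>
  </property>
  <property>
    <name>atlas.cluster.name</name>
    <value>primary</value>
  </property>
</configuration>"""

# An empty list means both settings from the excerpt were found.
print(atlas_hook_problems(hive_site_properties(sample)))
```

To check a real installation, read the contents of your own hive-site.xml
into a string and pass it to hive_site_properties() in place of the sample.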


The following properties in <atlas-conf>/atlas-application.properties
control the thread pool and notification details:
      atlas.hook.hive.synchronous - boolean; true runs the hook
      synchronously. Default: false. Leaving it false is recommended to
      avoid delays in Hive query completion.
      atlas.hook.hive.numRetries - number of retries on notification
      failure. Default: 3.
      atlas.hook.hive.minThreads - core number of threads. Default: 5.
      atlas.hook.hive.maxThreads - maximum number of threads. Default: 5.
      atlas.hook.hive.keepAliveTime - keep-alive time in milliseconds.
      Default: 10.
      atlas.hook.hive.queueSize - queue size for the thread pool.
      Default: 10000.
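
To see which values the hook would end up with for a given
atlas-application.properties, the defaults listed above can be overlaid with
whatever the file sets. The following is only a sketch: the helper name is
hypothetical, the defaults are copied from the excerpt above rather than
read from Atlas itself, and the parser handles plain key=value lines only
(not the full Java properties syntax).

```python
# Defaults as stated in the documentation excerpt above.
DEFAULTS = {
    "atlas.hook.hive.synchronous": "false",
    "atlas.hook.hive.numRetries": "3",
    "atlas.hook.hive.minThreads": "5",
    "atlas.hook.hive.maxThreads": "5",
    "atlas.hook.hive.keepAliveTime": "10",
    "atlas.hook.hive.queueSize": "10000",
}


def effective_hook_settings(properties_text):
    """Overlay key=value lines from a properties file onto the documented defaults.

    Simplification: comments (# or !) and blank lines are skipped, and only
    the '=' separator is recognised.
    """
    settings = dict(DEFAULTS)
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in settings:
            settings[key] = value.strip()
    return settings


sample = """
# run the hook in the query thread while debugging
atlas.hook.hive.synchronous=true
atlas.hook.hive.numRetries = 5
"""

for name, value in sorted(effective_hook_settings(sample).items()):
    print(name, "=", value)
```

Setting atlas.hook.hive.synchronous to true, as in the sample, trades query
latency for simpler debugging, which is why the excerpt recommends false for
normal use.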

Re: Hive Hook - Create table xxx AS - Hive Hook BUG

Posted by Russell Anderson <rg...@us.ibm.com>.
The following SQL causes the Hive Hook to fail with the error below:

create table bigsql.missy999 as select * from bigsql.brancha;

Question: Is this a new severe bug in Atlas 0.7 rc2, or has it already been
fixed in 0.8 but not in 0.7 rc2?


Excerpt from /var/log/hive/hiveserver2.log
==========================================

2016-12-29 11:52:10,787 INFO  bridge.HiveMetaStoreBridge (HiveMetaStoreBridge.java:createOrUpdateDBInstance(162)) - Importing objects from databaseName : bigsql
2016-12-29 11:52:10,787 INFO  metastore.HiveMetaStore (HiveMetaStore.java:logInfo(746)) - 6: get_table : db=bigsql tbl=missy999
2016-12-29 11:52:10,787 INFO  HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(371)) - ugi=admin   ip=unknown-ip-addr      cmd=get_table : db=bigsql tbl=missy999
2016-12-29 11:52:10,918 ERROR metadata.Hive (Hive.java:getTable(1119)) - Table missy999 not found: bigsql.missy999 table not found
2016-12-29 11:52:10,919 ERROR hook.HiveHook (HiveHook.java:run(184)) - Atlas hook failed due to error
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found missy999
        at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1120)
        at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1090)
        at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:503)
        at org.apache.atlas.hive.hook.HiveHook.createOrUpdateEntities(HiveHook.java:524)
        at org.apache.atlas.hive.hook.HiveHook.processHiveEntity(HiveHook.java:605)
        at org.apache.atlas.hive.hook.HiveHook.registerProcess(HiveHook.java:585)
        at org.apache.atlas.hive.hook.HiveHook.fireAndForget(HiveHook.java:223)
        at org.apache.atlas.hive.hook.HiveHook.access$200(HiveHook.java:78)
        at org.apache.atlas.hive.hook.HiveHook$2.run(HiveHook.java:182)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
2016-12-29 11:52:11,221 INFO  log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=serializePlan start=1483041130037 end=1483041131221 duration=1184 from=org.apache.hadoop.hive.ql.exec.Utilities>
2016-12-29 11:52:11,348 ERROR mr.ExecDriver (ExecDriver.java:execute(400)) - yarn
2016-12-29 11:52:13,528 INFO  impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(296)) - Timeline service address: http://biginsights.ibm.com:8188/ws/v1/timeline/
2016-12-29 11:52:13,641 INFO  client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at biginsights.ibm.com/192.168.1.132:8050
2016-12-29 11:52:14,269 INFO  fs.FSStatsPublisher (FSStatsPublisher.java:init(49)) - created : hdfs://biginsights.ibm.com:8020/apps/hive/warehouse/bigsql.db/.hive-staging_hive_2016-12-29_11-52-08_283_558364161490028907-1/-ext-10002
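
For anyone hunting for the same failure in a large hiveserver2.log, the
following Python sketch pulls out only the ERROR records logged by
hook.HiveHook, together with the exception text and stack frames that follow
them. It assumes the log layout seen in this excerpt (timestamp, level,
logger name at the start of each record); the function name is a
hypothetical helper, not an Atlas or Hive tool.

```python
import re

# Start of a record in the layout seen above:
# "<date> <time>,<millis> <LEVEL> <logger> ..."
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+(\S+) "
)


def hook_errors(log_text, logger="hook.HiveHook"):
    """Return (timestamp, text) for each ERROR record written by `logger`."""
    records = []
    current = None
    for line in log_text.splitlines():
        match = RECORD_START.match(line)
        if match:
            # A timestamped line always opens a new record.
            current = None
            if match.group(2) == "ERROR" and match.group(3) == logger:
                current = [match.group(1), [line]]
                records.append(current)
        elif current is not None:
            # Exception message and stack frames belong to the open record.
            current[1].append(line)
    return [(timestamp, "\n".join(lines)) for timestamp, lines in records]


sample = """\
2016-12-29 11:52:10,918 ERROR metadata.Hive (Hive.java:getTable(1119)) - Table missy999 not found: bigsql.missy999 table not found
2016-12-29 11:52:10,919 ERROR hook.HiveHook (HiveHook.java:run(184)) - Atlas hook failed due to error
org.apache.hadoop.hive.ql.metadata.InvalidTableException: Table not found missy999
        at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1120)
2016-12-29 11:52:11,221 INFO  log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=serializePlan start=1483041130037 end=1483041131221 duration=1184 from=org.apache.hadoop.hive.ql.exec.Utilities>
"""

for timestamp, text in hook_errors(sample):
    print(timestamp)
    print(text)
```

Passing a different logger name, for example "metadata.Hive", selects the
errors from that component instead.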


Russell G. Anderson
 Senior Technical Consultant





Re: Hive Hook - Create table xxx AS - Hive Hook BUG

Posted by Russell Anderson <rg...@us.ibm.com>.
Hi dev list,

After analyzing the changes made as part of ATLAS-1364, it appears that the
two issues may be related, although no one has responded to my email here.

I have applied the changes described in that patch and will rebuild and
re-test.

If anyone believes that there is yet another problem, please respond.

Regards,

Russ.





Re: Hive Hook - what is missing ? [ RESOLVED ]

Posted by Hemanth Yamijala <yh...@gmail.com>.
No worries. Glad it helped.


Re: Hive Hook - what is missing ? [ RESOLVED ]

Posted by Russell Anderson <rg...@us.ibm.com>.
Hi Hemanth Yamijala,

The magic of your thoughts !!!

So I did as you suggested, and then I noticed a few things that hadn't
seemed important at the time:

1) In my Hadoop configuration there were multiple entries for the PRE and
FAILURE Hive hooks. These had been left at their defaults, so I changed
them to match the one recommended in the documentation.

2) I re-copied the libraries I had built to where my AUX_PATH was pointing.

3) I changed the permissions to match those of the other jars in the
hive/lib directory, which were owned by 'root'.

4) I then carefully examined /var/log/hive/hiveserver2.log for Hive Hook
messages.
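
Steps 2 and 3 above can be checked mechanically. The Python sketch below
compares the owner of each jar in the hook directory with the most common
owner of the jars in a reference directory such as hive/lib. It is only an
illustration: the function name is hypothetical, it compares owner uids
only (not mode bits or group), and the directory paths are whatever your own
installation uses.

```python
from collections import Counter
from pathlib import Path


def jar_owner_mismatches(hook_dir, reference_dir):
    """Return hook jars whose owner differs from the usual owner in reference_dir.

    The "usual" owner is the uid that owns the most *.jar files in
    reference_dir. An empty reference directory yields no mismatches.
    """
    ref_uids = Counter(p.stat().st_uid for p in Path(reference_dir).glob("*.jar"))
    if not ref_uids:
        return []
    expected_uid = ref_uids.most_common(1)[0][0]
    return sorted(
        str(p)
        for p in Path(hook_dir).glob("*.jar")
        if p.stat().st_uid != expected_uid
    )


# Example call (paths are placeholders for your own installation):
#   jar_owner_mismatches("<atlas package>/hook/hive", "<hive home>/lib")
```

An empty result means every hook jar has the same owner as the bulk of the
reference jars; any path returned is a candidate for the ownership fix
described in step 3.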

I now have the Hive Hook working as desired !!!

Thank you for getting me to look and think about things....!!!

Russ.






Re: Hive Hook - what is missing ?

Posted by Hemanth Yamijala <yh...@gmail.com>.
Hi,

A few questions to help debug:

1) Are you using Hive CLI or Beeline? For best results, configure these
hooks with HiveServer2 and use Beeline.

2) Could you please check the Hive logs and look for AtlasHook-related
messages?

Thanks
Hemanth
