Posted to issues@nifi.apache.org by "Ryan Persaud (JIRA)" <ji...@apache.org> on 2017/03/27 07:29:41 UTC

[jira] [Comment Edited] (NIFI-3625) Add JSON support to PutHiveStreaming

    [ https://issues.apache.org/jira/browse/NIFI-3625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942751#comment-15942751 ] 

Ryan Persaud edited comment on NIFI-3625 at 3/27/17 7:29 AM:
-------------------------------------------------------------

I was experimenting with streaming some JSON data into a partitioned table in an HDP 2.5 sandbox tonight, and I encountered an exception (below). I built from master (552148e9e7d45be4d298ee48afd7471405a5bfad) and tested with the 'old' PutHiveStreaming processor, and I got the same error. From what I can tell, the error occurs whenever partition columns are specified in the PutHiveStreaming processor.

On a hunch, I reverted HiveUtils and HiveWriter back to the versions from 8/4/2016 (3943d72e95ff7b18c32d12020d34f134f4e86125) and hacked them up a bit to work with the newer versions of PutHiveStreaming and TestPutHiveStreaming. With those in place I was able to successfully stream into a table.

Has anyone else encountered these issues since NIFI-3574 and NIFI-3530 were resolved? Any thoughts on how to proceed?

2017-03-26 23:50:31,460 ERROR [Timer-Driven Process Thread-6] hive.log Got exception: java.lang.NullPointerException null
java.lang.NullPointerException: null
        at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.getFilteredObjects(AuthorizationMetaStoreFilterHook.java:77) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook.filterDatabases(AuthorizationMetaStoreFilterHook.java:54) ~[hive-exec-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:1046) ~[hive-metastore-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hive.hcatalog.common.HiveClientCache$CacheableHiveMetaStoreClient.isOpen(HiveClientCache.java:367) [hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_121]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_121]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_121]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_121]
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:155) [hive-metastore-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at com.sun.proxy.$Proxy130.isOpen(Unknown Source) [na:na]
        at org.apache.hive.hcatalog.common.HiveClientCache.get(HiveClientCache.java:205) [hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hive.hcatalog.common.HCatUtil.getHiveMetastoreClient(HCatUtil.java:558) [hive-hcatalog-core-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hive.hcatalog.streaming.AbstractRecordWriter.<init>(AbstractRecordWriter.java:94) [hive-hcatalog-streaming-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:82) [hive-hcatalog-streaming-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.hive.hcatalog.streaming.StrictJsonWriter.<init>(StrictJsonWriter.java:60) [hive-hcatalog-streaming-1.2.1000.2.5.0.0-1245.jar:1.2.1000.2.5.0.0-1245]
        at org.apache.nifi.util.hive.HiveWriter.getRecordWriter(HiveWriter.java:84) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.util.hive.HiveWriter.<init>(HiveWriter.java:71) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.util.hive.HiveUtils.makeHiveWriter(HiveUtils.java:46) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.processors.hive.PutHiveStreaming.makeHiveWriter(PutHiveStreaming.java:1011) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.processors.hive.PutHiveStreaming.getOrCreateWriter(PutHiveStreaming.java:922) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.processors.hive.PutHiveStreaming.writeToHive(PutHiveStreaming.java:405) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.processors.hive.PutHiveStreaming.lambda$processJSON$5(PutHiveStreaming.java:702) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2120) ~[na:na]
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2090) ~[na:na]
        at org.apache.nifi.processors.hive.PutHiveStreaming.processJSON(PutHiveStreaming.java:669) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.processors.hive.PutHiveStreaming.onTrigger(PutHiveStreaming.java:803) [nifi-hive-processors-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1099) ~[na:na]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:144) ~[na:na]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) ~[na:na]
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) ~[na:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) ~[na:1.8.0_121]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) ~[na:1.8.0_121]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) ~[na:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_121]
        at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
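(For context, the call path in the trace corresponds roughly to the following Hive Streaming API usage. This is a minimal illustrative sketch, not the actual PutHiveStreaming code: it assumes the hive-hcatalog-streaming 1.2 API, a hypothetical metastore URI, and a hypothetical partitioned table `acid_json` partitioned by year/month. The no-HiveConf `StrictJsonWriter(endPoint)` constructor shown here is the one at StrictJsonWriter.java:60 in the trace, which builds its own HiveConf internally before the metastore-client call that NPEs.)

```java
import java.util.Arrays;

import org.apache.hive.hcatalog.streaming.HiveEndPoint;
import org.apache.hive.hcatalog.streaming.StreamingConnection;
import org.apache.hive.hcatalog.streaming.StrictJsonWriter;
import org.apache.hive.hcatalog.streaming.TransactionBatch;

public class JsonStreamingSketch {
    public static void main(String[] args) throws Exception {
        // Endpoint for a *partitioned* table: the partitionVals list is the
        // case that triggers the NPE above (metastore URI and table are
        // placeholders for illustration).
        HiveEndPoint endPoint = new HiveEndPoint(
                "thrift://sandbox.hortonworks.com:9083",
                "default", "acid_json",
                Arrays.asList("2017", "03"));

        // true = create the partition if it does not exist yet.
        StreamingConnection conn = endPoint.newConnection(true);

        // Constructing the writer opens a metastore client; in the trace this
        // is where AuthorizationMetaStoreFilterHook.getFilteredObjects NPEs.
        StrictJsonWriter writer = new StrictJsonWriter(endPoint);

        TransactionBatch txnBatch = conn.fetchTransactionBatch(10, writer);
        try {
            txnBatch.beginNextTransaction();
            txnBatch.write("{\"id\":1,\"msg\":\"hello\"}".getBytes("UTF-8"));
            txnBatch.commit();
        } finally {
            txnBatch.close();
            conn.close();
        }
    }
}
```

(Requires a running Hive metastore and the hive-hcatalog-streaming jars on the classpath, so it is only a sketch of the failing path, not a standalone reproducer.)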



> Add JSON support to PutHiveStreaming
> ------------------------------------
>
>                 Key: NIFI-3625
>                 URL: https://issues.apache.org/jira/browse/NIFI-3625
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.2.0
>            Reporter: Ryan Persaud
>             Fix For: 1.2.0
>
>
> As noted in a Hortonworks Community Connection post (https://community.hortonworks.com/questions/88424/nifi-puthivestreaming-requires-avro.html), PutHiveStreaming does not currently support JSON flow file content.  I've completed the code to allow JSON flow files to be streamed into Hive, and I'm currently working on test cases and updated documentation.  I should have a PR to submit this week.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)