Posted to user@storm.apache.org by Sunit Swain <su...@gmail.com> on 2015/03/18 05:38:29 UTC

Re: How to use storm's HiveBolt

I am using Storm 0.9.3 and trying to use the HiveBolt to stream data
directly into Hive tables.

I am following this example:
https://github.com/hkropp/storm-hive-streaming-example

Instead of a KafkaSpout, I am using a spout that generates random data, to
keep things simple.

When I submit the topology, the bolt thread is able to connect to my
Hive metastore but then fails. The first time, the error was:
"Non-local session path expected to be non-null"
and after that:
"Failed connecting to EndPoint hive"

I also made sure that the table already exists and is accessible, and I
tried both partitioned and unpartitioned tables.

Any idea what is going wrong?

Regards,
Sunit


Re: How to use storm's HiveBolt

Posted by Sunit Swain <su...@gmail.com>.
I am using Cloudera CDH 5.3, which ships with Hive 0.13.


Re: How to use storm's HiveBolt

Posted by Harsha <st...@harsha.io>.
Hi Sunit,
     Which version of Hive are you using? Hive streaming is supported from 0.13 onwards.
Here is the doc for enabling Hive streaming; check the streaming requirements:
https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest

After that you can refer to the doc here:
https://github.com/apache/storm/blob/master/external/storm-hive/README.md
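
As a rough aid for the "streaming requirements" step, here is a minimal Python sketch that reads a hive-site.xml style document and flags three settings which, to the best of my recollection, the Streaming Data Ingest page lists as prerequisites (hive.txn.manager, hive.compactor.initiator.on, hive.compactor.worker.threads). Please verify the list against the wiki page for your Hive version. The function names and the sample XML are made up for illustration, and table-level requirements (such as the storage format and bucketing) are not checked here.

```python
import xml.etree.ElementTree as ET


def load_hadoop_conf(xml_text):
    """Parse Hadoop-style <property><name>/<value> XML into a dict."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }


def streaming_config_problems(conf):
    """Return a list of problems; an empty list means all three checks pass."""
    problems = []
    # Assumed prerequisite values; confirm them on the wiki page.
    expected_mgr = "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager"
    if conf.get("hive.txn.manager") != expected_mgr:
        problems.append("hive.txn.manager should be " + expected_mgr)
    if conf.get("hive.compactor.initiator.on", "").lower() != "true":
        problems.append("hive.compactor.initiator.on should be true")
    try:
        workers = int(conf.get("hive.compactor.worker.threads", "0"))
    except ValueError:
        workers = 0
    if workers <= 0:
        problems.append("hive.compactor.worker.threads should be greater than 0")
    return problems


# Hypothetical hive-site.xml content with the worker-thread setting missing.
sample = """<configuration>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
</configuration>"""

for problem in streaming_config_problems(load_hadoop_conf(sample)):
    print(problem)
```

With the sample above, only the worker-thread setting is reported. In practice you would pass the text of your own hive-site.xml instead of the inline sample.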


-- 
Harsha

Re: How to use storm's HiveBolt

Posted by "Grant Overby (groverby)" <gr...@cisco.com>.
If Storm and Hive run as different users, check your permissions in HDFS. The storm user needs permission to write to your table's directory (this is likely under /apps/hive/warehouse). Similarly, the hive user needs to be able to access and modify the files created by the storm user. A permissions mismatch in either direction can cause the errors you are seeing.

You may need to set superusers and/or umasks in the HDFS config to help with this. You may also need to set the umask for the local file system or assign users to groups.

You may also need to set permissions on /tmp/hive on the local file systems.
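
To illustrate why the umask and group membership matter, here is a small Python sketch of a POSIX-style permission check of the kind HDFS applies (owner bits, then group bits, then other bits). It is only a model under stated assumptions: the user names "storm" and "hive", the shared group "hadoop", and the helper names are hypothetical, and superuser bypass is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    owner: str
    group: str
    mode: int  # permission bits, e.g. 0o755


def created_mode(requested, umask):
    """Mode a new file or directory ends up with once the umask is applied."""
    return requested & ~umask


def can_write(user, user_groups, entry):
    """Exactly one class (owner, group or other) is consulted for a user."""
    if user == entry.owner:
        return bool(entry.mode & 0o200)
    if entry.group in user_groups:
        return bool(entry.mode & 0o020)
    return bool(entry.mode & 0o002)


# Hypothetical setup: the hive user belongs to a shared "hadoop" group.
hive_groups = {"hadoop"}

for umask in (0o022, 0o002):
    # A directory the storm user creates under the table location.
    new_dir = Entry("storm", "hadoop", created_mode(0o777, umask))
    print(f"umask {umask:03o}: dir mode {new_dir.mode:03o}, "
          f"hive can write: {can_write('hive', hive_groups, new_dir)}")
```

Under these assumptions, a umask of 022 leaves the directory at 755, so the hive user cannot modify what the storm user created, while a umask of 002 together with a shared group gives 775 and allows it.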

