Posted to user@hive.apache.org by Guy Doulberg <Gu...@conduit.com> on 2011/05/31 14:31:09 UTC

RDBS and Hive

Hey friends,

I have been using Hive to explore data in HDFS for a while now.

In my organization we keep some application data in SQL Server that I occasionally need to join with data I have in HDFS.

When I need to do such a join, I import the data from SQL Server using Sqoop.

I would like to know whether Hive can import the data on demand, driven by the query. For example, if I select from a table that Hive recognizes as a SQL Server table, Hive would import that table from SQL Server (or update it if it already exists) and then run the select.


Thanks,
Guy Doulberg
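
For readers following along, the manual workflow described in the question (import with Sqoop first, then join in Hive) might look roughly like the sketch below. It only assembles a Sqoop 1 command line and does not run it. The JDBC URL, user name, and table names are hypothetical placeholders and do not come from this thread.

```python
import shlex


def build_sqoop_import(jdbc_url, username, source_table, hive_table):
    """Return the argv list for a Sqoop 1 import that loads one table into Hive."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,       # JDBC URL of the source database
        "--username", username,
        "-P",                        # ask for the password interactively
        "--table", source_table,     # table to read from the source database
        "--hive-import",             # load the imported data into Hive
        "--hive-overwrite",          # replace the Hive table's existing data
        "--hive-table", hive_table,  # name of the target Hive table
    ]


# Hypothetical connection details, for illustration only.
cmd = build_sqoop_import(
    "jdbc:sqlserver://dbhost:1433;databaseName=appdb",
    "hive_reader",
    "customers",
    "customers_import",
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(cmd))
```

Once such an import finishes, the resulting Hive table can be joined with HDFS-backed tables like any other. The drawback raised in the question is that this step has to be repeated by hand whenever the SQL Server data changes.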


RE: RDBS and Hive

Posted by Guy Doulberg <Gu...@conduit.com>.
Thanks Andrew,

I would really like to try the patch...




From: Andrew Wilson [mailto:awilson@conductor.com]
Sent: Tuesday, May 31, 2011 6:22 PM
To: user@hive.apache.org
Subject: Re: RDBS and Hive

Yes, here is the ticket: https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.streams.streams-jira-plugin%3Aactivity-stream-issue-tab#issue-tabs

We use it fairly regularly in production and it works well. I will try to get a patch uploaded soon.


Re: RDBS and Hive

Posted by Andrew Wilson <aw...@conductor.com>.
Yes, here is the ticket: https://issues.apache.org/jira/browse/HIVE-1555?page=com.atlassian.streams.streams-jira-plugin%3Aactivity-stream-issue-tab#issue-tabs

We use it fairly regularly in production and it works well. I will try to get a patch uploaded soon.
On May 31, 2011, at 11:02 AM, Edward Capriolo wrote:

There is a ticket open for a JDBC storage handler. I envision this as a sqoop on demand.

Edward


Re: RDBS and Hive

Posted by Edward Capriolo <ed...@gmail.com>.
There is a ticket open for a JDBC storage handler. I envision this as
Sqoop on demand.

Edward

On Tue, May 31, 2011 at 8:31 AM, Guy Doulberg <Gu...@conduit.com> wrote:

> Hey friends,
>
> I have been using Hive to explore my HDFS for a while now.
>
> In my organization we have some applicative information in SqlServer that
> occasionally I need to join with data I have in the HDFS.
>
> When I need to do such a join, I import the data from the SqlServer using
> sqoop.
>
> I wanted to know, if there is a way in which hive will import the data on
> demand, according to the query, for example, if I am selecting from a table
> that hive recognizes as a SqlServer table, it will import that table
> or(update if already exists) from sqlserver, and later will run the select…
>
> Thanks,
> Guy Doulberg