Posted to user@hive.apache.org by "V.Senthil Kumar" <va...@yahoo.com> on 2011/05/03 00:41:28 UTC
HIVE Server multiple instances
Hello,
I have one instance of the Hive JDBC server running on port 10000. Can I run another
instance on a different port? Would it cause a concurrency issue on the
underlying data warehouse files? Please clarify.
Thanks,
V.Senthil Kumar
Re: HIVE Server multiple instances
Posted by "V.Senthil Kumar" <va...@yahoo.com>.
Thanks Paul. That is really useful information.
Re: HIVE Server multiple instances
Posted by Marcos Ortiz <ml...@uci.cu>.
On 5/4/2011 7:48 AM, Paul Ingles wrote:
> For future reference I've posted a little more about our setup here:
> http://oobaloo.co.uk/multiple-connections-with-hive
Wow, that's a good piece of information.
Thanks for sharing it.
--
Marcos Luís Ortíz Valmaseda
Software Engineer (Large-Scaled Distributed Systems)
University of Information Sciences,
La Habana, Cuba
Linux User # 418229
http://about.me/marcosortiz
Re: HIVE Server multiple instances
Posted by "V.Senthil Kumar" <va...@yahoo.com>.
This is great info. Thanks a lot for sharing :)
Re: HIVE Server multiple instances
Posted by Paul Ingles <pa...@oobaloo.co.uk>.
For future reference I've posted a little more about our setup here:
http://oobaloo.co.uk/multiple-connections-with-hive
Re: HIVE Server multiple instances
Posted by Paul Ingles <pa...@oobaloo.co.uk>.
Nothing specifically about our Hive setup, although some of us at Forward have blogged bits and pieces about Hive + Hadoop, and we have a few Hadoop/Hive-related libs on our GitHub account: https://github.com/forward.
I've blogged a few bits (http://www.oobaloo.co.uk/) as has one of my colleagues (http://blog.fingertap.org/post/1255463384/hive-thrift-client).
Another colleague also presented a little about our setup at a Hadoop meetup last summer (http://skillsmatter.com/podcast/home/hadoop-in-context-1591). The numbers Andy mentioned will be a little out of date, but the talk does include some screenshots of a few of the surrounding apps we built that connect to Hive and Hadoop (including a web-based Hive query tool and work queue).
I had a quick search through the mailing lists when we had connection problems, but I think most of it was discussed and resolved during a chat I had with Shevek from Karmasphere at a London pub following a Hadoop meetup :)
If you're interested, I've posted a gist (https://gist.github.com/953926) that contains our HAProxy config; clients connect to port 10000 and are balanced across ports 10001 to 10005 on 2 servers (so 10 backend servers in total).
I'd be happy to talk more about our experience; feel free to email me off-list if you'd like.
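For anyone trying to picture the layout described above, here is a minimal Python sketch of it. It is not the HAProxy config from the gist. The host names `hive-a` and `hive-b` are hypothetical; only the port range 10001 to 10005 and the count of 2 servers come from the message. Round-robin is assumed purely for illustration, since the message does not say which balancing algorithm is used.

```python
from itertools import cycle, islice

# Hypothetical host names; the message only says there are 2 servers.
HOSTS = ["hive-a", "hive-b"]
# Backend ports as described: 10001 through 10005 on each server.
PORTS = range(10001, 10006)

# Every (host, port) pair is one HiveServer instance behind the proxy.
BACKENDS = [(host, port) for host in HOSTS for port in PORTS]


def assign_round_robin(n_clients):
    """Return the backend each of n_clients would get under plain round-robin."""
    return list(islice(cycle(BACKENDS), n_clients))


if __name__ == "__main__":
    print(len(BACKENDS), "backends")
    for client_id, backend in enumerate(assign_round_robin(12)):
        print(client_id, backend)
```

Clients only ever see the single front-end port (10000); the list of backends stays a detail of the proxy, so instances can be added or restarted without touching client configuration.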
Re: HIVE Server multiple instances
Posted by Matthew Rathbone <ma...@foursquare.com>.
Hey Paul,
I'd be very interested in reading about your Hadoop/Hive setup. Do you have a blog post or anything describing it, or some of the issues you've had with Hive?
--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matthew@foursquare.com | @rathboma | 4sq
Re: HIVE Server multiple instances
Posted by Paul Ingles <pa...@oobaloo.co.uk>.
HiveServer does seem to support multiple connections, but I think it still has thread-safety problems (https://issues.apache.org/jira/browse/HIVE-80).
We (www.forward.co.uk) have certainly had instability problems with the Thrift server in the past, and we now run 5 or so instances behind the HAProxy load balancer (http://haproxy.1wt.eu/). Since we did that, it's been significantly better.
I think the JDBC server still operates using Thrift to connect to the HiveServer, so I would expect it to have similar problems (but I may have got that wrong :)
Re: HIVE Server multiple instances
Posted by "V.Senthil Kumar" <va...@yahoo.com>.
Thanks. That really helps and answers my question.
Re: HIVE Server multiple instances
Posted by Matthew Rathbone <ma...@foursquare.com>.
Even if it is single-threaded, it certainly seems to support multiple connections.
We run 5 workers, all connected at the same time and each executing a different query (with a different connection per worker).
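The worker pattern described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `FakeConnection` is a hypothetical stand-in that records queries instead of talking to Hive, and `run_workers` is an invented helper name. The one point it is meant to show is that each worker opens its own connection and never shares it with another thread.

```python
import queue
import threading


class FakeConnection:
    """Hypothetical stand-in for a Hive connection; it only records what it ran."""

    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)
        return f"worker {self.worker_id} ran: {sql}"


def run_workers(queries, n_workers=5):
    """Run queries on n_workers threads, one private connection per thread."""
    pending = queue.Queue()
    for sql in queries:
        pending.put(sql)

    results = []
    results_lock = threading.Lock()

    def worker(worker_id):
        conn = FakeConnection(worker_id)  # created here, never shared
        while True:
            try:
                sql = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            outcome = conn.execute(sql)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for line in run_workers([f"SELECT {i}" for i in range(8)]):
        print(line)
```

With a real client, the body of `worker` would open and close an actual connection; the order in which results come back is not deterministic.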
Hope that helps
Matthew
Re: HIVE Server multiple instances
Posted by "V.Senthil Kumar" <va...@yahoo.com>.
Thanks Matthew. The wiki page http://wiki.apache.org/hadoop/Hive/HiveServer says
it's single-threaded. I have a queue of queries that is added to dynamically all
the time. By the time I have run 1 query using 1 JDBC connection, more queries
have been added to the queue and a backlog builds up. That's why I was wondering
whether I can run two or more instances to avoid having a big backlog in the queue.
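To make the backlog concern concrete, here is a rough back-of-the-envelope sketch in Python. The rates are made-up illustrative numbers, not measurements, and `backlog_after` is a hypothetical helper. It also assumes that extra connections (or instances) really do execute queries in parallel, which is the open question here.

```python
def backlog_after(minutes, arrivals_per_min, served_per_min_per_conn, connections):
    """Queue length after `minutes`, starting from an empty queue.

    Assumes steady arrival and service rates, which real workloads rarely have.
    """
    net_growth = arrivals_per_min - served_per_min_per_conn * connections
    return max(0, net_growth * minutes)


if __name__ == "__main__":
    # Illustrative only: 3 queries arrive per minute, each connection finishes 1.
    for connections in (1, 2, 3):
        print(connections, "connection(s):", backlog_after(60, 3, 1, connections))
```

The backlog only stops growing once the combined service rate of all connections reaches the arrival rate.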
Re: HIVE Server multiple instances
Posted by Matthew Rathbone <ma...@foursquare.com>.
Why would you want to run two? I think it is multithreaded, so you can query it from two different connections.
--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matthew@foursquare.com | @rathboma | 4sq