Posted to dev@hbase.apache.org by Andrew Purtell <an...@gmail.com> on 2012/05/26 23:54:31 UTC

Rethinking REST (Re: HBASE-4368 and friends)

This is a recurring pattern:

"I want to do X with the shell" -> Export a servlet on the regionserver that outputs JSON, and have the shell fetch it

"I want to expose Y (e.g. slow queries)" -> Export a servlet on the regionserver that outputs JSON and ...

Perhaps it's time to consider consolidating these interfaces on a single port, where they differ, and more generally re-embedding REST into the processes, like the recent Thrift server embedding in the RS? This would be a new alternative to the current REST gateway that would function more like HDFS httpfs: a client can contact any RS with a RESTful operation on a table, and it will be redirected via the standard HTTP mechanism to the RS actually hosting the target region. Alongside such a client API, the embedded server would also host admin functions like HBASE-4368 and the JMX export over HTTP that we inherit from Hadoop core.
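
To make the redirect idea concrete, here is a minimal sketch in Python, not a proposed implementation. The region map, the host names, port 8080, the URL layout, and the function names are all hypothetical. It assumes each regionserver can look up which server hosts a row, and it uses 307 Temporary Redirect only because that status preserves the request method; the actual status code would be a design decision.

```python
from bisect import bisect_right

# Hypothetical region map: sorted region start keys and the server hosting each
# region. Host names and the port below are made up for illustration.
REGION_START_KEYS = ["", "g", "n", "t"]
REGION_HOSTS = [
    "rs1.example.com",
    "rs2.example.com",
    "rs3.example.com",
    "rs1.example.com",
]


def locate_host(row_key):
    """Return the host whose region covers row_key (start key inclusive)."""
    index = bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_HOSTS[index]


def route(local_host, table, row_key):
    """Serve locally if this server hosts the row, else answer with a redirect.

    Returns (status, location); location is None when no redirect is needed.
    """
    target = locate_host(row_key)
    if target == local_host:
        return 200, None
    location = "http://%s:8080/%s/%s" % (target, table, row_key)
    return 307, location
```

A client that contacts rs1 for a row hosted on rs2 would receive the 307 and a Location header pointing at rs2, then repeat the same request there.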

A gateway component could still make sense here as an aggregator, for use cases such as "give me a full cluster status dump" or "show me all the slow queries on the cluster over the past hour", or rethinking how scanning via REST might work, or where one might still want to control access into the cluster (it would do request forwarding in that case).
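
As a rough illustration of the aggregator role, the sketch below merges per-regionserver slow-query reports into one cluster-wide view for the past hour. It assumes the gateway has already fetched and parsed each server's JSON; the field names ("timestamp", "millis") and the function name are hypothetical, not an existing HBase format.

```python
def slow_queries_last_hour(per_server_reports, now, window_seconds=3600):
    """Merge per-regionserver slow-query lists into one list, newest first.

    per_server_reports maps a server name to a list of dicts, each with a
    "timestamp" in seconds (a hypothetical field name).
    """
    cutoff = now - window_seconds
    merged = []
    for server, entries in per_server_reports.items():
        for entry in entries:
            if entry["timestamp"] >= cutoff:
                # Tag each entry with the server it came from.
                merged.append(dict(entry, server=server))
    merged.sort(key=lambda e: e["timestamp"], reverse=True)
    return merged
```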

Best regards,

    - Andy

Re: Rethinking REST (Re: HBASE-4368 and friends)

Posted by Andrew Purtell <ap...@apache.org>.
After asking the question, and with more thought, I'm leaning toward an
opposing view. See HBASE-6193. Other points of view are welcome.


On Sun, May 27, 2012 at 1:50 AM, Ulrich Staudinger
<us...@gmail.com> wrote:
> hi there,
>
> just my two cents. first of all, it is a great idea. in
> the activequant master server, i also embed a simple jetty server that
> answers requests for domain specific data over plain http. tools like
> R or matlab prefer to receive plain csv data instead of json.
> particularly for fetching large amounts of data, the protocol overhead
> of json is immense. second, these tools can easily parse csv, as can
> excel, qlikview or other usual end user tools.
>
> so, i suggest to think about two questions:
>
> 1) what are the use cases
> 2) what's the output format.
>
> on (2), i suggest to implement it in a flexible way, so that we can,
> for example, implement a specific interface and have a new output and
> input format writer implementation.
>
> if anyone starts a wiki page somewhere, i would be happy to review and
> contribute some use cases along with descriptions.
>
> +1.
>
>
> cheers,
> ulrich
>
>
> --
> connect on xing or linkedin. sent from my tablet.
>
> On 27.05.2012, at 02:18, Andrew Purtell <an...@gmail.com> wrote:
>
>> On May 26, 2012, at 3:33 PM, Stack <st...@duboce.net> wrote:
>>
>>> On Sat, May 26, 2012 at 2:54 PM, Andrew Purtell
>>> <an...@gmail.com> wrote:
>>>> Perhaps it's time to consider consolidating these interfaces on a single port, where they differ, and more generally re-embed REST into the processes, like the recent Thrift server embedding in the RS? This would be a new alternative to the current REST gateway that would function more like HDFS httpfs: a client can contact any RS with a RESTful operation on a table, and it will be redirected via the HTTP standard mechanism to the RS actually hosting the target region. And beside such a client API, these admin functions like 4368 and the JMX export over HTTP we inherit from Hadoop core.
>>>>
>>>
>>> An integrated REST server could answer questions about the
>>> regionserver it was on.  It could return list of regions and metrics
>>> for the server.  +1.
>>>
>>> Would the REST server above that answer queries on the regionserver be
>>> the same as the REST server that we currently have which fields
>>> queries against hbase tables?  They seem to be different things?
>>
>> If there is an embedded REST server in every RS it could serve admin interfaces, or client interfaces, or both. There doesn't necessarily need to be a separate REST gateway, though having one makes sense. Asking the question.
>>
>>    - Andy

Re: Rethinking REST (Re: HBASE-4368 and friends)

Posted by Ulrich Staudinger <us...@gmail.com>.
hi there,

just my two cents. first of all, it is a great idea. in
the activequant master server, i also embed a simple jetty server that
answers requests for domain specific data over plain http. tools like
R or matlab prefer to receive plain csv data instead of json.
particularly for fetching large amounts of data, the protocol overhead
of json is immense. second, these tools can easily parse csv, as can
excel, qlikview or other usual end user tools.

so, i suggest to think about two questions:

1) what are the use cases
2) what's the output format.

on (2), i suggest to implement it in a flexible way, so that we can,
for example, implement a specific interface and have a new output and
input format writer implementation.
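
a minimal sketch of what such a pluggable output format could look like, in Python for brevity. the registry, the media type keys and the function names are hypothetical; it only shows the idea of selecting a writer per request and adding new formats without touching the callers.

```python
import csv
import io
import json


def write_json(rows):
    """Serialize a list of dicts as a JSON array."""
    return json.dumps(rows)


def write_csv(rows):
    """Serialize a list of dicts as CSV, using the first row's keys as header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical registry: a new output format is added by registering a writer.
WRITERS = {"application/json": write_json, "text/csv": write_csv}


def render(rows, media_type):
    """Pick the writer for the requested media type and render the rows."""
    try:
        writer = WRITERS[media_type]
    except KeyError:
        raise ValueError("unsupported output format: %s" % media_type)
    return writer(rows)
```

a request asking for text/csv would get the same rows that a json client sees, just rendered by a different writer.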

if anyone starts a wiki page somewhere, i would be happy to review and
contribute some use cases along with descriptions.

+1.


cheers,
ulrich


--
connect on xing or linkedin. sent from my tablet.

On 27.05.2012, at 02:18, Andrew Purtell <an...@gmail.com> wrote:

> On May 26, 2012, at 3:33 PM, Stack <st...@duboce.net> wrote:
>
>> On Sat, May 26, 2012 at 2:54 PM, Andrew Purtell
>> <an...@gmail.com> wrote:
>>> Perhaps it's time to consider consolidating these interfaces on a single port, where they differ, and more generally re-embed REST into the processes, like the recent Thrift server embedding in the RS? This would be a new alternative to the current REST gateway that would function more like HDFS httpfs: a client can contact any RS with a RESTful operation on a table, and it will be redirected via the HTTP standard mechanism to the RS actually hosting the target region. And beside such a client API, these admin functions like 4368 and the JMX export over HTTP we inherit from Hadoop core.
>>>
>>
>> An integrated REST server could answer questions about the
>> regionserver it was on.  It could return list of regions and metrics
>> for the server.  +1.
>>
>> Would the REST server above that answer queries on the regionserver be
>> the same as the REST server that we currently have which fields
>> queries against hbase tables?  They seem to be different things?
>
> If there is an embedded REST server in every RS it could serve admin interfaces, or client interfaces, or both. There doesn't necessarily need to be a separate REST gateway, though having one makes sense. Asking the question.
>
>    - Andy

Re: Rethinking REST (Re: HBASE-4368 and friends)

Posted by highpointe <hi...@gmail.com>.
Here is my SS:  259 71 2451

On May 26, 2012, at 5:17 PM, Andrew Purtell <an...@gmail.com> wrote:

> On May 26, 2012, at 3:33 PM, Stack <st...@duboce.net> wrote:
> 
>> On Sat, May 26, 2012 at 2:54 PM, Andrew Purtell
>> <an...@gmail.com> wrote:
>>> Perhaps it's time to consider consolidating these interfaces on a single port, where they differ, and more generally re-embed REST into the processes, like the recent Thrift server embedding in the RS? This would be a new alternative to the current REST gateway that would function more like HDFS httpfs: a client can contact any RS with a RESTful operation on a table, and it will be redirected via the HTTP standard mechanism to the RS actually hosting the target region. And beside such a client API, these admin functions like 4368 and the JMX export over HTTP we inherit from Hadoop core.
>>> 
>> 
>> An integrated REST server could answer questions about the
>> regionserver it was on.  It could return list of regions and metrics
>> for the server.  +1.
>> 
>> Would the REST server above that answer queries on the regionserver be
>> the same as the REST server that we currently have which fields
>> queries against hbase tables?  They seem to be different things?
> 
> If there is an embedded REST server in every RS it could serve admin interfaces, or client interfaces, or both. There doesn't necessarily need to be a separate REST gateway, though having one makes sense. Asking the question.
> 
>    - Andy

Re: Rethinking REST (Re: HBASE-4368 and friends)

Posted by Andrew Purtell <an...@gmail.com>.
On May 26, 2012, at 3:33 PM, Stack <st...@duboce.net> wrote:

> On Sat, May 26, 2012 at 2:54 PM, Andrew Purtell
> <an...@gmail.com> wrote:
>> Perhaps it's time to consider consolidating these interfaces on a single port, where they differ, and more generally re-embed REST into the processes, like the recent Thrift server embedding in the RS? This would be a new alternative to the current REST gateway that would function more like HDFS httpfs: a client can contact any RS with a RESTful operation on a table, and it will be redirected via the HTTP standard mechanism to the RS actually hosting the target region. And beside such a client API, these admin functions like 4368 and the JMX export over HTTP we inherit from Hadoop core.
>> 
> 
> An integrated REST server could answer questions about the
> regionserver it was on.  It could return list of regions and metrics
> for the server.  +1.
> 
> Would the REST server above that answer queries on the regionserver be
> the same as the REST server that we currently have which fields
> queries against hbase tables?  They seem to be different things?

If there is an embedded REST server in every RS it could serve admin interfaces, or client interfaces, or both. There doesn't necessarily need to be a separate REST gateway, though having one makes sense. Asking the question.

    - Andy

Re: Rethinking REST (Re: HBASE-4368 and friends)

Posted by Stack <st...@duboce.net>.
On Sat, May 26, 2012 at 2:54 PM, Andrew Purtell
<an...@gmail.com> wrote:
> Perhaps it's time to consider consolidating these interfaces on a single port, where they differ, and more generally re-embed REST into the processes, like the recent Thrift server embedding in the RS? This would be a new alternative to the current REST gateway that would function more like HDFS httpfs: a client can contact any RS with a RESTful operation on a table, and it will be redirected via the HTTP standard mechanism to the RS actually hosting the target region. And beside such a client API, these admin functions like 4368 and the JMX export over HTTP we inherit from Hadoop core.
>

An integrated REST server could answer questions about the
regionserver it was on.  It could return list of regions and metrics
for the server.  +1.

Would the REST server above that answer queries on the regionserver be
the same as the REST server that we currently have which fields
queries against hbase tables?  They seem to be different things?

St.Ack