Posted to dev@aurora.apache.org by Isaac Councill <is...@hioscar.com> on 2014/12/29 07:27:33 UTC

api vs apibeta

tl;dr;
apibeta seems way faster (and arguably better) than thrift api. What are
the long term objectives for apibeta?


Hi,

I've been working on some aurora integrations, primarily a blackbox
monitoring tool at present, and was looking for the best way to communicate
with the scheduler.

For a large read-only example, I wanted to dump the latest scheduler status
info for all our prod jobs, basically:

for all roles:
  for all jobs in role:
    get scheduler status
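
A minimal Go sketch of that loop, for illustration only. The `Scheduler` interface and its `Roles`, `Jobs`, and `Status` methods are hypothetical names, not part of any Aurora client, and `fakeScheduler` is an in-memory stand-in for a real scheduler connection.

```go
package main

import "fmt"

// Scheduler is a hypothetical abstraction over whichever transport is used
// (aurora2 CLI, thrift, or apibeta). None of these method names come from Aurora.
type Scheduler interface {
	Roles() ([]string, error)
	Jobs(role string) ([]string, error)
	Status(role, job string) (string, error)
}

// DumpStatuses walks every role and every job in that role, making one
// status call per job and collecting the results keyed by "role/job".
func DumpStatuses(s Scheduler) (map[string]string, error) {
	out := make(map[string]string)
	roles, err := s.Roles()
	if err != nil {
		return nil, err
	}
	for _, role := range roles {
		jobs, err := s.Jobs(role)
		if err != nil {
			return nil, err
		}
		for _, job := range jobs {
			st, err := s.Status(role, job)
			if err != nil {
				return nil, err
			}
			out[role+"/"+job] = st
		}
	}
	return out, nil
}

// fakeScheduler maps role -> job names and reports every job as RUNNING.
type fakeScheduler map[string][]string

func (f fakeScheduler) Roles() ([]string, error) {
	roles := make([]string, 0, len(f))
	for r := range f {
		roles = append(roles, r)
	}
	return roles, nil
}

func (f fakeScheduler) Jobs(role string) ([]string, error) { return f[role], nil }

func (f fakeScheduler) Status(role, job string) (string, error) { return "RUNNING", nil }

func main() {
	s := fakeScheduler{"www": {"frontend", "api"}, "batch": {"etl"}}
	statuses, err := DumpStatuses(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(statuses), "jobs queried")
}
```

The point of the shape is that the number of round trips grows with the number of jobs, so per-call overhead dominates the total time.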

We have about 120 prod jobs in aurora right now (growing fast). I
benchmarked 3 strategies against our prod cluster (mean of 5 tries each
from remote vpn, variance was small in each case):

1) aurora2 client: ./aurora2 job status cluster/<role>/prod > /dev/null
126.0sec

2) golang thrift API
584.3sec (I might be able to make a better task query, but still... this is
for only ~120 calls)

3) Pure json apibeta client in golang
13.4sec (again, might be able to optimize query strategy)
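
To put the three timings side by side, here is a small Go sketch of the arithmetic. The `Speedup` helper is a name made up for this example; the inputs are the means reported above.

```go
package main

import "fmt"

// Speedup returns how many times faster the run taking fastSec seconds is
// than the run taking slowSec seconds.
func Speedup(slowSec, fastSec float64) float64 {
	return slowSec / fastSec
}

func main() {
	// Mean wall-clock seconds reported for each strategy.
	const (
		aurora2CLI = 126.0
		goThrift   = 584.3
		apibeta    = 13.4
	)
	fmt.Printf("apibeta vs go thrift:   %.1fx\n", Speedup(goThrift, apibeta))
	fmt.Printf("apibeta vs aurora2 CLI: %.1fx\n", Speedup(aurora2CLI, apibeta))
}
```

With these numbers the apibeta client comes out roughly 43.6x faster than the go thrift client and roughly 9.4x faster than the aurora2 client.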

As a side note, getting the golang thrift client to work was a very painful
and illuminating experience.

I'm inclined to stick with apibeta. It's fast and the documentation is
great. If api changes become a concern, well after today I'd honestly
prefer rolling my own binding generator.

Are there plans for /apibeta wrt /api?

Re: api vs apibeta

Posted by Bill Farner <wf...@apache.org>.
I added it as a proof of concept, and as a temporary way to retrofit our
current API to make it more consumable.  Unfortunately, I don't think
there's any sane way for it to become supported, and we should instead
focus on a first-class REST-like API.  I realized that while we intended to
do this, we never created a ticket for it, so I've created
https://issues.apache.org/jira/browse/AURORA-987.

-=Bill

On Mon, Jan 5, 2015 at 10:45 AM, Zameer Manji <zm...@twopensource.com>
wrote:

> What is the history/origin of /apibeta anyways?

Re: api vs apibeta

Posted by Zameer Manji <zm...@twopensource.com>.
What is the history/origin of /apibeta anyways?

On Mon, Dec 29, 2014 at 1:32 PM, Isaac Councill <is...@hioscar.com> wrote:

> Ok, here's code for the pure json apibeta bench. I realized it was a bit
> unfair because I was making concurrent requests (max 3). Stripping out all
> concurrency, it's coming in at 17.2sec.
>
> drop in src/aurora/, no dependencies required.
>
> go build -o dist/aurora2 aurora/client2/bench
> dist/aurora2 -url="http://<aurora_master>:8081/apibeta"


-- 
Zameer Manji

Re: api vs apibeta

Posted by Isaac Councill <is...@hioscar.com>.
Ok, here's code for the pure json apibeta bench. I realized it was a bit
unfair because I was making concurrent requests (max 3). Stripping out all
concurrency, it's coming in at 17.2sec.

drop in src/aurora/, no dependencies required.

go build -o dist/aurora2 aurora/client2/bench
dist/aurora2 -url="http://<aurora_master>:8081/apibeta"
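
For anyone comparing the concurrent and sequential runs, this is a minimal Go sketch of capping in-flight requests at 3 with a buffered channel. It is not the benchmark code itself; `ForEachLimited` and the job names are hypothetical, and the callback is where a real client would issue its status request.

```go
package main

import (
	"fmt"
	"sync"
)

// ForEachLimited calls fn once per item, with at most limit calls running
// at the same time. A limit of 1 makes the calls strictly sequential.
func ForEachLimited(items []string, limit int, fn func(string)) {
	if limit < 1 {
		limit = 1 // guard: a zero-capacity semaphore would never admit work
	}
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already in flight
		go func(it string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when this call finishes
			fn(it)
		}(item)
	}
	wg.Wait()
}

func main() {
	jobs := []string{"a", "b", "c", "d", "e"}
	var mu sync.Mutex
	done := 0
	ForEachLimited(jobs, 3, func(job string) {
		// A real client would request the status for job here.
		mu.Lock()
		done++
		mu.Unlock()
	})
	fmt.Println(done, "jobs processed")
}
```

Passing 1 as the limit gives the no-concurrency case, which keeps the comparison with a sequential thrift client like for like.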


On Mon, Dec 29, 2014 at 3:36 PM, Isaac Councill <is...@hioscar.com> wrote:

> Here's source from go thrift (warning: very ugly). I had to make a few
> modifications to the ttypes and client libraries to get it working. It
> requires git.apache.org/thrift.git from the 0.9.2 tree (0.9.1 generates
> code that is much farther from correct).
>
> after putting it in src/aurora/client:
> go build -o dist/aurora aurora/client/thrift/read_only_scheduler-remote
>
> then:
> time dist/aurora -u="http://<aurora_master>:8081/api" -http=true -P=json doBench
>
> Perhaps my general unhappiness with thrift right now has to do with
> immature go support. Just take a look at that generated code, and it didn't
> actually work, at all, without some messing around.
>
> Re: Kevin's question about differences between getTaskStatus requests, the
> contents I was filling in are the same (role, environment, jobName), but
> inspecting network traffic, it did appear that the taskStatusRequest from
> go thrift is adding empty values for nil list fields, which could impact
> the results.
>
> As for the apibeta client, it will take me a little extra time to rip that
> out into a sharable form.

Re: api vs apibeta

Posted by Isaac Councill <is...@hioscar.com>.
Here's source from go thrift (warning: very ugly). I had to make a few
modifications to the ttypes and client libraries to get it working. It
requires git.apache.org/thrift.git from the 0.9.2 tree (0.9.1 generates
code that is much farther from correct).

after putting it in src/aurora/client:
go build -o dist/aurora aurora/client/thrift/read_only_scheduler-remote

then:
time dist/aurora -u="http://<aurora_master>:8081/api" -http=true -P=json doBench

Perhaps my general unhappiness with thrift right now has to do with
immature go support. Just take a look at that generated code, and it didn't
actually work, at all, without some messing around.

Re: Kevin's question about differences between getTaskStatus requests, the
contents I was filling in are the same (role, environment, jobName), but
inspecting network traffic, it did appear that the taskStatusRequest from
go thrift is adding empty values for nil list fields, which could impact
the results.
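
To illustrate why that distinction can matter on the wire, here is a small sketch using Go's standard `encoding/json` package rather than the generated thrift code. The `query` and `queryOmit` structs and their field names are made up for this example and do not mirror Aurora's TaskQuery. A nil list, an empty list, and an omitted field each serialize differently, so the server may not see the same request.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// query always emits the statuses field, even when the slice is nil.
type query struct {
	Role     string   `json:"role"`
	Statuses []string `json:"statuses"`
}

// queryOmit drops the statuses field when the slice is nil or empty.
type queryOmit struct {
	Role     string   `json:"role"`
	Statuses []string `json:"statuses,omitempty"`
}

// encode marshals v to a JSON string, panicking on error for brevity.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(query{Role: "www"}))                       // {"role":"www","statuses":null}
	fmt.Println(encode(query{Role: "www", Statuses: []string{}})) // {"role":"www","statuses":[]}
	fmt.Println(encode(queryOmit{Role: "www"}))                   // {"role":"www"}
}
```

Whether the scheduler treats an empty list as "match nothing" or "no filter" is not something I have verified; the sketch only shows that the three encodings differ.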

As for the apibeta client, it will take me a little extra time to rip that
out into a sharable form.

On Mon, Dec 29, 2014 at 2:17 PM, Jake Farrell <jf...@apache.org> wrote:

> Hey Isaac
> Would love to hear your pain points with Thrift. Also, can you share the
> source for your test clients?
>
> -Jake

Re: api vs apibeta

Posted by Jake Farrell <jf...@apache.org>.
Hey Isaac
Would love to hear your pain points with Thrift. Also, can you share the
source for your test clients?

-Jake

Re: api vs apibeta

Posted by Kevin Sweeney <ke...@apache.org>.
Interesting data, any chance you could share the source of these benchmarks
for others to reproduce? Can you confirm you used the getTaskStatus API
call with the same TaskQuery for both the JSON and the thrift client calls?
