You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by David Alves <da...@gmail.com> on 2013/04/19 05:18:16 UTC

update on dist exec work

Hi All

	I just submitted a pr (https://github.com/apache/incubator-drill/pull/16) agains the execwork that does the following:
	- fixed the bug in hazelcache where it would always try to start a new cluster (now checks if one exists first)
	- fixed the bug in zk coordinator where endpoints wouldn't get refreshed prior to a cluster change (why it was failing before)
	- added a DrillClient wrapper class that does the lookup through zookeeper
	- added a base sys test class that starts an embedded zk cluster and starts multiple drillbits
	- removed sleep() in Driillbit startup and separated Drillbit startup form the main() method.
	- added a toString() DrillConfig
	
	If you're working on dist exec please take the new DrillSystemTestBase for a spin (boots up a mini zk cluster and a Drillbit cluster), a good example is DrillClientSystemTest.
	It'd be cool if we could use that to have repeatable tests.
	Let me know is anything is a miss.

Best
David

Re: update on dist exec work

Posted by Jacques Nadeau <ja...@apache.org>.
> >    Otherwise we have to encode and decode a buffer of length zero.  If no
> >   buffer is passed, we just drop that field all together.  Also, I wasn't
> >   thinking that the second submitQuery() exists.  The goal of the buffer
> >   field is to be able to keep data off heap.  We probably shouldn't use
> it
> >   for for metadata (such as the query plan).
>         - Not sure what you're saying… Are you suggesting that the query
> itself should be kept in RunQuery
>

Basically yes.  I'd like to be strongly typed unless we're specifically
dealing with data we want to manage off heap.  As such, I was thinking that
only record batches should be passed via the open ended bytebuf interface.

I see that you've made the other updates.  I'll merge those into the
execwork wip branch.

Thanks for all your effort.

Jacques

Re: update on dist exec work

Posted by David Alves <da...@gmail.com>.
Hi Jacques

	Thanks for reviewing.
	I'll be addressing most of those today.
	Responses to your points inline.

Best
David

>   - DrillConfig: You import VideoTrackSizeListener.  I think this probably
>   just an accident
	- right will correct that

>   - DrillClient: You should pass null rather than an empty buffer.
	- ok, did that defensively, but if null works I'll change that

>    Otherwise we have to encode and decode a buffer of length zero.  If no
>   buffer is passed, we just drop that field all together.  Also, I wasn't
>   thinking that the second submitQuery() exists.  The goal of the buffer
>   field is to be able to keep data off heap.  We probably shouldn't use it
>   for for metadata (such as the query plan).
	- Not sure what you're saying… Are you suggesting that the query itself should be kept in RunQuery

>   - drill-module.conf: you've added a user.address.  What is this for?
	- In general it's useful to define an address to which we'll bind to in the case where there are multiple ifaces. I'm not using it for the moment though and I'll remove it

>   - In SystemSystemTestBase you go about updating the port numbers in the
>   config.  I was thinking that it would be preferred that the RPC system
>   simply started with the initial binding ports and then worked their way
>   north until they found an available one.  Thoughts?  That way it doesn't
>   take special configuration to run multiple Drillbits simultaneously (for
>   some reason).
	- makes sense, I'll catch bind exception increase the port number and keep trying.
> 
> Otherwise look great!  Let me know about the items above and then I'll
> merge into the WIP.  My goal is for us to get the WIP to a certain state
> then have a more set of substantial reviews before we commit to default.
	- +1 on that, there's a lot of work (besides impl) that needs to be done before we move it to default like:
	- formatting
	- comments
	- copyright
	- tests (a lot of tests)
	
	I agree that for now we should just move along as fast as we can to put something that works together.


> 
> J
> 
> On Thu, Apr 18, 2013 at 8:18 PM, David Alves <da...@gmail.com> wrote:
> 
>> Hi All
>> 
>>        I just submitted a pr (
>> https://github.com/apache/incubator-drill/pull/16) agains the execwork
>> that does the following:
>>        - fixed the bug in hazelcache where it would always try to start a
>> new cluster (now checks if one exists first)
>>        - fixed the bug in zk coordinator where endpoints wouldn't get
>> refreshed prior to a cluster change (why it was failing before)
>>        - added a DrillClient wrapper class that does the lookup through
>> zookeeper
>>        - added a base sys test class that starts an embedded zk cluster
>> and starts multiple drillbits
>>        - removed sleep() in Driillbit startup and separated Drillbit
>> startup form the main() method.
>>        - added a toString() DrillConfig
>> 
>>        If you're working on dist exec please take the new
>> DrillSystemTestBase for a spin (boots up a mini zk cluster and a Drillbit
>> cluster), a good example is DrillClientSystemTest.
>>        It'd be cool if we could use that to have repeatable tests.
>>        Let me know is anything is a miss.
>> 
>> Best
>> David


Re: update on dist exec work

Posted by Jacques Nadeau <ja...@apache.org>.
Couple quick notes:


   - DrillConfig: You import VideoTrackSizeListener.  I think this probably
   just an accident
   - DrillClient: You should pass null rather than an empty buffer.
    Otherwise we have to encode and decode a buffer of length zero.  If no
   buffer is passed, we just drop that field all together.  Also, I wasn't
   thinking that the second submitQuery() exists.  The goal of the buffer
   field is to be able to keep data off heap.  We probably shouldn't use it
   for for metadata (such as the query plan).
   - drill-module.conf: you've added a user.address.  What is this for?
   - In SystemSystemTestBase you go about updating the port numbers in the
   config.  I was thinking that it would be preferred that the RPC system
   simply started with the initial binding ports and then worked their way
   north until they found an available one.  Thoughts?  That way it doesn't
   take special configuration to run multiple Drillbits simultaneously (for
   some reason).


Otherwise look great!  Let me know about the items above and then I'll
merge into the WIP.  My goal is for us to get the WIP to a certain state
then have a more set of substantial reviews before we commit to default.

J

On Thu, Apr 18, 2013 at 8:18 PM, David Alves <da...@gmail.com> wrote:

> Hi All
>
>         I just submitted a pr (
> https://github.com/apache/incubator-drill/pull/16) agains the execwork
> that does the following:
>         - fixed the bug in hazelcache where it would always try to start a
> new cluster (now checks if one exists first)
>         - fixed the bug in zk coordinator where endpoints wouldn't get
> refreshed prior to a cluster change (why it was failing before)
>         - added a DrillClient wrapper class that does the lookup through
> zookeeper
>         - added a base sys test class that starts an embedded zk cluster
> and starts multiple drillbits
>         - removed sleep() in Driillbit startup and separated Drillbit
> startup form the main() method.
>         - added a toString() DrillConfig
>
>         If you're working on dist exec please take the new
> DrillSystemTestBase for a spin (boots up a mini zk cluster and a Drillbit
> cluster), a good example is DrillClientSystemTest.
>         It'd be cool if we could use that to have repeatable tests.
>         Let me know is anything is a miss.
>
> Best
> David