Posted to user@drill.apache.org by Ted Dunning <te...@gmail.com> on 2014/11/07 17:04:54 UTC

Re: Test Case for Inability to run a Distributed Drill Query (on JSON files on FS).

Does anybody have a thought about this?

It seems to me like Chris has drillbits that are running their own ZK nodes
internally.

Is that true?

Would it be possible to have drill clients check for consistent ZK cluster
definitions when connecting to drill bits?



On Wed, Oct 29, 2014 at 4:30 AM, Chris Drawater <ch...@jdsu.com>
wrote:

> Hi Ramana,
>
>
>
>
>
> We've isolated the problem!   Below is our simple test case, in which a SQL
> query runs only locally and is not distributed.
>
>
>
> Our single zookeeper instance is running only on node Drill1, and the
> foreman drillbit is running on Drill2.
>
>
>
> SQL -->
>
>         use dfs;
>
>         select xbin,ybin,numberOfConnections  from `/geo/appdata/rawdata/2014/SameNamedata.json`;
>
>
>
> where SameNamedata.json is present on  nodes : Drill1, Drill2 and Drill3,
> and is of the form -->
>
>
>
> { "NumberOfConnections": 1, "StartTime": "2014-10-02T14:30:00Z", "EndTime": "2014-10-02T14:45:00Z", "XBin": 566150, "YBin": 3270100 }
>
> { "NumberOfUniqueSubscribers": 2, "NumberOfConnections": 2, "StartTime": "2014-10-02T14:30:00Z", "EndTime": "2014-10-02T14:45:00Z", "XBin": 571350, "YBin": 3268100, "AverageSnr": -6.2870720195528573 }
>
>
>
> On the 'foreman' node (Drill2) ,  drillbit.log  reveals the following
> trace -->
>
>
>
> 2014-10-28 10:55:06,883 [UserServer-1] DEBUG
> o.a.drill.exec.rpc.user.UserServer - Received query to run.  Returning
> query handle.
>
> 2014-10-28 10:55:06,893 [WorkManager Event Thread] DEBUG
> o.apache.drill.exec.work.WorkManager - Starting pending task
> org.apache.drill.exec.work.WorkManager$RunnableWrapper@bec4752
>
> 2014-10-28 10:55:06,911 [091c7c1d-fb6b-4685-9f1d-04f08afd1053:foreman]
> DEBUG o.a.d.e.s.h.HBaseStoragePluginConfig - Initializing HBase
> StoragePlugin configuration with zookeeper quorum 'localhost', port '2181'.
>
> 2014-10-28 10:55:06,912 [091c7c1d-fb6b-4685-9f1d-04f08afd1053:foreman]
> DEBUG o.a.drill.exec.store.SchemaFactory - Took 17 ms to register schemas.
>
> 2014-10-28 10:55:06,953 [091c7c1d-fb6b-4685-9f1d-04f08afd1053:foreman]
> DEBUG o.a.d.e.p.s.h.DefaultSqlHandler - Optiq Logical :
>
> ProjectRel(xbin=[$1], ybin=[$2], numberOfConnections=[$3]): rowcount =
> 100.0, cumulative cost = {200.0 rows, 401.0 cpu, 0.0 io, 0.0 network, 0.0
> memory}, id = 1259
>
>   EnumerableTableAccessRel(table=
> [[dfs, /geo/appdata/rawdata/2014/SameNamedata.json]]): rowcount = 100.0,
> cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory},
> id = 1256
>
>
>
> The line -->
>
>         2014-10-28 10:55:06,911 [091c7c1d-fb6b-4685-9f1d-04f08afd1053:foreman] DEBUG o.a.d.e.s.h.HBaseStoragePluginConfig - Initializing HBase StoragePlugin configuration with zookeeper quorum 'localhost', port '2181'.
>
> may be the key.
>
>
>
> We're not using HBase and there is no Zookeeper instance running locally on
> Drill2 !!
>
>
>
> Maybe a bug?  (Although it might still be our configuration, I suppose.)
>
>
>
>
>
> Cheers,
>
>    Chris
>
>
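
The log line Chris singles out can be picked out of drillbit.log mechanically. The following is a minimal Python sketch, not part of Drill; the function name hbase_quorums is hypothetical, and the pattern assumes the message text is exactly as quoted in the trace above.

```python
import re

# Pattern for the message quoted in the trace above. The wording is taken
# from that trace; other Drill versions may log it differently (assumption).
QUORUM_LINE = re.compile(
    r"HBaseStoragePluginConfig - Initializing HBase StoragePlugin "
    r"configuration with zookeeper quorum '([^']*)', port '(\d+)'"
)


def hbase_quorums(log_text):
    """Return a list of (quorum, port) pairs found in the log text."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return [(quorum, int(port)) for quorum, port in QUORUM_LINE.findall(flat)]


if __name__ == "__main__":
    sample = (
        "2014-10-28 10:55:06,911 [091c7c1d-fb6b-4685-9f1d-04f08afd1053:foreman] "
        "DEBUG o.a.d.e.s.h.HBaseStoragePluginConfig - Initializing HBase\n"
        "StoragePlugin configuration with zookeeper quorum 'localhost', port '2181'."
    )
    print(hbase_quorums(sample))
```

Running this against the log of each drillbit would show whether every node reports the same quorum, or whether only the foreman reports 'localhost'.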

Re: Test Case for Inability to run a Distributed Drill Query (on JSON files on FS).

Posted by Ramana Inukonda <ri...@maprtech.com>.
I did not realize this was addressed specifically to me.

@Chris,

I suspect that drill-override.conf does not contain the correct ZK
information. Could you please paste the contents of that file so we can
check whether the config is right?


Regards
Ramana


On Fri, Nov 7, 2014 at 1:18 PM, Ted Dunning <te...@gmail.com> wrote:

> On Fri, Nov 7, 2014 at 11:06 AM, Jacques Nadeau <ja...@apache.org>
> wrote:
>
> > Ted,
> > I'm not sure what you're suggesting.  Drill uses zk to find out about other
> > nodes.  As such, the only nodes it will know about are the nodes that have
> > the same zk setup.
> >
>
> If a ZK client has a connection string that contains members of multiple ZK
> clusters, then the client can connect to different ZK nodes each time it
> connects.  This can lead to massive confusion.  If a Drillbit uses multiple
> connections, those connections could actually connect to different ZK
> nodes.  Thus, the drill bit might register with one ZK on one connection,
> but use another to find other drill bits.  The Curator library tends to use
> a style in which connections are used for fairly local purposes, so this
> isn't all that far-fetched.
>
> Similarly, if each drill bit actually is running a solo embedded ZK, a user
> could be confused about what is happening.  If somebody touched multiple
> drill bits and checked the ZK configs on each node, the problem would be
> obvious.
>

Re: Test Case for Inability to run a Distributed Drill Query (on JSON files on FS).

Posted by Ted Dunning <te...@gmail.com>.
OK.  Then it sounds like the config has localhost for the ZK address
somewhere.  Or the Drill client is connecting that way, and that is leaking
through to affect how the Drillbit runs a query.



On Fri, Nov 7, 2014 at 3:29 PM, Aditya <ad...@gmail.com> wrote:

> I do not think Drillbits, in any mode, start an embedded Zookeeper server.
>
> I know because I was once looking for a way to do it but could not find one.

Re: Test Case for Inability to run a Distributed Drill Query (on JSON files on FS).

Posted by Aditya <ad...@gmail.com>.
I do not think Drillbits, in any mode, start an embedded Zookeeper server.

I know because I was once looking for a way to do it but could not find one.


Re: Test Case for Inability to run a Distributed Drill Query (on JSON files on FS).

Posted by Ted Dunning <te...@gmail.com>.
On Fri, Nov 7, 2014 at 11:06 AM, Jacques Nadeau <ja...@apache.org> wrote:

> Ted,
> I'm not sure what you're suggesting.  Drill uses zk to find out about other
> nodes.  As such, the only nodes it will know about are the nodes that have
> the same zk setup.
>

If a ZK client has a connection string that contains members of multiple ZK
clusters, then the client can connect to different ZK nodes each time it
connects.  This can lead to massive confusion.  If a Drillbit uses multiple
connections, those connections could actually connect to different ZK
nodes.  Thus, the drill bit might register with one ZK on one connection,
but use another to find other drill bits.  The Curator library tends to use
a style in which connections are used for fairly local purposes, so this
isn't all that far-fetched.

Similarly, if each drill bit actually is running a solo embedded ZK, a user
could be confused about what is happening.  If somebody touched multiple
drill bits and checked the ZK configs on each node, the problem would be
obvious.
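
Checking the ZK configs on each node, as suggested above, can be done with a small script. The following is a rough Python sketch under stated assumptions: each node's drill-override.conf contains a quoted zk.connect setting on one line, and the file contents have already been collected into a dict. The names zk_hosts and check_cluster are hypothetical and not part of Drill.

```python
import re

# Matches a setting such as:  zk.connect: "Drill1:2181"
# Assumed layout; a real drill-override.conf may format or nest it differently.
ZK_CONNECT = re.compile(r'zk\.connect\s*[:=]\s*"([^"]*)"')

LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def zk_hosts(conf_text):
    """Return the set of host:port entries in zk.connect, or None if absent."""
    match = ZK_CONNECT.search(conf_text)
    if match is None:
        return None
    return frozenset(h.strip() for h in match.group(1).split(",") if h.strip())


def check_cluster(confs):
    """confs maps node name -> text of that node's drill-override.conf.

    Returns a list of warnings. An empty list means every node has the
    same zk.connect value and none of them points at localhost.
    """
    warnings = []
    parsed = {node: zk_hosts(text) for node, text in confs.items()}
    for node, hosts in sorted(parsed.items()):
        if hosts is None:
            warnings.append(f"{node}: no zk.connect setting found")
        elif any(h.split(":")[0] in LOCAL_HOSTS for h in hosts):
            warnings.append(f"{node}: zk.connect points at localhost")
    distinct = {hosts for hosts in parsed.values() if hosts is not None}
    if len(distinct) > 1:
        warnings.append("nodes disagree on zk.connect")
    return warnings


if __name__ == "__main__":
    # Illustrative values only; these are not Chris's actual config files.
    confs = {
        "Drill1": 'drill.exec: { cluster-id: "drillbits1", zk.connect: "Drill1:2181" }',
        "Drill2": 'drill.exec: { cluster-id: "drillbits1", zk.connect: "localhost:2181" }',
    }
    for warning in check_cluster(confs):
        print(warning)
```

The localhost warning is only a heuristic: on the one node that actually hosts ZK, localhost would resolve correctly, but in a multi-node cluster it means each drillbit looks for ZK on its own machine.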

Re: Test Case for Inability to run a Distributed Drill Query (on JSON files on FS).

Posted by Jacques Nadeau <ja...@apache.org>.
Chris,
I'm not sure what is going on in this instance.  The HBase storage plugin
should be disabled by default.  If it were enabled and didn't have
settings, it would likely cause problems.

I'd suggest clearing the drill config data out of zk and restarting the
bits.

Ted,
I'm not sure what you're suggesting.  Drill uses zk to find out about other
nodes.  As such, the only nodes it will know about are the nodes that have
the same zk setup.
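
On the HBase plugin point above: whether the plugin is enabled with a default quorum can be checked from its stored definition. The sketch below is Python and assumes the plugin definition is JSON with "type", "enabled", and a "config" map holding "hbase.zookeeper.quorum"; that layout is an assumption about the stored config, and suspicious_hbase_plugin is a hypothetical helper, not a Drill API.

```python
import json


def suspicious_hbase_plugin(plugin_json):
    """Return True if an enabled hbase plugin has a missing or localhost quorum."""
    plugin = json.loads(plugin_json)
    # Only an enabled plugin of type hbase is of interest here.
    if plugin.get("type") != "hbase" or not plugin.get("enabled", False):
        return False
    quorum = plugin.get("config", {}).get("hbase.zookeeper.quorum", "")
    hosts = [h.strip() for h in quorum.split(",") if h.strip()]
    # No hosts at all, or any localhost entry, is flagged.
    return not hosts or any(h in ("localhost", "127.0.0.1") for h in hosts)


if __name__ == "__main__":
    # Assumed example definition; field names are not confirmed by this thread.
    example = json.dumps({
        "type": "hbase",
        "enabled": True,
        "config": {
            "hbase.zookeeper.quorum": "localhost",
            "hbase.zookeeper.property.clientPort": "2181",
        },
    })
    print(suspicious_hbase_plugin(example))
```

A True result would be consistent with the 'localhost', port '2181' message in Chris's trace, though it would not by itself explain why the query is not distributed.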
