Posted to user@drill.apache.org by Matt <bs...@gmail.com> on 2017/05/08 18:41:56 UTC

Drill Cluster without HDFS/MapR-FS?

I have seen some posts in the past about Drill nodes mounted "close to
the data", and am wondering if it's possible to use Drill as a cluster
without HDFS?

Using ZK would not be an issue in itself, and there are apparently 
options like https://github.com/mhausenblas/dromedar

Any experiences with this?

Re: Drill Cluster without HDFS/MapR-FS?

Posted by ankit beohar <an...@gmail.com>.

Hey Matt,

Yes, you can run Drill in distributed mode on a cluster. We did that, but
only for dev purposes; in our prod environment we had Hadoop. You can still
do it, and the steps are documented at
https://drill.apache.org/docs/installing-drill-on-the-cluster/

Best Regards,
ANKIT BEOHAR


Re: Drill Cluster without HDFS/MapR-FS?

Posted by Ted Dunning <te...@gmail.com>.

Using Drill against any kind of distributed data store is a fine thing. If
data locality matters, then it is nice if Drill can see what data is where.
Regardless, using Drill without HDFS works great.

I should point out that using Drill with MapR is technically using it
without HDFS, but since MapR FS implements the HDFS API, the distinction is
kind of technical.
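
For a concrete picture of Drill against a non-HDFS store, here is a minimal
sketch of a file-type storage plugin configuration that points at a mount
every node can see (for example a GlusterFS or NFS mount). The mount path
/mnt/shared/data, the workspace name "shared", and the helper name are
hypothetical. The field layout follows the stock dfs plugin, but check it
against your Drill version before relying on it.

```python
import json


def shared_mount_plugin(mount_point, workspace="shared"):
    """Build a Drill file-type storage plugin config as a plain dict.

    Assumes `mount_point` is mounted at the same path on every Drillbit node.
    """
    return {
        "type": "file",
        "enabled": True,
        # file:/// means the local (or locally mounted) file system, not HDFS.
        "connection": "file:///",
        "workspaces": {
            workspace: {
                "location": mount_point,
                "writable": False,
                "defaultInputFormat": None,
            }
        },
        "formats": {
            "parquet": {"type": "parquet"},
            "json": {"type": "json", "extensions": ["json"]},
        },
    }


# Print the JSON that could be pasted into the Storage tab of the Drill Web UI.
print(json.dumps(shared_mount_plugin("/mnt/shared/data"), indent=2))
```

If the plugin were saved under a hypothetical name such as sharedfs, files
under the mount would be addressed as sharedfs.shared.`some_file.parquet`.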

Re: Drill Cluster without HDFS/MapR-FS?

Posted by Ted Dunning <te...@gmail.com>.

I have no such experience.

The performance loss could vary from minor to profound depending on your
query, network and disk setup.



On Tue, May 9, 2017 at 11:56 PM, Rahul Raj <ra...@option3consulting.com>
wrote:

> Any experience of running Drill on GlusterFS or similar storage systems?
> How much performance would be lost because data locality is unavailable?
>
> Regards,
> Rahul

Re: Drill Cluster without HDFS/MapR-FS?

Posted by Rahul Raj <ra...@option3consulting.com>.

Any experience of running Drill on GlusterFS or similar storage systems?
How much performance would be lost because data locality is unavailable?

Regards,
Rahul

Re: Drill Cluster without HDFS/MapR-FS?

Posted by Abhishek Girish <ag...@apache.org>.

Do you wish to use Drill in distributed mode with each node having its own
local file system, or do you plan to use it with a different data source
that is also a distributed file system (but not HDFS / MapR-FS)?

If the former, yes, you should be able to form a Drill cluster by bringing
up Drillbits in standalone mode on multiple disjoint nodes. You will still
need ZooKeeper for cluster coordination. But understand that since each
node can only read files on its local file system, the Drill cluster
will not have a unified view of, or access to, the files for distributed
processing. Your queries may fail, as a Drillbit might fail to access data.
To experiment, you can make sure the directories and files you need to
query are identical on each node. However, this is untested and I'm not
sure it will actually work.
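
One way to check the "identical on each node" precondition is sketched
below. It is only an illustration, not something tested with Drill: it
hashes every file under a directory and compares the result between two
trees. The function names are hypothetical, and it assumes you can run the
script on each node and bring the manifests together (or reach both trees
from one machine).

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(reference, other):
    """List paths missing from, extra in, or with different content in `other`."""
    shared = set(reference) & set(other)
    return {
        "missing": sorted(set(reference) - set(other)),
        "extra": sorted(set(other) - set(reference)),
        "changed": sorted(p for p in shared if reference[p] != other[p]),
    }
```

An empty result in all three lists means the two trees match file for file.
It says nothing about whether Drill will plan queries correctly over such a
setup.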

If it's the latter, can you share what data source you have in mind?
