Posted to user@drill.apache.org by Paul Rogers <pr...@maprtech.com> on 2016/05/29 22:44:55 UTC

Improvements to the drillbit.sh --config option

Hi All,

The discussion with John and Charles about the drillbit scripts reminded me to get your thoughts on another change we’re working on.

Today, drillbit.sh has the --config option so you can put your config files in a location separate from DRILL_HOME:

$DRILL_HOME/bin/drillbit.sh --config /some/path/to/conf start

This is handy, but that location holds only config files (drill-env.sh, drill-override.conf). If you have custom code, it still must go into $DRILL_HOME/jars/3rdparty.

This presents two challenges:

* On upgrades, you have to grab your files from the old $DRILL_HOME and copy them into the new one.
* With YARN, we have to create an archive of your entire $DRILL_HOME just to grab your “site” files.

So, we propose to extend the --config option to include code as well as config. We call this “complete” set of files the “site” directory, using Hadoop terminology (see DRILL-4591). This way:

* Upgrade is easy: throw away the old $DRILL_HOME and extract the Drill archive to create the new one.
* With YARN, we upload the “stock” Drill archive plus your (much smaller) site files.
* We can more easily support multiple Drill “clusters” (each with its own site files, including assigned ports).
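As a rough sketch of the upgrade bullet above (not the actual Drill tooling), the script below keeps the site files outside DRILL_HOME, discards the old install, and extracts a new archive in its place. Every path, the archive name, and the old-marker file are hypothetical stand-ins created in a temp directory so the sketch runs without Drill installed.

```shell
#!/bin/sh
# Sketch only: all paths and file names here are hypothetical stand-ins.
set -eu
WORK=$(mktemp -d)
SITE_DIR="$WORK/site"      # site directory, kept outside DRILL_HOME
DRILL_HOME="$WORK/drill"   # stand-in for the real install location

# Site files (config plus custom code) live outside DRILL_HOME.
mkdir -p "$SITE_DIR/jars"
touch "$SITE_DIR/drill-override.conf" "$SITE_DIR/drill-env.sh" "$SITE_DIR/jars/your.jar"

# Stand-ins for the old install and for a freshly downloaded Drill archive.
mkdir -p "$DRILL_HOME/bin" "$WORK/new/drill/bin"
touch "$DRILL_HOME/bin/old-marker" "$WORK/new/drill/bin/drillbit.sh"
tar -C "$WORK/new" -czf "$WORK/drill-new.tar.gz" drill

# Upgrade: throw away the old DRILL_HOME, then extract the new archive.
rm -rf "$DRILL_HOME"
tar -C "$WORK" -xzf "$WORK/drill-new.tar.gz"

# The site directory was never touched, so the launch command stays the same.
echo "$DRILL_HOME/bin/drillbit.sh --config $SITE_DIR start"
```

Nothing is copied out of the old install at any point, which is the property the proposal is after.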

With YARN, you only need one copy of the DRILL_HOME and site directory; YARN copies (“localizes”) the files to all your worker nodes. Without YARN, you have to do the copy, probably with your favorite system admin tool.

So, the question is this: is the site directory a help for those of you who won’t be using YARN? Or does everyone just copy site files from one DRILL_HOME to the next on upgrade, then push the merged directory to all your worker nodes?

Thoughts?

Thanks,

- Paul


Re: Improvements to the drillbit.sh --config option

Posted by John Omernik <jo...@omernik.com>.
In Mesos, we could just have a directory off DRILL_HOME, say /customjars,
that we add to the classpath. If I include the jars in the executor
package, they would always be there. Or, with Mesos, I can set two
packages, first Drill and then the package with the jars, and it would
just pull each from its location. As long as the classpath is all we need,
my vote is that we use that and don't put anything in /conf. I think
putting jars in conf is a bad thing to start doing; it could make a lot of
things messy depending on the environment. Happy to discuss more if you'd
like my perspective.

John


On Wed, Jun 1, 2016 at 11:55 AM, Paul Rogers <pr...@maprtech.com> wrote:

> One last thought on this… Another alternative is available for custom code.
>
> The scripts allow you to specify additions to the class path that are not
> in the usual locations. This exists in 1.6, but is made clearer in the
> revisions we’ve been discussing. The feature allows you to place code
> anywhere (assuming, of course, that the path is valid on all nodes):
>
> export DRILL_CLASSPATH=/your/code/*:/a/second/path/*
>
> The above would typically reside in drill-env.sh. Or, it can be set in the
> environment.
>
> - Paul

Re: Improvements to the drillbit.sh --config option

Posted by Paul Rogers <pr...@maprtech.com>.
One last thought on this… Another alternative is available for custom code.

The scripts allow you to specify additions to the class path that are not in the usual locations. This exists in 1.6, but is made clearer in the revisions we’ve been discussing. The feature allows you to place code anywhere (assuming, of course, that the path is valid on all nodes):

export DRILL_CLASSPATH=/your/code/*:/a/second/path/*

The above would typically reside in drill-env.sh. Or, it can be set in the environment.
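For illustration, a drill-env.sh fragment along these lines might look like the sketch below. The two directories are the placeholders from the example above. Appending to a value already present in the environment is an assumption about what you might want, not something the scripts require. Quoting keeps each * literal, so it reaches the JVM as a class path wildcard and is never expanded by the shell.

```shell
# Hypothetical drill-env.sh fragment; both directories are placeholders.
# If DRILL_CLASSPATH was already set in the environment, keep it and append.
# The double quotes stop the shell from expanding the * characters.
export DRILL_CLASSPATH="${DRILL_CLASSPATH:+$DRILL_CLASSPATH:}/your/code/*:/a/second/path/*"
echo "DRILL_CLASSPATH=$DRILL_CLASSPATH"
```

A directory entry ending in /* is the standard Java class path wildcard: it picks up the jar files in that directory, but not subdirectories or loose class files.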

- Paul

> On May 29, 2016, at 7:12 PM, Paul Rogers <pr...@maprtech.com> wrote:
> 
> How’s about this? The proposed structure is:
> 
> $DRILL_CONF_DIR
> |- drill-override.conf
> |- drill-env.sh
> |- jars
>   |- your.jar
> 
> So, could you replace jars with a link to the (shared) code?
> 
> $DRILL_CONF_DIR
> |- …
> |- jars -> /your/shared/code
> 
> With YARN, the step that archives the site directory will follow symlinks. Without YARN, symlinks should “just work.”
> 
> Will this work for the Mesos case?
> 
> - Paul


Re: Improvements to the drillbit.sh --config option

Posted by Paul Rogers <pr...@maprtech.com>.
How’s about this? The proposed structure is:

$DRILL_CONF_DIR
|- drill-override.conf
|- drill-env.sh
|- jars
   |- your.jar

So, could you replace jars with a link to the (shared) code?

$DRILL_CONF_DIR
|- …
|- jars -> /your/shared/code

With YARN, the step that archives the site directory will follow symlinks. Without YARN, symlinks should “just work.”
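A minimal sketch of that layout, using temp directories as stand-ins for $DRILL_CONF_DIR and /your/shared/code. The tar call with -h only illustrates what "follow symlinks" means when archiving; it is an assumption for the example, not the actual YARN client code.

```shell
#!/bin/sh
# Sketch only: directories below are hypothetical stand-ins.
set -eu
WORK=$(mktemp -d)
DRILL_CONF_DIR="$WORK/conf"       # stand-in for the site directory
SHARED_CODE="$WORK/shared/code"   # stand-in for /your/shared/code

mkdir -p "$DRILL_CONF_DIR" "$SHARED_CODE"
touch "$DRILL_CONF_DIR/drill-override.conf" "$DRILL_CONF_DIR/drill-env.sh"
touch "$SHARED_CODE/your.jar"

# jars is a symlink, so the code can live and be managed somewhere else.
ln -s "$SHARED_CODE" "$DRILL_CONF_DIR/jars"
ls -l "$DRILL_CONF_DIR"

# -h makes tar follow the link, so the archive holds the real jar files.
tar -C "$DRILL_CONF_DIR" -chzf "$WORK/site.tar.gz" .
tar -tzf "$WORK/site.tar.gz"
```

The listing should show your.jar under jars/ in the archive, which is what a worker node would need once the shared location itself is no longer reachable.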

Will this work for the Mesos case?

- Paul

> On May 29, 2016, at 6:41 PM, John Omernik <jo...@omernik.com> wrote:
> 
> I am all for this; it would make Mesos easier as well. (I do have to merge
> the 3rd party code and the libjpam.so into my archive at upgrade.) That
> said, I'd be interested in the choice to merge a conf directory with
> code. Personally, putting on my admin hat, I'd like the ability to add a
> code directory, and have that location be separate from my conf directory.
> I may have different QA processes, change management processes, etc.
> between the two. I am very much in favor of what you are proposing; I would
> just prefer it to be a separate option to append to my 3rd party jar search
> locations. Please let me know if I am not being clear; I am on my iPad, and
> my thoughts always appear more jumbled (more than usual) when I reread my
> iPad posts. I blame Woz.
> 
> John
> 
> -- 
> Sent from my iThing


Re: Improvements to the drillbit.sh --config option

Posted by John Omernik <jo...@omernik.com>.
I am all for this; it would make Mesos easier as well. (I do have to merge
the 3rd party code and the libjpam.so into my archive at upgrade.) That
said, I'd be interested in the choice to merge a conf directory with
code. Personally, putting on my admin hat, I'd like the ability to add a
code directory, and have that location be separate from my conf directory.
I may have different QA processes, change management processes, etc.
between the two. I am very much in favor of what you are proposing; I would
just prefer it to be a separate option to append to my 3rd party jar search
locations. Please let me know if I am not being clear; I am on my iPad, and
my thoughts always appear more jumbled (more than usual) when I reread my
iPad posts. I blame Woz.

John

-- 
Sent from my iThing