Posted to dev@accumulo.apache.org by John Vines <vi...@apache.org> on 2012/09/25 20:04:03 UTC

A discussion on scripts

I've been mulling ideas for a script rework, and one idea that I've been
leaning toward is a master script that all calls go through. It's not that
different from how we operate now, except that things like start all, tup,
etc. all get called through a single entry point. This would give us a
single source for script actions against Accumulo.

Just to be clear, this doesn't mean all of the code will necessarily be in
a single location. But it does mean we can remove code duplication in our
scripts.
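
To make the idea concrete, here is a minimal sketch of what such a single entry point could look like. The script name accumulo-ctl, the handler functions, and the fallback ACCUMULO_HOME value are assumptions made for illustration, not existing Accumulo code; the handlers only echo what they would do.

```shell
#!/bin/sh
# Hypothetical master script: every action is routed through dispatch().
# Names and messages are illustrative only.

# Shared setup lives in one place instead of being copied into each script.
common_setup() {
  ACCUMULO_HOME="${ACCUMULO_HOME:-/usr/lib/accumulo}"  # assumed default
}

# Placeholder handlers; real ones could live in separate files.
cmd_start_all() { echo "would run start-all under ${ACCUMULO_HOME}"; }
cmd_tup()       { echo "would run tup under ${ACCUMULO_HOME}"; }

usage() {
  echo "usage: accumulo-ctl <start-all|tup|help>"
}

# Route one subcommand to its handler; unknown commands print usage and fail.
dispatch() {
  common_setup
  case "${1:-help}" in
    start-all) cmd_start_all ;;
    tup)       cmd_tup ;;
    help)      usage ;;
    *)         usage >&2; return 1 ;;
  esac
}
```

A real script would end with something like `dispatch "$@"`, so the usage text becomes the one place where users discover the available actions.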

The biggest con is the change in behavior, though this may not be a bad
thing: we already have a lot of scripts in bin, so it may improve the
user experience as long as we make the usage info of the new script
informative enough.

What say ye, devs?

Sent from my phone, so pardon the typos and brevity.

Re: A discussion on scripts

Posted by Eric Newton <er...@gmail.com>.
We also read things like default replication levels, maximum
replication, block size, etc.  The HDFS client code does this, at a
minimum, but some Accumulo code does this directly.

-Eric

On Mon, Oct 1, 2012 at 2:01 AM, John Vines <vi...@apache.org> wrote:
> It's been a while since I poked through it, but I honestly think the only
> thing we pull out of the configs are the namenode and jobtracker URIs. But
> I could very well be mistaken.
>
> On Sun, Sep 30, 2012 at 9:38 PM, Kristopher Kane <kk...@gmail.com> wrote:
>
>> >> /usr/bin/accumulo-tserver --config=/etc/accumulo/tserver.conf
>> >> /usr/bin/accumulo-master --config=/etc/accumulo/master.conf
>> >>
>> >> and for the shell
>> >> /usr/bin/accumulo --config=~/.accumulo/config
>>
>> In addition, what about a hook into HDFS for calling configurations
>> from a shared workspace? --config could parse a protocol or assume
>> local if not in protocol format.
>>
>> -Kris
>>

Re: A discussion on scripts

Posted by John Vines <vi...@apache.org>.
It's been a while since I poked through it, but I honestly think the only
thing we pull out of the configs are the namenode and jobtracker URIs. But
I could very well be mistaken.

On Sun, Sep 30, 2012 at 9:38 PM, Kristopher Kane <kk...@gmail.com> wrote:

> >> /usr/bin/accumulo-tserver --config=/etc/accumulo/tserver.conf
> >> /usr/bin/accumulo-master --config=/etc/accumulo/master.conf
> >>
> >> and for the shell
> >> /usr/bin/accumulo --config=~/.accumulo/config
>
> In addition, what about a hook into HDFS for calling configurations
> from a shared workspace? --config could parse a protocol or assume
> local if not in protocol format.
>
> -Kris
>

Re: A discussion on scripts

Posted by Christopher Tubbs <ct...@gmail.com>.
It sounds reasonable to me to support an hdfs:// URI for the config
file. I was already thinking about adding support for a config file
that could specify extra libs in HDFS (for iterators, etc.), to
support ACCUMULO-708.

On Sun, Sep 30, 2012 at 9:38 PM, Kristopher Kane <kk...@gmail.com> wrote:
>>> /usr/bin/accumulo-tserver --config=/etc/accumulo/tserver.conf
>>> /usr/bin/accumulo-master --config=/etc/accumulo/master.conf
>>>
>>> and for the shell
>>> /usr/bin/accumulo --config=~/.accumulo/config
>
> In addition, what about a hook into HDFS for calling configurations
> from a shared workspace? --config could parse a protocol or assume
> local if not in protocol format.
>
> -Kris

Re: A discussion on scripts

Posted by Kristopher Kane <kk...@gmail.com>.
>> /usr/bin/accumulo-tserver --config=/etc/accumulo/tserver.conf
>> /usr/bin/accumulo-master --config=/etc/accumulo/master.conf
>>
>> and for the shell
>> /usr/bin/accumulo --config=~/.accumulo/config

In addition, what about a hook into HDFS for calling configurations
from a shared workspace? --config could parse a protocol or assume
local if not in protocol format.

-Kris
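
One way the "parse a protocol or assume local" rule could be sketched in shell is shown below. The function names config_scheme and load_plan are hypothetical, and the HDFS branch only prints a plan (mentioning `hadoop fs -cat` as one possible way to read the file) rather than contacting a cluster.

```shell
#!/bin/sh
# Classify a --config value: print the scheme when the value looks like
# scheme://..., otherwise print "local". Function names are hypothetical.
config_scheme() (
  case "$1" in
    [A-Za-z]*://*)
      scheme="${1%%://*}"          # text before the first "://"
      case "$scheme" in
        */*) echo "local" ;;       # a "/" before "://" means a plain path
        *)   echo "$scheme" ;;
      esac ;;
    *) echo "local" ;;
  esac
)

# Describe how the config would be obtained; nothing is actually fetched.
load_plan() {
  case "$(config_scheme "$1")" in
    local) echo "read local file $1" ;;
    hdfs)  echo "fetch $1 from HDFS, e.g. with hadoop fs -cat" ;;
    *)     echo "unsupported scheme in $1" >&2; return 1 ;;
  esac
}
```

With this rule, `/etc/accumulo/tserver.conf` and `~/.accumulo/config` are treated as local, while `hdfs://...` values take the HDFS branch.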

Re: A discussion on scripts

Posted by David Medinets <da...@gmail.com>.
I'm leaning towards Chris's model of separate components. I like
that it has specific benefits, but the best benefit for me is the
simplicity of separate scripts. For new accumulators, it will be
easier to understand each script.

On Wed, Sep 26, 2012 at 7:49 AM, Christopher Tubbs <ct...@gmail.com> wrote:
> I'd like to see separate call scripts for each separate process. This
> is important for packaging, especially, so we can deliver components
> separately, and make it easier for users to deploy and configure the
> components that they want on each node of their cluster. I also think
> these should be separate from init scripts. (This doesn't mean they
> cannot inherit from a master script that takes parameters, but that
> should be transparent to the user.)
>
> What I'm thinking is that I want something like (config params
> optional, but with a sensible default location like those specified
> below):
>
> /usr/bin/accumulo-tserver --config=/etc/accumulo/tserver.conf
> /usr/bin/accumulo-master --config=/etc/accumulo/master.conf
>
> and for the shell
> /usr/bin/accumulo --config=~/.accumulo/config
>
> With init scripts that call the above scripts:
> /etc/init.d/accumulo-master start|stop|restart|status # calls the appropriate commands to start, kill, etc.
> /etc/init.d/accumulo-tserver start|stop|restart|status
>
> I'm fine with the /usr/bin/accumulo* being symlinks to our install
> location (/usr/lib/accumulo/bin?), but the scripts should assume
> default install locations (absolute paths, not relative paths), and
> take a single configuration file as an optional parameter. This config
> file should specify any locations that override the defaults. This
> pattern makes everything very explicit for the user, and very
> convenient to configure. It also makes writing RPMs/DEBs very easy,
> because we know where to put files by default where users will expect
> them.
>
> On Tue, Sep 25, 2012 at 2:04 PM, John Vines <vi...@apache.org> wrote:
>> I've been mulling ideas for script rework and one idea that I've been
>> leaning toward is a master script that all calls go through. It's not that
>> different from how we operate now, just things like start all, tup, etc all
>> get called through a single source. This would allow a single source for
>> script actions against Accumulo.
>>
>> Just to be clear, this doesn't mean all of the code will necessarily be in
>> a single location. But it does mean we can remove code duplication in our
>> scripts.
>>
>> The biggest con is the change in behavior. Though this may not be a bad
>> thing since we do have a lot of scripts in bin already, so it may improve
>> user experience as long as we make the usage info of the new script
>> informative enough.
>>
>> What say ye, devs?
>>
>> Sent from my phone, so pardon the typos and brevity.

Re: A discussion on scripts

Posted by Christopher Tubbs <ct...@gmail.com>.
I'd like to see separate call scripts for each separate process. This
is important for packaging, especially, so we can deliver components
separately, and make it easier for users to deploy and configure the
components that they want on each node of their cluster. I also think
these should be separate from init scripts. (This doesn't mean they
cannot inherit from a master script that takes parameters, but that
should be transparent to the user.)

What I'm thinking is that I want something like (config params
optional, but with a sensible default location like those specified
below):

/usr/bin/accumulo-tserver --config=/etc/accumulo/tserver.conf
/usr/bin/accumulo-master --config=/etc/accumulo/master.conf

and for the shell
/usr/bin/accumulo --config=~/.accumulo/config

With init scripts that call the above scripts:
/etc/init.d/accumulo-master start|stop|restart|status # calls the appropriate commands to start, kill, etc.
/etc/init.d/accumulo-tserver start|stop|restart|status
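
As a rough sketch of how such an init script could wrap a launcher, something like the following might work. The PIDFILE location, the function names, and the pidfile-based process tracking are assumptions for illustration; a packaged init script would likely use the distribution's own helper functions instead.

```shell
#!/bin/sh
# Hypothetical init-style wrapper around one launcher script.
# LAUNCHER and PIDFILE defaults are assumptions.
LAUNCHER="${LAUNCHER:-/usr/bin/accumulo-tserver}"
PIDFILE="${PIDFILE:-/var/run/accumulo-tserver.pid}"

# True when the pidfile exists and names a live process.
is_running() {
  [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

do_start() {
  is_running && return 0
  "$LAUNCHER" &                 # run the launcher in the background
  echo $! > "$PIDFILE"          # remember its process id
}

do_stop() {
  if is_running; then kill "$(cat "$PIDFILE")"; fi
  rm -f "$PIDFILE"
}

do_status() {
  if is_running; then
    echo "running"
  else
    echo "stopped"
    return 3                    # conventional "not running" status code
  fi
}

# Map the init action onto the helpers above.
init_action() {
  case "$1" in
    start)   do_start ;;
    stop)    do_stop ;;
    restart) do_stop; do_start ;;
    status)  do_status ;;
    *)       echo "usage: $0 start|stop|restart|status" >&2; return 2 ;;
  esac
}
```

The script body would finish with `init_action "$1"`, which keeps all process-specific knowledge in the launcher and leaves the init script as a thin wrapper.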

I'm fine with the /usr/bin/accumulo* being symlinks to our install
location (/usr/lib/accumulo/bin?), but the scripts should assume
default install locations (absolute paths, not relative paths), and
take a single configuration file as an optional parameter. This config
file should specify any locations that override the defaults. This
pattern makes everything very explicit for the user, and very
convenient to configure. It also makes writing RPMs/DEBs very easy,
because we know the default locations where users will expect to find
files.
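
The optional --config parameter with an absolute default could be handled roughly as follows. This is only a sketch: resolve_config and DEFAULT_CONFIG are hypothetical names, and the explicit handling of a leading "~/" is there because a shell typically does not expand the tilde in a word like --config=~/.accumulo/config.

```shell
#!/bin/sh
# Hypothetical launcher fragment: pick the config file for one process.
DEFAULT_CONFIG="/etc/accumulo/tserver.conf"   # absolute default location

# Print the path given by --config=..., or the default when it is absent.
# The body runs in a subshell so its variables do not leak to the caller.
resolve_config() (
  config="$DEFAULT_CONFIG"
  for arg in "$@"; do
    case "$arg" in
      --config=*) config="${arg#--config=}" ;;
    esac
  done
  # Expand a literal leading "~/" ourselves; "#??" drops those two characters.
  case "$config" in
    "~/"*) config="$HOME/${config#??}" ;;
  esac
  echo "$config"
)
```

A launcher could then call `resolve_config "$@"` once and hand the resulting path to the process it starts, so users only pass --config when they want to override the default.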

On Tue, Sep 25, 2012 at 2:04 PM, John Vines <vi...@apache.org> wrote:
> I've been mulling ideas for script rework and one idea that I've been
> leaning toward is a master script that all calls go through. It's not that
> different from how we operate now, just things like start all, tup, etc all
> get called through a single source. This would allow a single source for
> script actions against Accumulo.
>
> Just to be clear, this doesn't mean all of the code will necessarily be in
> a single location. But it does mean we can remove code duplication in our
> scripts.
>
> The biggest con is the change in behavior. Though this may not be a bad
> thing since we do have a lot of scripts in bin already, so it may improve
> user experience as long as we make the usage info of the new script
> informative enough.
>
> What say ye, devs?
>
> Sent from my phone, so pardon the typos and brevity.