Posted to user@accumulo.apache.org by "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu> on 2014/05/17 20:35:26 UTC

Accumulo defaults

As part of our Accumulo benchmarking we have decided to set certain values as defaults for all our databases:

	tserver.compaction.minor.concurrent.max=5
	table.walog.enabled=false

We were wondering which file(s) we would need to modify to apply these defaults?


Re: Accumulo defaults

Posted by Keith Turner <ke...@deenlo.com>.
You can also set these in the shell with:

config -s tserver.compaction.minor.concurrent.max=5
config -s table.walog.enabled=false

Disabling walogs in the shell does not require a tserver restart, but I am
not sure about the minor compaction setting.

The advantage of setting the config in the shell is that you do not have
to copy the config file across the cluster or worry about it being
inconsistent.  The advantage of the config file is that it will survive
re-initialization of Accumulo.
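
A rough sketch of the "inconsistent config file" risk mentioned above, assuming you have already collected the text of accumulo-site.xml from each node. The helper names (read_property, find_inconsistent) and the host names are hypothetical and not part of Accumulo; only the Python standard library is used.

```python
import xml.etree.ElementTree as ET


def read_property(site_xml, name):
    """Return the value of the named property in site_xml, or None if absent."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def find_inconsistent(configs_by_host, name):
    """Map each distinct value of the property to the sorted hosts carrying it.

    Returns an empty dict when every host agrees on the value.
    """
    hosts_by_value = {}
    for host, site_xml in configs_by_host.items():
        hosts_by_value.setdefault(read_property(site_xml, name), []).append(host)
    if len(hosts_by_value) <= 1:
        return {}
    return {value: sorted(hosts) for value, hosts in hosts_by_value.items()}


def example_site(value):
    """Build a tiny accumulo-site.xml body for the demo (hypothetical content)."""
    return (
        "<configuration><property><name>table.walog.enabled</name>"
        "<value>%s</value></property></configuration>" % value
    )


# Hypothetical hosts: node3 still carries the old value.
configs = {
    "node1": example_site("false"),
    "node2": example_site("false"),
    "node3": example_site("true"),
}
print(find_inconsistent(configs, "table.walog.enabled"))
```

Settings applied through the shell do not need a check like this, since they are not read from each node's local copy of the file.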


On Sat, May 17, 2014 at 6:22 PM, Josh Elser <jo...@gmail.com> wrote:

> <property>
>    <name>tserver.compaction.minor.concurrent.max</name>
>    <value>5</value>
> </property>
>
> <property>
>    <name>table.walog.enabled</name>
>    <value>false</value>
> </property>
>
>
>
> On 5/17/14, 5:34 PM, Kepner, Jeremy - 0553 - MITLL wrote:
>
>> Thanks.  Does anyone know the precise syntax that would be used in
>> conf/accumulo-site.xml?
>>
>> Regards.  -jeremy
>>
>> On May 17, 2014, at 4:12 PM, John Vines <vines@apache.org> wrote:
>>
>>  Accumulo-site.xml
>>>
>>> Sent from my phone, please pardon the typos and brevity.
>>>
>>> On May 17, 2014 3:25 PM, "Kepner, Jeremy - 0553 - MITLL"
>>> <kepner@ll.mit.edu> wrote:
>>>
>>>     As part of our Accumulo benchmarking we have decided to set
>>>     certain values as defaults for all our databases:
>>>
>>>             tserver.compaction.minor.concurrent.max=5
>>>             table.walog.enabled=false
>>>
>>>     We were wondering which file(s) we would need to modify to apply
>>>     these defaults?
>>>
>>>
>>

Re: Accumulo defaults

Posted by Josh Elser <jo...@gmail.com>.
<property>
    <name>tserver.compaction.minor.concurrent.max</name>
    <value>5</value>
</property>

<property>
    <name>table.walog.enabled</name>
    <value>false</value>
</property>
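
A minimal sketch of applying the two properties above to an existing accumulo-site.xml, assuming the file uses the usual <configuration> root with <property> children. The function name set_site_properties and the sample input are hypothetical, not part of Accumulo; only Python's standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET


def set_site_properties(site_xml, overrides):
    """Return site_xml with every name/value pair in overrides set as a <property>."""
    root = ET.fromstring(site_xml)
    # Index the existing <property> elements by the text of their <name> child.
    existing = {
        (prop.findtext("name") or "").strip(): prop
        for prop in root.findall("property")
    }
    for name, value in overrides.items():
        prop = existing.get(name)
        if prop is None:
            # Property not present yet: append a new <property> element.
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
            ET.SubElement(prop, "value").text = value
        else:
            # Property already present: overwrite (or create) its <value>.
            value_elem = prop.find("value")
            if value_elem is None:
                value_elem = ET.SubElement(prop, "value")
            value_elem.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical starting file that still has the walog enabled.
EXAMPLE_SITE = """<configuration>
  <property>
    <name>table.walog.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

updated = set_site_properties(EXAMPLE_SITE, {
    "tserver.compaction.minor.concurrent.max": "5",
    "table.walog.enabled": "false",
})
print(updated)
```

ElementTree drops XML comments and the leading XML declaration when parsing, so treat the output as a starting point rather than a drop-in replacement for a hand-maintained file.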


On 5/17/14, 5:34 PM, Kepner, Jeremy - 0553 - MITLL wrote:
> Thanks.  Does anyone know the precise syntax that would be used in
> conf/accumulo-site.xml?
>
> Regards.  -jeremy
>
> On May 17, 2014, at 4:12 PM, John Vines <vines@apache.org> wrote:
>
>> Accumulo-site.xml
>>
>> Sent from my phone, please pardon the typos and brevity.
>>
>> On May 17, 2014 3:25 PM, "Kepner, Jeremy - 0553 - MITLL"
>> <kepner@ll.mit.edu> wrote:
>>
>>     As part of our Accumulo benchmarking we have decided to set
>>     certain values as defaults for all our databases:
>>
>>             tserver.compaction.minor.concurrent.max=5
>>             table.walog.enabled=false
>>
>>     We were wondering which file(s) we would need to modify to apply
>>     these defaults?
>>
>

Re: Accumulo defaults

Posted by "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>.
Thanks.  Does anyone know the precise syntax that would be used in conf/accumulo-site.xml?

Regards.  -jeremy

On May 17, 2014, at 4:12 PM, John Vines <vi...@apache.org>
 wrote:

> Accumulo-site.xml
> 
> Sent from my phone, please pardon the typos and brevity.
> 
> On May 17, 2014 3:25 PM, "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu> wrote:
> As part of our Accumulo benchmarking we have decided to set certain values as defaults for all our databases:
> 
>         tserver.compaction.minor.concurrent.max=5
>         table.walog.enabled=false
> 
> We were wondering which file(s) we would need to modify to apply these defaults?
> 


Re: Accumulo defaults

Posted by John Vines <vi...@apache.org>.
Accumulo-site.xml

Sent from my phone, please pardon the typos and brevity.
On May 17, 2014 3:25 PM, "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>
wrote:

> As part of our Accumulo benchmarking we have decided to set certain values
> as defaults for all our databases:
>
>         tserver.compaction.minor.concurrent.max=5
>         table.walog.enabled=false
>
> We were wondering which file(s) we would need to modify to apply these
> defaults?
>
>

Re: Accumulo defaults

Posted by Josh Elser <jo...@gmail.com>.
And, one last thought, be careful about accidentally overriding walogs 
for the metadata table. There isn't ever a reason to turn off walogs for 
the metadata table (that I can think of).

I'm not sure whether setting the table.walog.enabled property in
accumulo-site.xml would override the value that is initially configured
on the metadata table. Hopefully not :)
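
One way to sidestep the metadata-table concern is to leave the site-wide default alone and disable the walog per table instead. The sketch below only generates shell `config -t <table> -s ...` commands for user tables. The function name, the example table names, and the list of system table names are assumptions to check against your Accumulo version.

```python
# Names Accumulo uses for its own system tables. Which of these exist depends
# on the release (assumption: "!METADATA" in older versions, "accumulo.metadata"
# and "accumulo.root" in newer ones), so verify against your own instance.
SYSTEM_TABLES = {"!METADATA", "accumulo.metadata", "accumulo.root"}


def walog_disable_commands(tables):
    """Build one shell `config` command per user table, skipping system tables."""
    return [
        "config -t %s -s table.walog.enabled=false" % table
        for table in tables
        if table not in SYSTEM_TABLES
    ]


# Hypothetical table list, e.g. copied from the output of the shell's `tables` command.
for command in walog_disable_commands(["accumulo.metadata", "bench1", "bench2"]):
    print(command)
```

The printed lines can be pasted into the Accumulo shell. The metadata table keeps whatever walog setting it already has.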

On 5/17/14, 6:01 PM, Josh Elser wrote:
> Absolutely, if you restrict a problem, you can work around it in other
> ways. Not going to argue that.
>
> Since this is a user list though, I got very worried seeing something
> that roughly says "I'm benchmarking Accumulo with the WALs off". If
> you're providing resiliency against data loss using other tactics,
> that's fine. I just wanted to make sure that users who read this thread
> later don't think that running tests against Accumulo with the WALs off
> is "normal".
>
> Looking forward to seeing the full picture of the benchmarks!
>
> On 5/17/14, 5:27 PM, Jeremy Kepner wrote:
>> walog provides data loss protection in a specific set of circumstances.
>> Most of our deployments are under a different set of circumstances.
>> Accumulo is only one part of our systems and we have other
>> mechanisms for protecting against the loss of data.
>> We find the walog actually becomes a bottleneck in certain circumstances
>> and so turning it off increases the overall reliability of our system.
>>
>> On Sat, May 17, 2014 at 04:27:29PM -0400, Josh Elser wrote:
>>> You're likely to lose data in *any* deployment with the walogs turned
>>> off.
>>>
>>> And, to reiterate what Sean says, I wouldn't really consider any
>>> benchmark with the walogs turned off valid except for "internal"
>>> benchmarks (ones where we evaluate components only within Accumulo
>>> for the sake of improving Accumulo itself and not comparing it to
>>> other systems).
>>>
>>> On 5/17/14, 3:30 PM, Sean Busbey wrote:
>>>> You can set both of those in the accumulo-site.xml.
>>>>
>>>> However, it's going to be difficult to use benchmarks with walogs
>>>> disabled for valid comparisons to other systems. Also you are very
>>>> likely to lose data in any significantly sized deployment.
>>>>
>>>>
>>>>
>>>> On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0553 - MITLL
>>>> <kepner@ll.mit.edu> wrote:
>>>>
>>>>     As part of our Accumulo benchmarking we have decided to set certain
>>>>     values as defaults for all our databases:
>>>>
>>>>              tserver.compaction.minor.concurrent.max=5
>>>>              table.walog.enabled=false
>>>>
>>>>     We were wondering which file(s) we would need to modify to apply
>>>>     these defaults?
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Sean

Re: Accumulo defaults

Posted by Jeremy Kepner <ke...@ll.mit.edu>.
Agreed.

On Sat, May 17, 2014 at 06:01:26PM -0400, Josh Elser wrote:
> Absolutely, if you restrict a problem, you can work around it in
> other ways. Not going to argue that.
> 
> Since this is a user list though, I got very worried seeing
> something that roughly says "I'm benchmarking Accumulo with the WALs
> off". If you're providing resiliency against data loss using other
> tactics, that's fine. I just wanted to make sure that users who read
> this thread later don't think that running tests against Accumulo
> with the WALs off is "normal".
>
> Looking forward to seeing the full picture of the benchmarks!
> 
> On 5/17/14, 5:27 PM, Jeremy Kepner wrote:
> >walog provides data loss protection in a specific set of circumstances.
> >Most of our deployments are under a different set of circumstances.
> >Accumulo is only one part of our systems and we have other
> >mechanisms for protecting against the loss of data.
> >We find the walog actually becomes a bottleneck in certain circumstances
> >and so turning it off increases the overall reliability of our system.
> >
> >On Sat, May 17, 2014 at 04:27:29PM -0400, Josh Elser wrote:
> >>You're likely to lose data in *any* deployment with the walogs turned off.
> >>
> >>And, to reiterate what Sean says, I wouldn't really consider any
> >>benchmark with the walogs turned off valid except for "internal"
> >>benchmarks (ones where we evaluate components only within Accumulo
> >>for the sake of improving Accumulo itself and not comparing it to
> >>other systems).
> >>
> >>On 5/17/14, 3:30 PM, Sean Busbey wrote:
> >>>You can set both of those in the accumulo-site.xml.
> >>>
> >>>However, it's going to be difficult to use benchmarks with walogs
> >>>disabled for valid comparisons to other systems. Also you are very
> >>>likely to lose data in any significantly sized deployment.
> >>>
> >>>
> >>>
> >>>On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0553 - MITLL
> >>><kepner@ll.mit.edu> wrote:
> >>>
> >>>    As part of our Accumulo benchmarking we have decided to set certain
> >>>    values as defaults for all our databases:
> >>>
> >>>             tserver.compaction.minor.concurrent.max=5
> >>>             table.walog.enabled=false
> >>>
> >>>    We were wondering which file(s) we would need to modify to apply
> >>>    these defaults?
> >>>
> >>>
> >>>
> >>>
> >>>--
> >>>Sean

Re: Accumulo defaults

Posted by Josh Elser <jo...@gmail.com>.
Absolutely, if you restrict a problem, you can work around it in other 
ways. Not going to argue that.

Since this is a user list though, I got very worried seeing something
that roughly says "I'm benchmarking Accumulo with the WALs off". If
you're providing resiliency against data loss using other tactics,
that's fine. I just wanted to make sure that users who read this thread
later don't think that running tests against Accumulo with the WALs off
is "normal".

Looking forward to seeing the full picture of the benchmarks!

On 5/17/14, 5:27 PM, Jeremy Kepner wrote:
> walog provides data loss protection in a specific set of circumstances.
> Most of our deployments are under a different set of circumstances.
> Accumulo is only one part of our systems and we have other
> mechanisms for protecting against the loss of data.
> We find the walog actually becomes a bottleneck in certain circumstances
> and so turning it off increases the overall reliability of our system.
>
> On Sat, May 17, 2014 at 04:27:29PM -0400, Josh Elser wrote:
>> You're likely to lose data in *any* deployment with the walogs turned off.
>>
>> And, to reiterate what Sean says, I wouldn't really consider any
>> benchmark with the walogs turned off valid except for "internal"
>> benchmarks (ones where we evaluate components only within Accumulo
>> for the sake of improving Accumulo itself and not comparing it to
>> other systems).
>>
>> On 5/17/14, 3:30 PM, Sean Busbey wrote:
>>> You can set both of those in the accumulo-site.xml.
>>>
>>> However, it's going to be difficult to use benchmarks with walogs
>>> disabled for valid comparisons to other systems. Also you are very
>>> likely to lose data in any significantly sized deployment.
>>>
>>>
>>>
>>> On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0553 - MITLL
>>> <kepner@ll.mit.edu> wrote:
>>>
>>>     As part of our Accumulo benchmarking we have decided to set certain
>>>     values as defaults for all our databases:
>>>
>>>              tserver.compaction.minor.concurrent.max=5
>>>              table.walog.enabled=false
>>>
>>>     We were wondering which file(s) we would need to modify to apply
>>>     these defaults?
>>>
>>>
>>>
>>>
>>> --
>>> Sean

Re: Accumulo defaults

Posted by Jeremy Kepner <ke...@ll.mit.edu>.
walog provides data loss protection in a specific set of circumstances.
Most of our deployments are under a different set of circumstances.
Accumulo is only one part of our systems and we have other
mechanisms for protecting against the loss of data.
We find the walog actually becomes a bottleneck in certain circumstances
and so turning it off increases the overall reliability of our system.

On Sat, May 17, 2014 at 04:27:29PM -0400, Josh Elser wrote:
> You're likely to lose data in *any* deployment with the walogs turned off.
> 
> And, to reiterate what Sean says, I wouldn't really consider any
> benchmark with the walogs turned off valid except for "internal"
> benchmarks (ones where we evaluate components only within Accumulo
> for the sake of improving Accumulo itself and not comparing it to
> other systems).
> 
> On 5/17/14, 3:30 PM, Sean Busbey wrote:
> >You can set both of those in the accumulo-site.xml.
> >
> >However, it's going to be difficult to use benchmarks with walogs
> >disabled for valid comparisons to other systems. Also you are very
> >likely to lose data in any significantly sized deployment.
> >
> >
> >
> >On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0553 - MITLL
> ><kepner@ll.mit.edu> wrote:
> >
> >    As part of our Accumulo benchmarking we have decided to set certain
> >    values as defaults for all our databases:
> >
> >             tserver.compaction.minor.concurrent.max=5
> >             table.walog.enabled=false
> >
> >    We were wondering which file(s) we would need to modify to apply
> >    these defaults?
> >
> >
> >
> >
> >--
> >Sean

Re: Accumulo defaults

Posted by Josh Elser <jo...@gmail.com>.
You're likely to lose data in *any* deployment with the walogs turned off.

And, to reiterate what Sean says, I wouldn't really consider any 
benchmark with the walogs turned off valid except for "internal" 
benchmarks (ones where we evaluate components only within Accumulo for 
the sake of improving Accumulo itself and not comparing it to other 
systems).

On 5/17/14, 3:30 PM, Sean Busbey wrote:
> You can set both of those in the accumulo-site.xml.
>
> However, it's going to be difficult to use benchmarks with walogs
> disabled for valid comparisons to other systems. Also you are very
> likely to lose data in any significantly sized deployment.
>
>
>
> On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0553 - MITLL
> <kepner@ll.mit.edu> wrote:
>
>     As part of our Accumulo benchmarking we have decided to set certain
>     values as defaults for all our databases:
>
>              tserver.compaction.minor.concurrent.max=5
>              table.walog.enabled=false
>
>     We were wondering which file(s) we would need to modify to apply
>     these defaults?
>
>
>
>
> --
> Sean

Re: Accumulo defaults

Posted by Sean Busbey <bu...@cloudera.com>.
You can set both of those in the accumulo-site.xml.

However, it's going to be difficult to use benchmarks with walogs disabled
for valid comparisons to other systems. Also you are very likely to lose
data in any significantly sized deployment.



On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0553 - MITLL <
kepner@ll.mit.edu> wrote:

> As part of our Accumulo benchmarking we have decided to set certain values
> as defaults for all our databases:
>
>         tserver.compaction.minor.concurrent.max=5
>         table.walog.enabled=false
>
> We were wondering which file(s) we would need to modify to apply these
> defaults?
>
>


-- 
Sean