Posted to user@accumulo.apache.org by Massimilian Mattetti <MA...@il.ibm.com> on 2016/11/22 15:36:50 UTC

how to reduce re-seeking rate?

Hi all, 

I developed an iterator that pre-loads a set of results every time it 
jumps to a new row. (By the way, I am using log4j2 as the logging library, 
but it cannot locate the log4j2.xml that is inside the iterator jar. 
Should I put it in a different place?) Checking the logs, I noticed that 
Accumulo re-seeks after each key returned by my iterator. This is killing 
the performance of my system. Is there a way to reduce the rate at which 
Accumulo kills and re-seeks the iterators?
Thanks

Regards,

Massimiliano Mattetti

Re: how to reduce re-seeking rate?

Posted by Massimilian Mattetti <MA...@il.ibm.com>.
I am not using the HDFS class loader, so I need to look elsewhere to find 
out what the problem with log4j2 is.

I increased table.scan.max.memory to 1MB and it worked. I do not have 
large Key-Values, but I am using a "kind of document-partitioned table" 
in which my iterator pre-loads a SortedSet of Keys from the index section 
of the row and then uses them to go over the data section one key-value 
pair at a time. To achieve good performance, I need my iterator to 
recalculate the set of Keys from the index as few times as possible.
Thanks.

Regards,
Max
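
To illustrate why each re-seek is expensive for the design described 
above, here is a minimal plain-Java sketch. It is not Accumulo code: the 
class and method names are hypothetical, strings stand in for Keys, and 
it assumes that a re-created iterator resumes strictly after the last key 
already returned. The point is that every resume pays for a full reload 
of the index before it can continue.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Plain-Java stand-in for the iterator described above; strings play the role of Keys. */
class IndexResumeSketch {

    /** Counts how often the (assumed expensive) index section was loaded. */
    static int indexLoads = 0;

    /** Hypothetical expensive step: rebuild the sorted key set from the row's index section. */
    static NavigableSet<String> loadIndex(List<String> indexSection) {
        indexLoads++;
        return new TreeSet<>(indexSection);
    }

    /**
     * What a freshly re-created iterator has to do after a re-seek: reload the
     * index, then continue strictly after the last key already returned.
     * A null lastReturned means the scan starts at the beginning of the row.
     */
    static NavigableSet<String> resume(List<String> indexSection, String lastReturned) {
        NavigableSet<String> keys = loadIndex(indexSection);
        return lastReturned == null ? keys : keys.tailSet(lastReturned, false);
    }

    public static void main(String[] args) {
        List<String> index = Arrays.asList("doc1", "doc2", "doc3", "doc4");
        System.out.println(resume(index, null));    // first seek into the row
        System.out.println(resume(index, "doc2"));  // re-seek after "doc2" was returned
        System.out.println("index loads: " + indexLoads);
    }
}
```

In this sketch the index is loaded once per seek, so a re-seek after 
every returned key means one index load per key.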



From:   Josh Elser <jo...@gmail.com>
To:     user@accumulo.apache.org
Date:   22/11/2016 20:39
Subject:        Re: how to reduce re-seeking rate?



There isn't any funny classloading happening in the normal case, so 
having the log4j2.xml file in your jar should be sufficient. The caveat 
is the HDFS classloader, but that's something you would have had to 
enable by hand.

I think the scan max memory that Dave pointed out is the only knob for 
this one. I don't think we have any other policy that governs the 
lifecycle of iterators. The framework does not expect re-instantiating 
an iterator after a batch of results to be costly.

Making a guess: are you returning very large Key-Values?

dlmarion@comcast.net wrote:
> In one case, the tserver will send data back to the client when it fills
> its buffer. When this happens, it's possible that the iterator could be
> torn down and re-seeked to the last key returned. You could increase the
> size of this buffer to see if that helps
> (http://accumulo.apache.org/1.8/accumulo_user_manual.html#_table_scan_max_memory)

Re: how to reduce re-seeking rate?

Posted by Josh Elser <jo...@gmail.com>.
There isn't any funny classloading happening in the normal case, so 
having the log4j2.xml file in your jar should be sufficient. The caveat 
is the HDFS classloader, but that's something you would have had to 
enable by hand.

I think the scan max memory that Dave pointed out is the only knob for 
this one. I don't think we have any other policy that governs the 
lifecycle of iterators. The framework does not expect re-instantiating 
an iterator after a batch of results to be costly.

Making a guess: are you returning very large Key-Values?

dlmarion@comcast.net wrote:
> In one case, the tserver will send data back to the client when it fills
> its buffer. When this happens, it's possible that the iterator could be
> torn down and re-seeked to the last key returned. You could increase the
> size of this buffer to see if that helps
> (http://accumulo.apache.org/1.8/accumulo_user_manual.html#_table_scan_max_memory)

RE: how to reduce re-seeking rate?

Posted by dl...@comcast.net.
In one case, the tserver will send data back to the client when it fills its
buffer. When this happens, it's possible that the iterator could be torn
down and re-seeked to the last key returned. You could increase the size of
this buffer to see if that helps
(http://accumulo.apache.org/1.8/accumulo_user_manual.html#_table_scan_max_memory).
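
The effect of the buffer size can be illustrated with a small plain-Java
model. It is a simplified sketch, not Accumulo's actual code: the class and
method names are made up, entries are represented only by their sizes in
bytes, and a batch is assumed to be sent as soon as the buffered bytes reach
the limit. Each batch boundary is a point where the iterator might be torn
down and re-seeked.

```java
import java.util.Collections;
import java.util.List;

/** Simplified model of a scan buffer; not Accumulo's actual implementation. */
class ScanBufferModel {

    /**
     * Counts how many batches go to the client when entries are buffered
     * until their total size reaches bufferBytes. A final partial batch
     * is counted as well.
     */
    static int countBatches(List<Integer> entrySizes, long bufferBytes) {
        int batches = 0;
        long used = 0;
        for (int size : entrySizes) {
            used += size;
            if (used >= bufferBytes) {
                batches++;   // buffer is full: send it and start a new one
                used = 0;
            }
        }
        if (used > 0) {
            batches++;       // whatever is left goes out as a last batch
        }
        return batches;
    }

    public static void main(String[] args) {
        // Example values only: 1000 entries of 1 KB each.
        List<Integer> sizes = Collections.nCopies(1000, 1024);
        System.out.println(countBatches(sizes, 512 * 1024));   // smaller buffer
        System.out.println(countBatches(sizes, 1024 * 1024));  // 1 MB buffer
    }
}
```

With these example values the model sends two batches for the 512 KB buffer
and one batch for the 1 MB buffer, so a larger buffer leaves fewer points at
which a re-seek can happen.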

 

From: Massimilian Mattetti [mailto:MASSIMIL@il.ibm.com] 
Sent: Tuesday, November 22, 2016 10:37 AM
To: user@accumulo.apache.org
Subject: how to reduce re-seeking rate?

 

Hi all, 

I developed an iterator that pre-loads a set of results every time it jumps
to a new row. (By the way, I am using log4j2 as the logging library, but it
cannot locate the log4j2.xml that is inside the iterator jar. Should I put
it in a different place?) Checking the logs, I noticed that Accumulo
re-seeks after each key returned by my iterator. This is killing the
performance of my system. Is there a way to reduce the rate at which
Accumulo kills and re-seeks the iterators?
Thanks


Regards,

Massimiliano Mattetti