Posted to user@accumulo.apache.org by "Kepner, Jeremy - 1010 - MITLL" <ke...@ll.mit.edu> on 2013/01/02 03:19:22 UTC

ingest performance oscillations and Xceivers

Accumulo Colleagues,
  I am trying to optimize my ingest into a single node Accumulo instance running on a 32 core node with 96 GB of RAM.  I am seeing the following ingest variations as I change the number of ingest processes (see attached):

-------------------------------------
Ingestors, Ingest rate
-------------------------------------
1, 60K inserts/sec (stable)
2, 120K inserts/sec (stable)
3, 60K to 180K inserts/sec
4, 90K to 220K inserts/sec
8, 80K to 280K inserts/sec
12, 80K to 280K inserts/sec
-------------------------------------
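
To put the table in perspective, the sketch below compares each row against perfect linear scaling from the stable single-ingestor rate of 60K inserts/sec. The rates are copied from the table; the function name and the choice of 60K as the per-ingestor baseline are assumptions made only for illustration.

```python
# Ingest rates copied from the table above, in thousands of inserts/sec.
# Each entry is (ingestors, low, high); stable rows have low == high.
OBSERVED = [
    (1, 60, 60),
    (2, 120, 120),
    (3, 60, 180),
    (4, 90, 220),
    (8, 80, 280),
    (12, 80, 280),
]


def scaling_report(rows, per_ingestor_rate=60):
    """Return (ingestors, ideal, low_fraction, high_fraction) for each row.

    `ideal` assumes every ingestor sustains `per_ingestor_rate` (hypothetical
    baseline taken from the 1-ingestor row).
    """
    report = []
    for n, low, high in rows:
        ideal = n * per_ingestor_rate  # perfect linear scaling
        report.append((n, ideal, low / ideal, high / ideal))
    return report


if __name__ == "__main__":
    for n, ideal, lo, hi in scaling_report(OBSERVED):
        print(f"{n:2d} ingestors: ideal {ideal}K, observed {lo:.0%} to {hi:.0%} of ideal")
```

Read this way, the peaks track linear scaling up to 3 ingestors and flatten beyond that, while the troughs fall back to roughly the rate of one or two ingestors.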

The only thing I can see that correlates with the ingest rate is the number of Xceivers.  When the ingest rate is high the number of Xceivers is usually low.  Likewise, when the ingest rate drops, the number of Xceivers usually increases significantly.
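
One way to check this relationship numerically is a Pearson correlation over paired samples of ingest rate and Xceiver count. This is a minimal sketch: the sample values below are made up for illustration and are not measurements from this system.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical samples taken at the same instants (NOT real data):
ingest_k_per_sec = [180, 170, 60, 70, 175]
xceiver_count = [10, 12, 40, 38, 11]

if __name__ == "__main__":
    r = pearson(ingest_k_per_sec, xceiver_count)
    print(f"correlation: {r:.2f}")  # a value near -1 means anti-correlated
```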

Question: What role do Xceivers play in ingest?

Request: It would be great to add a plot showing the number of Xceivers over time to the diagnostics.
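
Until such a plot exists, the count could be sampled externally. The sketch below assumes the DataNode publishes an `XceiverCount` attribute in the JSON served by its `/jmx` HTTP endpoint; whether that attribute is present, and under which bean name, depends on the Hadoop version, so both are assumptions here. The sample payload is hand-written for illustration, not captured from a real DataNode.

```python
import json


def xceiver_counts(jmx_json_text):
    """Map bean name -> XceiverCount for every bean that reports one.

    Scans all beans rather than relying on a specific bean name, because
    the name is an assumption that may differ between Hadoop versions.
    """
    payload = json.loads(jmx_json_text)
    counts = {}
    for bean in payload.get("beans", []):
        if "XceiverCount" in bean:
            counts[bean.get("name", "<unnamed>")] = int(bean["XceiverCount"])
    return counts


# Hypothetical payload in the {"beans": [...]} shape (illustration only).
SAMPLE = """{"beans": [
  {"name": "Hadoop:service=DataNode,name=DataNodeInfo", "XceiverCount": 17},
  {"name": "java.lang:type=Threading", "ThreadCount": 85}
]}"""

if __name__ == "__main__":
    print(xceiver_counts(SAMPLE))
```

Calling this on a payload fetched every few seconds and appending the result with a timestamp would give a series that can be plotted next to the ingest rate.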

Regards.  -Jeremy


Re: ingest performance oscillations and Xceivers

Posted by Eric Newton <er...@gmail.com>.
You may be seeing some impact due to ACCUMULO-893:

http://issues.apache.org/jira/browse/ACCUMULO-893

Are you seeing 2-minute hold times popping up?

-Eric


On Wed, Jan 2, 2013 at 7:25 PM, Kepner, Jeremy - 0553 - MITLL <
kepner@ll.mit.edu> wrote:

> Hmmm, that's interesting, because in the past I didn't see this behavior.
>  It might be worth having someone look into because it seems to have a 2x
> impact on sustained ingest.
>
> Regards.  -Jeremy
>
> On Jan 2, 2013, at 2:23 PM, Keith Turner wrote:
>
> > On Wed, Jan 2, 2013 at 2:11 PM, Jeremy Kepner <ke...@ll.mit.edu> wrote:
> >> So what mechanism causes the number of Xceivers to increase?
> >
> > Its been a while since I looked at the data node source code.   When I
> > last look at it an Xceiver was just a thread created to handle a
> > datanode request.   The thread went away after the request was
> > processed.   So major and minor compactions running would cause more
> > Xceivers to be created to read and write data.
> >
> > Newer datanode code may use a thread pool instead of creating a
> > thread/xceiver for each request.   I am not sure.
> >
> >> I am carefully controlling the number of ingestors and the data isn't
> varying too much.
> >> I would expect the number of Xceivers to remain consant.
> >>
> >> Regards.  -Jeremy
> >>
> >> On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
> >>> Hey Jeremy,
> >>>
> >>> Can you compare the ingest rate to the number of tablets, too?
> >>>
> >>> I've found, that if I have 20-80 tablets per server (on similar
> hardware) I
> >>> get the best performance.
> >>>
> >>> # of Xceivers == number of writers when ingest is the primary target.
> >>>
> >>> Also, is this 1.4 or trunk?
> >>>
> >>> -Eric
> >>>
> >>>
> >>>
> >>> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
> >>> kepner@ll.mit.edu> wrote:
> >>>
> >>>> Accumulo Colleagues,
> >>>>  I am trying to optimize my ingest into a single node Accumulo
> instance
> >>>> running on a 32 core node with 96 GB of RAM.  I am seeing the follow
> ingest
> >>>> variations as a I change the number of ingest processes (see
> attached):
> >>>>
> >>>> -------------------------------------
> >>>> Ingestors, Ingest rate
> >>>> -------------------------------------
> >>>> 1, 60K inserts/sec (stable)
> >>>> 2, 120K inserts/sec (stable)
> >>>> 3, 60K to 180K inserts/sec
> >>>> 4, 90K to 220K inserts/sec
> >>>> 8, 80K to 280K inserts/sec
> >>>> 12, 80K to 280K inserts/sec
> >>>> -------------------------------------
> >>>>
> >>>> The only thing I can see that correlates with the ingest rate is the
> >>>> number of Xceivers.  When the ingest rate is high the number of
> Xceivers is
> >>>> usually low.  Likewise, when the ingest rate drops, the number of
> Xceivers
> >>>> usually increases significantly.
> >>>>
> >>>> Question: What role to Xceivers play in ingest?
> >>>>
> >>>> Request: It would be great to add a plot showing the number of
> Xceivers
> >>>> over time to the diagnostics.
> >>>>
> >>>> Regards.  -Jeremy
> >>>>
> >>>>
>
>

Re: ingest performance oscillations and Xceivers

Posted by "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>.
Thanks.  I didn't realize this. I have created an issue and attached the performance plots to the issue:

https://issues.apache.org/jira/browse/ACCUMULO-931

On Jan 3, 2013, at 8:26 PM, Drew Farris wrote:

Yep, most apache lists strip attachments.

On Thursday, January 3, 2013, Kepner, Jeremy - 0553 - MITLL wrote:
Here it is again.  It got sent the last time. Does the e-mail list strip out attachments?

On Jan 3, 2013, at 7:47 PM, Josh Elser wrote:

> I think you missed your attachment :)
>


Re: ingest performance oscillations and Xceivers

Posted by Drew Farris <dr...@apache.org>.
Yep, most apache lists strip attachments.

On Thursday, January 3, 2013, Kepner, Jeremy - 0553 - MITLL wrote:

> Here it is again.  It got sent the last time. Does the e-mail list strip
> out attachments?
>
> On Jan 3, 2013, at 7:47 PM, Josh Elser wrote:
>
> > I think you missed your attachment :)
> >
>

Re: ingest performance oscillations and Xceivers

Posted by Josh Elser <jo...@gmail.com>.
Actually, I'm apparently blind. Just kidding, didn't receive. Perhaps 
over a certain size? I've seen some before. You could open something on 
Jira and attach it there.

On 01/03/2013 08:10 PM, Josh Elser wrote:
> I don't think it should/does. I got it this time.
>
> On 01/03/2013 07:55 PM, Kepner, Jeremy - 0553 - MITLL wrote:
>> Here it is again.  It got sent the last time. Does the e-mail list 
>> strip out attachments?
>>

Re: ingest performance oscillations and Xceivers

Posted by Josh Elser <jo...@gmail.com>.
I don't think it should/does. I got it this time.

On 01/03/2013 07:55 PM, Kepner, Jeremy - 0553 - MITLL wrote:
> Here it is again.  It got sent the last time. Does the e-mail list strip out attachments?
>

Re: ingest performance oscillations and Xceivers

Posted by John Vines <jv...@gmail.com>.
I think so, because I got it on the first try when I was directly listed

Sent from my phone, please pardon the typos and brevity.
On Jan 3, 2013 8:06 PM, "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>
wrote:

> Here it is again.  It got sent the last time. Does the e-mail list strip
> out attachments?
>
> On Jan 3, 2013, at 7:47 PM, Josh Elser wrote:
>
> > I think you missed your attachment :)
> >
> > On 01/03/2013 07:31 PM, Kepner, Jeremy - 0553 - MITLL wrote:
> >> Attached is the log for a set for 3 ingests.  The first hump is with 1
> ingestor, the second hump is with 2 ingestors, and the third hump is with 3
> ingestors.  The 3 ingestor case starts oscillating about 21:40. I don't see
> any spikes in any of the fields.  It would be nice if there was also a plot
> of Xceivers since I think that anti-correlates.  I do sometimes see the
> hold times highlighted in red.  The values are usually in hundreds of msec.
> >>
> >> On Jan 3, 2013, at 6:04 PM, John Vines wrote:
> >>
> >>> How many hard drives and what's your max minor/maor compactions set to?
> >>> These can severely limit your performance due to potential disk
> thrashing.
> >>> If you observe the monitor, when ingest starts trailing, do you see it
> >>> undergoing, or possibly being backed up on, any form of compaction? And
> >>> lastly, are you see the bug eric mentioned above? See if the monitor is
> >>> reporting any warnings like those mentioned.
>
>

Re: ingest performance oscillations and Xceivers

Posted by "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>.
Here it is again.  It got sent the last time. Does the e-mail list strip out attachments?

On Jan 3, 2013, at 7:47 PM, Josh Elser wrote:

> I think you missed your attachment :)
> 
> On 01/03/2013 07:31 PM, Kepner, Jeremy - 0553 - MITLL wrote:
>> Attached is the log for a set for 3 ingests.  The first hump is with 1 ingestor, the second hump is with 2 ingestors, and the third hump is with 3 ingestors.  The 3 ingestor case starts oscillating about 21:40. I don't see any spikes in any of the fields.  It would be nice if there was also a plot of Xceivers since I think that anti-correlates.  I do sometimes see the hold times highlighted in red.  The values are usually in hundreds of msec.
>> 
>> On Jan 3, 2013, at 6:04 PM, John Vines wrote:
>> 
>>> How many hard drives and what's your max minor/maor compactions set to?
>>> These can severely limit your performance due to potential disk thrashing.
>>> If you observe the monitor, when ingest starts trailing, do you see it
>>> undergoing, or possibly being backed up on, any form of compaction? And
>>> lastly, are you see the bug eric mentioned above? See if the monitor is
>>> reporting any warnings like those mentioned.


Re: ingest performance oscillations and Xceivers

Posted by Josh Elser <jo...@gmail.com>.
I think you missed your attachment :)

On 01/03/2013 07:31 PM, Kepner, Jeremy - 0553 - MITLL wrote:
> Attached is the log for a set for 3 ingests.  The first hump is with 1 ingestor, the second hump is with 2 ingestors, and the third hump is with 3 ingestors.  The 3 ingestor case starts oscillating about 21:40. I don't see any spikes in any of the fields.  It would be nice if there was also a plot of Xceivers since I think that anti-correlates.  I do sometimes see the hold times highlighted in red.  The values are usually in hundreds of msec.
>
> On Jan 3, 2013, at 6:04 PM, John Vines wrote:
>
>> How many hard drives and what's your max minor/maor compactions set to?
>> These can severely limit your performance due to potential disk thrashing.
>> If you observe the monitor, when ingest starts trailing, do you see it
>> undergoing, or possibly being backed up on, any form of compaction? And
>> lastly, are you see the bug eric mentioned above? See if the monitor is
>> reporting any warnings like those mentioned.

Re: ingest performance oscillations and Xceivers

Posted by "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>.
Attached is the log for a set of 3 ingests.  The first hump is with 1 ingestor, the second hump is with 2 ingestors, and the third hump is with 3 ingestors.  The 3 ingestor case starts oscillating at about 21:40. I don't see any spikes in any of the fields.  It would be nice if there was also a plot of Xceivers since I think that anti-correlates.  I do sometimes see the hold times highlighted in red.  The values are usually in the hundreds of msec.

On Jan 3, 2013, at 6:04 PM, John Vines wrote:

> How many hard drives and what's your max minor/maor compactions set to?
> These can severely limit your performance due to potential disk thrashing.
> If you observe the monitor, when ingest starts trailing, do you see it
> undergoing, or possibly being backed up on, any form of compaction? And
> lastly, are you see the bug eric mentioned above? See if the monitor is
> reporting any warnings like those mentioned.


Re: ingest performance oscillations and Xceivers

Posted by John Vines <vi...@apache.org>.
How many hard drives do you have, and what are your max minor/major
compactions set to? These can severely limit your performance due to
potential disk thrashing. If you observe the monitor when ingest starts
trailing, do you see it undergoing, or possibly being backed up on, any
form of compaction? And lastly, are you seeing the bug Eric mentioned
above? See if the monitor is reporting any warnings like those mentioned.




On Thu, Jan 3, 2013 at 5:25 PM, Jeremy Kepner <ke...@ll.mit.edu> wrote:

> No correlation with compactions.  No queries.
>
> On Thu, Jan 03, 2013 at 11:24:17AM -0500, William Slacum wrote:
> > Have you also been tracking compactions? Did you have a query load?
> >
> >
> > On Wed, Jan 2, 2013 at 7:25 PM, Kepner, Jeremy - 0553 - MITLL <
> > kepner@ll.mit.edu> wrote:
> >
> > > Hmmm, that's interesting, because in the past I didn't see this
> behavior.
> > >  It might be worth having someone look into because it seems to have a
> 2x
> > > impact on sustained ingest.
> > >
> > > Regards.  -Jeremy
> > >
> > > On Jan 2, 2013, at 2:23 PM, Keith Turner wrote:
> > >
> > > > On Wed, Jan 2, 2013 at 2:11 PM, Jeremy Kepner <ke...@ll.mit.edu>
> wrote:
> > > >> So what mechanism causes the number of Xceivers to increase?
> > > >
> > > > Its been a while since I looked at the data node source code.   When
> I
> > > > last look at it an Xceiver was just a thread created to handle a
> > > > datanode request.   The thread went away after the request was
> > > > processed.   So major and minor compactions running would cause more
> > > > Xceivers to be created to read and write data.
> > > >
> > > > Newer datanode code may use a thread pool instead of creating a
> > > > thread/xceiver for each request.   I am not sure.
> > > >
> > > >> I am carefully controlling the number of ingestors and the data
> isn't
> > > varying too much.
> > > >> I would expect the number of Xceivers to remain consant.
> > > >>
> > > >> Regards.  -Jeremy
> > > >>
> > > >> On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
> > > >>> Hey Jeremy,
> > > >>>
> > > >>> Can you compare the ingest rate to the number of tablets, too?
> > > >>>
> > > >>> I've found, that if I have 20-80 tablets per server (on similar
> > > hardware) I
> > > >>> get the best performance.
> > > >>>
> > > >>> # of Xceivers == number of writers when ingest is the primary
> target.
> > > >>>
> > > >>> Also, is this 1.4 or trunk?
> > > >>>
> > > >>> -Eric
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
> > > >>> kepner@ll.mit.edu> wrote:
> > > >>>
> > > >>>> Accumulo Colleagues,
> > > >>>>  I am trying to optimize my ingest into a single node Accumulo
> > > instance
> > > >>>> running on a 32 core node with 96 GB of RAM.  I am seeing the
> follow
> > > ingest
> > > >>>> variations as a I change the number of ingest processes (see
> > > attached):
> > > >>>>
> > > >>>> -------------------------------------
> > > >>>> Ingestors, Ingest rate
> > > >>>> -------------------------------------
> > > >>>> 1, 60K inserts/sec (stable)
> > > >>>> 2, 120K inserts/sec (stable)
> > > >>>> 3, 60K to 180K inserts/sec
> > > >>>> 4, 90K to 220K inserts/sec
> > > >>>> 8, 80K to 280K inserts/sec
> > > >>>> 12, 80K to 280K inserts/sec
> > > >>>> -------------------------------------
> > > >>>>
> > > >>>> The only thing I can see that correlates with the ingest rate is
> the
> > > >>>> number of Xceivers.  When the ingest rate is high the number of
> > > Xceivers is
> > > >>>> usually low.  Likewise, when the ingest rate drops, the number of
> > > Xceivers
> > > >>>> usually increases significantly.
> > > >>>>
> > > >>>> Question: What role to Xceivers play in ingest?
> > > >>>>
> > > >>>> Request: It would be great to add a plot showing the number of
> > > Xceivers
> > > >>>> over time to the diagnostics.
> > > >>>>
> > > >>>> Regards.  -Jeremy
> > > >>>>
> > > >>>>
> > >
> > >
>

Re: ingest performance oscillations and Xceivers

Posted by Jeremy Kepner <ke...@ll.mit.edu>.
No correlation with compactions.  No queries.

On Thu, Jan 03, 2013 at 11:24:17AM -0500, William Slacum wrote:
> Have you also been tracking compactions? Did you have a query load?
> 
> 
> On Wed, Jan 2, 2013 at 7:25 PM, Kepner, Jeremy - 0553 - MITLL <
> kepner@ll.mit.edu> wrote:
> 
> > Hmmm, that's interesting, because in the past I didn't see this behavior.
> >  It might be worth having someone look into because it seems to have a 2x
> > impact on sustained ingest.
> >
> > Regards.  -Jeremy
> >
> > On Jan 2, 2013, at 2:23 PM, Keith Turner wrote:
> >
> > > On Wed, Jan 2, 2013 at 2:11 PM, Jeremy Kepner <ke...@ll.mit.edu> wrote:
> > >> So what mechanism causes the number of Xceivers to increase?
> > >
> > > Its been a while since I looked at the data node source code.   When I
> > > last look at it an Xceiver was just a thread created to handle a
> > > datanode request.   The thread went away after the request was
> > > processed.   So major and minor compactions running would cause more
> > > Xceivers to be created to read and write data.
> > >
> > > Newer datanode code may use a thread pool instead of creating a
> > > thread/xceiver for each request.   I am not sure.
> > >
> > >> I am carefully controlling the number of ingestors and the data isn't
> > varying too much.
> > >> I would expect the number of Xceivers to remain consant.
> > >>
> > >> Regards.  -Jeremy
> > >>
> > >> On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
> > >>> Hey Jeremy,
> > >>>
> > >>> Can you compare the ingest rate to the number of tablets, too?
> > >>>
> > >>> I've found, that if I have 20-80 tablets per server (on similar
> > hardware) I
> > >>> get the best performance.
> > >>>
> > >>> # of Xceivers == number of writers when ingest is the primary target.
> > >>>
> > >>> Also, is this 1.4 or trunk?
> > >>>
> > >>> -Eric
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
> > >>> kepner@ll.mit.edu> wrote:
> > >>>
> > >>>> Accumulo Colleagues,
> > >>>>  I am trying to optimize my ingest into a single node Accumulo
> > instance
> > >>>> running on a 32 core node with 96 GB of RAM.  I am seeing the follow
> > ingest
> > >>>> variations as a I change the number of ingest processes (see
> > attached):
> > >>>>
> > >>>> -------------------------------------
> > >>>> Ingestors, Ingest rate
> > >>>> -------------------------------------
> > >>>> 1, 60K inserts/sec (stable)
> > >>>> 2, 120K inserts/sec (stable)
> > >>>> 3, 60K to 180K inserts/sec
> > >>>> 4, 90K to 220K inserts/sec
> > >>>> 8, 80K to 280K inserts/sec
> > >>>> 12, 80K to 280K inserts/sec
> > >>>> -------------------------------------
> > >>>>
> > >>>> The only thing I can see that correlates with the ingest rate is the
> > >>>> number of Xceivers.  When the ingest rate is high the number of
> > Xceivers is
> > >>>> usually low.  Likewise, when the ingest rate drops, the number of
> > Xceivers
> > >>>> usually increases significantly.
> > >>>>
> > >>>> Question: What role to Xceivers play in ingest?
> > >>>>
> > >>>> Request: It would be great to add a plot showing the number of
> > Xceivers
> > >>>> over time to the diagnostics.
> > >>>>
> > >>>> Regards.  -Jeremy
> > >>>>
> > >>>>
> >
> >

Re: ingest performance oscillations and Xceivers

Posted by William Slacum <wi...@accumulo.net>.
Have you also been tracking compactions? Did you have a query load?


On Wed, Jan 2, 2013 at 7:25 PM, Kepner, Jeremy - 0553 - MITLL <
kepner@ll.mit.edu> wrote:

> Hmmm, that's interesting, because in the past I didn't see this behavior.
>  It might be worth having someone look into because it seems to have a 2x
> impact on sustained ingest.
>
> Regards.  -Jeremy
>
> On Jan 2, 2013, at 2:23 PM, Keith Turner wrote:
>
> > On Wed, Jan 2, 2013 at 2:11 PM, Jeremy Kepner <ke...@ll.mit.edu> wrote:
> >> So what mechanism causes the number of Xceivers to increase?
> >
> > Its been a while since I looked at the data node source code.   When I
> > last look at it an Xceiver was just a thread created to handle a
> > datanode request.   The thread went away after the request was
> > processed.   So major and minor compactions running would cause more
> > Xceivers to be created to read and write data.
> >
> > Newer datanode code may use a thread pool instead of creating a
> > thread/xceiver for each request.   I am not sure.
> >
> >> I am carefully controlling the number of ingestors and the data isn't
> varying too much.
> >> I would expect the number of Xceivers to remain consant.
> >>
> >> Regards.  -Jeremy
> >>
> >> On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
> >>> Hey Jeremy,
> >>>
> >>> Can you compare the ingest rate to the number of tablets, too?
> >>>
> >>> I've found, that if I have 20-80 tablets per server (on similar
> hardware) I
> >>> get the best performance.
> >>>
> >>> # of Xceivers == number of writers when ingest is the primary target.
> >>>
> >>> Also, is this 1.4 or trunk?
> >>>
> >>> -Eric
> >>>
> >>>
> >>>
> >>> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
> >>> kepner@ll.mit.edu> wrote:
> >>>
> >>>> Accumulo Colleagues,
> >>>>  I am trying to optimize my ingest into a single node Accumulo
> instance
> >>>> running on a 32 core node with 96 GB of RAM.  I am seeing the follow
> ingest
> >>>> variations as a I change the number of ingest processes (see
> attached):
> >>>>
> >>>> -------------------------------------
> >>>> Ingestors, Ingest rate
> >>>> -------------------------------------
> >>>> 1, 60K inserts/sec (stable)
> >>>> 2, 120K inserts/sec (stable)
> >>>> 3, 60K to 180K inserts/sec
> >>>> 4, 90K to 220K inserts/sec
> >>>> 8, 80K to 280K inserts/sec
> >>>> 12, 80K to 280K inserts/sec
> >>>> -------------------------------------
> >>>>
> >>>> The only thing I can see that correlates with the ingest rate is the
> >>>> number of Xceivers.  When the ingest rate is high the number of
> Xceivers is
> >>>> usually low.  Likewise, when the ingest rate drops, the number of
> Xceivers
> >>>> usually increases significantly.
> >>>>
> >>>> Question: What role to Xceivers play in ingest?
> >>>>
> >>>> Request: It would be great to add a plot showing the number of
> Xceivers
> >>>> over time to the diagnostics.
> >>>>
> >>>> Regards.  -Jeremy
> >>>>
> >>>>
>
>

Re: ingest performance oscillations and Xceivers

Posted by "Kepner, Jeremy - 0553 - MITLL" <ke...@ll.mit.edu>.
Hmmm, that's interesting, because in the past I didn't see this behavior.  It might be worth having someone look into it because it seems to have a 2x impact on sustained ingest.

Regards.  -Jeremy

On Jan 2, 2013, at 2:23 PM, Keith Turner wrote:

> On Wed, Jan 2, 2013 at 2:11 PM, Jeremy Kepner <ke...@ll.mit.edu> wrote:
>> So what mechanism causes the number of Xceivers to increase?
> 
> Its been a while since I looked at the data node source code.   When I
> last look at it an Xceiver was just a thread created to handle a
> datanode request.   The thread went away after the request was
> processed.   So major and minor compactions running would cause more
> Xceivers to be created to read and write data.
> 
> Newer datanode code may use a thread pool instead of creating a
> thread/xceiver for each request.   I am not sure.
> 
>> I am carefully controlling the number of ingestors and the data isn't varying too much.
>> I would expect the number of Xceivers to remain consant.
>> 
>> Regards.  -Jeremy
>> 
>> On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
>>> Hey Jeremy,
>>> 
>>> Can you compare the ingest rate to the number of tablets, too?
>>> 
>>> I've found, that if I have 20-80 tablets per server (on similar hardware) I
>>> get the best performance.
>>> 
>>> # of Xceivers == number of writers when ingest is the primary target.
>>> 
>>> Also, is this 1.4 or trunk?
>>> 
>>> -Eric
>>> 
>>> 
>>> 
>>> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
>>> kepner@ll.mit.edu> wrote:
>>> 
>>>> Accumulo Colleagues,
>>>>  I am trying to optimize my ingest into a single node Accumulo instance
>>>> running on a 32 core node with 96 GB of RAM.  I am seeing the follow ingest
>>>> variations as a I change the number of ingest processes (see attached):
>>>> 
>>>> -------------------------------------
>>>> Ingestors, Ingest rate
>>>> -------------------------------------
>>>> 1, 60K inserts/sec (stable)
>>>> 2, 120K inserts/sec (stable)
>>>> 3, 60K to 180K inserts/sec
>>>> 4, 90K to 220K inserts/sec
>>>> 8, 80K to 280K inserts/sec
>>>> 12, 80K to 280K inserts/sec
>>>> -------------------------------------
>>>> 
>>>> The only thing I can see that correlates with the ingest rate is the
>>>> number of Xceivers.  When the ingest rate is high the number of Xceivers is
>>>> usually low.  Likewise, when the ingest rate drops, the number of Xceivers
>>>> usually increases significantly.
>>>> 
>>>> Question: What role to Xceivers play in ingest?
>>>> 
>>>> Request: It would be great to add a plot showing the number of Xceivers
>>>> over time to the diagnostics.
>>>> 
>>>> Regards.  -Jeremy
>>>> 
>>>> 


Re: ingest performance oscillations and Xceivers

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Jan 2, 2013 at 2:11 PM, Jeremy Kepner <ke...@ll.mit.edu> wrote:
> So what mechanism causes the number of Xceivers to increase?

It's been a while since I looked at the data node source code.   When I
last looked at it, an Xceiver was just a thread created to handle a
datanode request.   The thread went away after the request was
processed.   So major and minor compactions running would cause more
Xceivers to be created to read and write data.

Newer datanode code may use a thread pool instead of creating a
thread/xceiver for each request.   I am not sure.
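
As a rough model of the thread-per-request behavior described above (not the actual datanode code), the number of live Xceivers at any moment is just the number of requests in flight. The sketch below computes the peak from a list of request intervals; the interval values are hypothetical and chosen only to show how compaction reads and writes stack on top of the ingest writers.

```python
def peak_xceivers(requests):
    """Peak number of concurrent requests, one thread per request.

    `requests` is an iterable of (start, end) times with end exclusive.
    """
    events = []
    for start, end in requests:
        events.append((start, 1))   # thread created
        events.append((end, -1))    # thread exits when the request is done
    # Tuples sort by time first; at equal times -1 sorts before +1, so a
    # thread that ends at t is not counted alongside one that starts at t.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical workload: 3 ingest writers each hold one write stream for
# the whole run; a compaction reads 2 files and writes 1 during t=40..60.
writers = [(0, 100)] * 3
compaction = [(40, 60)] * 3

if __name__ == "__main__":
    print(peak_xceivers(writers))               # ingest only
    print(peak_xceivers(writers + compaction))  # ingest plus one compaction
```

Under this model a fixed number of ingestors gives a constant baseline, and any rise above it comes from extra concurrent reads and writes such as compactions.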

> I am carefully controlling the number of ingestors and the data isn't varying too much.
> I would expect the number of Xceivers to remain consant.
>
> Regards.  -Jeremy
>
> On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
>> Hey Jeremy,
>>
>> Can you compare the ingest rate to the number of tablets, too?
>>
>> I've found, that if I have 20-80 tablets per server (on similar hardware) I
>> get the best performance.
>>
>> # of Xceivers == number of writers when ingest is the primary target.
>>
>> Also, is this 1.4 or trunk?
>>
>> -Eric
>>
>>
>>
>> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
>> kepner@ll.mit.edu> wrote:
>>
>> > Accumulo Colleagues,
>> >   I am trying to optimize my ingest into a single node Accumulo instance
>> > running on a 32 core node with 96 GB of RAM.  I am seeing the follow ingest
>> > variations as a I change the number of ingest processes (see attached):
>> >
>> > -------------------------------------
>> > Ingestors, Ingest rate
>> > -------------------------------------
>> > 1, 60K inserts/sec (stable)
>> > 2, 120K inserts/sec (stable)
>> > 3, 60K to 180K inserts/sec
>> > 4, 90K to 220K inserts/sec
>> > 8, 80K to 280K inserts/sec
>> > 12, 80K to 280K inserts/sec
>> > -------------------------------------
>> >
>> > The only thing I can see that correlates with the ingest rate is the
>> > number of Xceivers.  When the ingest rate is high the number of Xceivers is
>> > usually low.  Likewise, when the ingest rate drops, the number of Xceivers
>> > usually increases significantly.
>> >
>> > Question: What role to Xceivers play in ingest?
>> >
>> > Request: It would be great to add a plot showing the number of Xceivers
>> > over time to the diagnostics.
>> >
>> > Regards.  -Jeremy
>> >
>> >

Re: ingest performance oscillations and Xceivers

Posted by Jeremy Kepner <ke...@ll.mit.edu>.
So what mechanism causes the number of Xceivers to increase?
I am carefully controlling the number of ingestors and the data isn't varying too much.
I would expect the number of Xceivers to remain constant.

Regards.  -Jeremy

On Tue, Jan 01, 2013 at 09:45:20PM -0500, Eric Newton wrote:
> Hey Jeremy,
> 
> Can you compare the ingest rate to the number of tablets, too?
> 
> I've found, that if I have 20-80 tablets per server (on similar hardware) I
> get the best performance.
> 
> # of Xceivers == number of writers when ingest is the primary target.
> 
> Also, is this 1.4 or trunk?
> 
> -Eric
> 
> 
> 
> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
> kepner@ll.mit.edu> wrote:
> 
> > Accumulo Colleagues,
> >   I am trying to optimize my ingest into a single node Accumulo instance
> > running on a 32 core node with 96 GB of RAM.  I am seeing the follow ingest
> > variations as a I change the number of ingest processes (see attached):
> >
> > -------------------------------------
> > Ingestors, Ingest rate
> > -------------------------------------
> > 1, 60K inserts/sec (stable)
> > 2, 120K inserts/sec (stable)
> > 3, 60K to 180K inserts/sec
> > 4, 90K to 220K inserts/sec
> > 8, 80K to 280K inserts/sec
> > 12, 80K to 280K inserts/sec
> > -------------------------------------
> >
> > The only thing I can see that correlates with the ingest rate is the
> > number of Xceivers.  When the ingest rate is high the number of Xceivers is
> > usually low.  Likewise, when the ingest rate drops, the number of Xceivers
> > usually increases significantly.
> >
> > Question: What role to Xceivers play in ingest?
> >
> > Request: It would be great to add a plot showing the number of Xceivers
> > over time to the diagnostics.
> >
> > Regards.  -Jeremy
> >
> >

Re: ingest performance oscillations and Xceivers

Posted by William Slacum <wi...@accumulo.net>.
How many disks do you have? Disk count could be the bottleneck on
throughput, since the number of Xceivers reflects how many resources
(threads and sockets; see
http://blog.cloudera.com/blog/2012/03/hbase-hadoop-xceivers/) are in use
at once to perform operations.
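
One way to reason about this, as a back-of-the-envelope sketch rather
than a measurement: by Little's law, the average number of requests in
flight equals the completion rate times the average time each request
takes. If the disks slow down, each request holds its Xceiver longer, so
the count can climb even while throughput falls, which would be
consistent with the pattern Jeremy describes. The numbers below are
invented purely for illustration.

```python
def in_flight(requests_per_sec, seconds_per_request):
    """Little's law: L = lambda * W (average requests in flight)."""
    return requests_per_sec * seconds_per_request


def throughput(in_flight_requests, seconds_per_request):
    """Rearranged: completion rate for a given concurrency and latency."""
    return in_flight_requests / seconds_per_request


# Hypothetical numbers, not taken from any real cluster.
fast_disks = in_flight(requests_per_sec=200, seconds_per_request=0.05)
slow_disks = in_flight(requests_per_sec=100, seconds_per_request=0.5)

print(fast_disks)  # requests in flight when the disks keep up
print(slow_disks)  # more in flight, even at half the request rate
```

Under these assumed numbers, halving the request rate while each request
takes ten times longer still leaves five times as many requests (and so
Xceivers) in flight.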

On Tue, Jan 1, 2013 at 6:45 PM, Eric Newton <er...@gmail.com> wrote:

> Hey Jeremy,
>
> Can you compare the ingest rate to the number of tablets, too?
>
> I've found, that if I have 20-80 tablets per server (on similar hardware) I
> get the best performance.
>
> # of Xceivers == number of writers when ingest is the primary target.
>
> Also, is this 1.4 or trunk?
>
> On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
> kepner@ll.mit.edu> wrote:
>
> > Accumulo Colleagues,
> >   I am trying to optimize my ingest into a single node Accumulo instance
> > running on a 32 core node with 96 GB of RAM.  I am seeing the follow ingest
> > variations as a I change the number of ingest processes (see attached):
> >
> > -------------------------------------
> > Ingestors, Ingest rate
> > -------------------------------------
> > 1, 60K inserts/sec (stable)
> > 2, 120K inserts/sec (stable)
> > 3, 60K to 180K inserts/sec
> > 4, 90K to 220K inserts/sec
> > 8, 80K to 280K inserts/sec
> > 12, 80K to 280K inserts/sec
> > -------------------------------------
> >
> > The only thing I can see that correlates with the ingest rate is the
> > number of Xceivers.  When the ingest rate is high the number of Xceivers is
> > usually low.  Likewise, when the ingest rate drops, the number of Xceivers
> > usually increases significantly.
> >
> > Question: What role to Xceivers play in ingest?
> >
> > Request: It would be great to add a plot showing the number of Xceivers
> > over time to the diagnostics.
> >
> > Regards.  -Jeremy
> >
> >
>

Re: ingest performance oscillations and Xceivers

Posted by Eric Newton <er...@gmail.com>.
Hey Jeremy,

Can you compare the ingest rate to the number of tablets, too?

I've found that if I have 20-80 tablets per server (on similar hardware), I
get the best performance.

# of Xceivers == number of writers when ingest is the primary target.

Also, is this 1.4 or trunk?

-Eric



On Tue, Jan 1, 2013 at 9:19 PM, Kepner, Jeremy - 1010 - MITLL <
kepner@ll.mit.edu> wrote:

> Accumulo Colleagues,
>   I am trying to optimize my ingest into a single node Accumulo instance
> running on a 32 core node with 96 GB of RAM.  I am seeing the follow ingest
> variations as a I change the number of ingest processes (see attached):
>
> -------------------------------------
> Ingestors, Ingest rate
> -------------------------------------
> 1, 60K inserts/sec (stable)
> 2, 120K inserts/sec (stable)
> 3, 60K to 180K inserts/sec
> 4, 90K to 220K inserts/sec
> 8, 80K to 280K inserts/sec
> 12, 80K to 280K inserts/sec
> -------------------------------------
>
> The only thing I can see that correlates with the ingest rate is the
> number of Xceivers.  When the ingest rate is high the number of Xceivers is
> usually low.  Likewise, when the ingest rate drops, the number of Xceivers
> usually increases significantly.
>
> Question: What role to Xceivers play in ingest?
>
> Request: It would be great to add a plot showing the number of Xceivers
> over time to the diagnostics.
>
> Regards.  -Jeremy
>
>
