Posted to user@accumulo.apache.org by "Cardon, Tejay E" <te...@lmco.com> on 2012/09/20 23:05:51 UTC

Failing Tablet Servers

I'm seeing some strange behavior on a moderate (30 node) cluster.  I've got 27 tablet servers on large Dell servers with 30GB of memory each.  I've set the TServer_OPTS to give them each 10G of memory.  I'm running an ingest process that uses AccumuloInputFormat in a MapReduce job to write 1,000 rows with each row containing ~1,000,000 columns in 160,000 families.  The MapReduce initially runs quite quickly and I can see the ingest rate peak on the monitor page.  However, after about 30 seconds of high ingest, the ingest falls to 0.  It then stalls out and my map tasks are eventually killed.  In the end, the map/reduce fails and I usually end up with between 3 and 7 of my Tservers dead.

Inspecting the tserver.err logs shows nothing, even on the nodes that fail.  The tserver.out log shows a Java OutOfMemoryError, and nothing else.  I've included a zip with the logs from one of the failed tservers and a second one with the logs from the master.  Other than the out of memory, I'm not seeing anything that stands out to me.

If I reduce the data size to only 100,000 columns, rather than 1,000,000, the process takes about 4 minutes and completes without incident.

Am I just ingesting too quickly?

Thanks,
Tejay Cardon

Re: EXTERNAL: Re: Failing Tablet Servers

Posted by Jim Klucar <kl...@gmail.com>.
I don't have the code here now, but if you look at the Mutation source there
are some memory numbers in there for its buffers. How many puts you can do
depends on your key/value sizes. I believe you can configure the memory
sizes.

Sent from my iPhone


RE: EXTERNAL: Re: Failing Tablet Servers

Posted by Adam Fuchs <af...@apache.org>.
My guess would be that you are building an object several gigabytes in size
and Accumulo is copying it. Do you need all of those entries to be applied
atomically (in which case you should look into bulk loading), or can you
break them up into multiple mutations? I would say you should keep your
mutations under ten megabytes or so for performance. Bigger mutations won't
speed things up past that point.
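Assuming atomicity can be given up, the split Adam suggests can be sketched like this (RowBatcher and its byte budget are illustrative, not Accumulo API; in real client code each batch would be its own Mutation handed to the writer):

```java
import java.util.function.LongConsumer;

// Illustrative sketch of splitting one huge logical row write into several
// size-bounded mutations. Each call to `emit` stands in for building one
// Mutation for the row and sending it; only the batch's put-count is passed.
public class RowBatcher {
    /**
     * Split `totalPuts` updates of roughly `bytesPerPut` bytes each into
     * batches that stay under `maxBatchBytes`; returns the batch count.
     */
    public static int writeRow(long totalPuts, long bytesPerPut,
                               long maxBatchBytes, LongConsumer emit) {
        long putsPerBatch = Math.max(1, maxBatchBytes / bytesPerPut);
        int batches = 0;
        for (long done = 0; done < totalPuts; done += putsPerBatch) {
            emit.accept(Math.min(putsPerBatch, totalPuts - done)); // one Mutation's worth
            batches++;
        }
        return batches;
    }
}
```

For example, 1,000,000 updates of ~100 bytes under a 10 MB cap go out as 10 mutations instead of one multi-gigabyte object, while the row itself stays just as wide.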

Adam

RE: EXTERNAL: Re: Failing Tablet Servers

Posted by "Cardon, Tejay E" <te...@lmco.com>.
Sorry, yes it's the AccumuloOutputFormat.  I do about 1,000,000 mutation.puts before I do a context.write.  Any idea how many is safe?

Thanks,
Tejay



Re: EXTERNAL: Re: Failing Tablet Servers

Posted by Jim Klucar <kl...@gmail.com>.
Do you mean AccumuloOutputFormat? Is the map failing or the reduce failing?
How many Mutation.put are you doing before a context.write? Too many puts
will crash the mutation object. You need to periodically call context.write
and create a new mutation object. At some point I wrote a
ContextFlushingMutation that handled this problem for you, but I'd have to
dig around for it or rewrite it.
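A sketch of that flush-periodically pattern (ChunkedRowWriter, Sink, and FLUSH_EVERY are illustrative stand-ins, not Accumulo or Hadoop API; in a real mapper, sink.write would be context.write with a fresh Mutation):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "periodically call context.write and create a new mutation":
// instead of a million put()s on one Mutation, flush a fresh batch every
// FLUSH_EVERY puts. Sink stands in for the MapReduce context.
public class ChunkedRowWriter {
    interface Sink { void write(List<String> puts); }

    static final int FLUSH_EVERY = 100_000; // tune so each mutation stays modest

    /** Write one wide row as several batches; returns how many were flushed. */
    public static int writeWideRow(int totalColumns, Sink sink) {
        List<String> current = new ArrayList<>(); // stands in for new Mutation(row)
        int flushes = 0;
        for (int col = 0; col < totalColumns; col++) {
            current.add("fam" + (col % 160_000) + ":qual" + col); // one put()
            if (current.size() == FLUSH_EVERY) {
                sink.write(current);             // context.write(table, mutation)
                current = new ArrayList<>();     // start a new Mutation for the same row
                flushes++;
            }
        }
        if (!current.isEmpty()) { sink.write(current); flushes++; }
        return flushes;
    }
}
```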


Sent from my iPhone


RE: EXTERNAL: Re: Failing Tablet Servers

Posted by "Cardon, Tejay E" <te...@lmco.com>.
John,
Thanks for the quick response.  I'm not seeing any errors in the logger logs.  I am using native maps, and I left the memory map size at 1GB.  I assume that's plenty large if I'm using native maps, right?

Thanks,
Tejay



Re: Failing Tablet Servers

Posted by John Vines <vi...@apache.org>.
Okay, so we know that you're killing servers. We know when you drop the
amount of data down, you have no issues. There are two immediate issues
that come to mind-
1. You modified tservers opts to give them 10G of memory. Did you up the
memory map size in accumulo-site.xml to make those larger, or did you leave
those alone? Or did you up them to match the 10G? If you upped them and
aren't using the native maps, that would be problematic as you need space
for other purposes as well.

2. You seem to be making giant rows. Depending on your Key/Value size, it's
possible for you to write a row that you cannot send (especially if using a
WholeRowIterator) that can cause a cascading error when doing log recovery.
Are you seeing any sort of errors in your loggers' logs?

John


RE: EXTERNAL: Re: Failing Tablet Servers

Posted by "Cardon, Tejay E" <te...@lmco.com>.
Alright.  So I'm changing it to:

1. Moderate-size mutations: ~1,000 key/values per mutation
2. TSERVER_OPTS = 5g
3. memory.maps = 3g
4. Swappiness = 0 (right now I'm at 20)

It sounds like those are all settings I should fix anyway, so we'll do them all.  I'll report back if that doesn't fix the problem.
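For concreteness, those settings map to something like the following (a hypothetical sketch against 1.4-era file locations and property names; verify against the docs for your install):

```shell
# conf/accumulo-env.sh -- give tablet servers a 5 GB heap
export ACCUMULO_TSERVER_OPTS="-Xmx5g -Xms5g"

# conf/accumulo-site.xml -- 3 GB in-memory map (native maps live off-heap)
#   <property>
#     <name>tserver.memory.maps.max</name>
#     <value>3G</value>
#   </property>

# Turn off swapping now, and persist it across reboots
sudo sysctl vm.swappiness=0
echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf
```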

Thanks again for all the help

Tejay Cardon
On Fri, Sep 21, 2012 at 10:12 AM, Cardon, Tejay E <te...@lmco.com> wrote:
Jim, Eric, and Adam,
Thanks.  It sounds like you're all saying the same thing.  Originally I was doing each key/value as its own mutation, and it was blowing up much faster (probably due to the volume/overhead of the mutation objects themselves).  I'll try refactoring to break them up into something in-between.  My keys are small (<25 Bytes), and my values are empty, but I'll aim for ~1,000 key/values per mutation and see how that works out for me.

Eric,
I was under the impression that the memory.maps setting was not very important when using native maps.  Apparently I'm mistaken there.  What does this setting control when in a native map setting?  And, in general, what's the proper balance between tserver_opts and tserver.memory.maps?

With regard to the "Finished gathering information from 24 servers in 27.45 seconds" message:  Do you have any recommendations for how to chase down the bottleneck?  I'm pretty sure I'm having GC issues, but I'm not sure what is causing them on the server side.  I'm sending a fairly small number of very large mutation objects, which I'd expect to be a moderate problem for the GC, but not a huge one.

Thanks again to everyone for being so responsive and helpful.

Tejay Cardon


From: Eric Newton [mailto:eric.newton@gmail.com]
Sent: Friday, September 21, 2012 8:03 AM
To: user@accumulo.apache.org
Subject: EXTERNAL: Re: Failing Tablet Servers

A few items noted from your logs:

tserver.memory.maps.max = 1G

If you are giving your processes 10G, you might want to make the map larger, say 6G, and then reduce the JVM by 6G.

Write-Ahead Log recovery complete for rz<;zw== (8 mutations applied, 8000000 entries created)

You are creating rows with 1M columns.  This is ok, but you might want to write them out more incrementally.

WARN : Running low on memory

That's pretty self-explanatory.  I'm guessing that the very large mutations are causing the tablet servers to run out of memory before they are held waiting for minor compactions.

Finished gathering information from 24 servers in 27.45 seconds

Something is running slow, probably due to GC thrashing.

WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]

And there's a server crashing, probably due to an OOM condition.

Send smaller mutations.  Maybe keep it to 200K column updates.  You can still have 1M wide rows, just send 5 mutations.

-Eric




Re: EXTERNAL: Re: Failing Tablet Servers

Posted by Eric Newton <er...@gmail.com>.
We regularly send an overwhelming number of small key/value pairs to tablet
servers (see the continuous ingest test).

If your servers are going down with smaller mutations, send the logs again.
I suspect that the tserver is being pushed into swap, and then the GC is
taking too long.  That causes the tserver to lose its lock in ZooKeeper.

Make sure that swappiness is set to zero.

-Eric


Re: EXTERNAL: Re: Failing Tablet Servers

Posted by Jim Klucar <kl...@gmail.com>.
Tejay,

Here's a good article about Java using native memory.

http://www.ibm.com/developerworks/linux/library/j-nativememory-linux/index.html#how

On Fri, Sep 21, 2012 at 10:35 AM, Cardon, Tejay E
<te...@lmco.com> wrote:
> Gotcha.  So if I’m using java maps then my tserver_opts needs to be
> tserver.memory.maps + extra for the rest of the tserver because the memory
> map will be taken from the overall memory allocated to the tserver.  But if
> I’m using native maps, then I need far less tserver memory because the map
> memory is not deducted from the tserver.  Is that correct?
>
>
>
> Thanks,
> tejay
>
>
>
> From: John Vines [mailto:vines@apache.org]
> Sent: Friday, September 21, 2012 8:26 AM
> To: user@accumulo.apache.org
> Subject: Re: EXTERNAL: Re: Failing Tablet Servers
>
>
>
> memory.maps is what defines the size of the in memory map. When using native
> maps, that space does not come out of the heap size. But when using
> non-native maps, it comes out of the heap space.
>
> I think the issue Eric is trying to hit at is the fickleness of the java
> garbage collector. When you give a process that much heap, that's so much
> more data you can hold before you need to garbage collect. However, that
> also means when it does garbage collect, it's collecting a LOT more, which
> can result in poor performance.
>
> John
>
>
>
> WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]
>
>
>
> And there's a server crashing, probably due to an OOM condition.
>
>
>
> Send smaller mutations.  Maybe keep it to 200K column updates.  You can
> still have 1M wide rows, just send 5 mutations.
>
>
>
> -Eric
>
>
>
> On Thu, Sep 20, 2012 at 5:05 PM, Cardon, Tejay E <te...@lmco.com>
> wrote:
>
> I’m seeing some strange behavior on a moderate (30 node) cluster.  I’ve got
> 27 tablet servers on large dell servers with 30GB of memory each.  I’ve set
> the TServer_OPTS to give them each 10G of memory.  I’m running an ingest
> process that uses AccumuloInputFormat in a MapReduce job to write 1,000 rows
> with each row containing ~1,000,000 columns in 160,000 families.  The
> MapReduce initially runs quite quickly and I can see the ingest rate peak on
> the  monitor page.  However, after about 30 seconds of high ingest, the
> ingest falls to 0.  It then stalls out and my map task are eventually
> killed.  In the end, the map/reduce fails and I usually end up with between
> 3 and 7 of my Tservers dead.
>
>
>
> Inspecting the tserver.err logs shows nothing, even on the nodes that fail.
> The tserver.out log shows a java OutOfMemoryError, and nothing else.  I’ve
> included a zip with the logs from one of the failed tservers and a second
> one with the logs from the master.  Other than the out of memory, I’m not
> seeing anything that stands out to me.
>
>
>
> If I reduce the data size to only 100,000 columns, rather than 1,000,000,
> the process takes about 4 minutes and completes without incident.
>
>
>
> Am I just ingesting too quickly?
>
>
>
> Thanks,
>
> Tejay Cardon
>
>
>
>

RE: EXTERNAL: Re: Failing Tablet Servers

Posted by "Cardon, Tejay E" <te...@lmco.com>.
Gotcha.  So if I'm using Java maps then my tserver_opts needs to be tserver.memory.maps + extra for the rest of the tserver, because the memory map will be taken from the overall memory allocated to the tserver.  But if I'm using native maps, then I need far less tserver memory because the map memory is not deducted from the tserver.  Is that correct?

Thanks,
tejay


Re: EXTERNAL: Re: Failing Tablet Servers

Posted by John Vines <vi...@apache.org>.
memory.maps is what defines the size of the in-memory map. When using
native maps, that space does not come out of the heap size. But when using
non-native maps, it comes out of the heap space.

I think the issue Eric is trying to get at is the fickleness of the Java
garbage collector. When you give a process that much heap, that's so much
more data you can hold before you need to garbage collect. However, that
also means that when it does garbage collect, it's collecting a LOT more,
which can result in poor performance.

John
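To make the split concrete, here is a hedged sketch of what that tuning might look like for these 30GB nodes. The property names come from accumulo-site.xml; the sizes are illustrative assumptions, not a tested recommendation:

```xml
<!-- accumulo-site.xml (illustrative sizes for a 30GB node) -->
<property>
  <!-- In-memory map budget; with native maps this memory is allocated
       outside the JVM heap by the C++ map implementation -->
  <name>tserver.memory.maps.max</name>
  <value>6G</value>
</property>
<property>
  <!-- true uses the native map when the shared library is present -->
  <name>tserver.memory.maps.native.enabled</name>
  <value>true</value>
</property>
```

With native maps enabled, TSERVER_OPTS in accumulo-env.sh can then carry a much smaller heap (say -Xmx4g), since the 6G map lives off-heap; with Java maps, the heap has to cover tserver.memory.maps.max plus everything else, so -Xmx would need to be well above 6G.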


RE: EXTERNAL: Re: Failing Tablet Servers

Posted by "Cardon, Tejay E" <te...@lmco.com>.
Jim, Eric, and Adam,
Thanks.  It sounds like you're all saying the same thing.  Originally I was doing each key/value as its own mutation, and it was blowing up much faster (probably due to the volume/overhead of the mutation objects themselves).  I'll try refactoring to break them up into something in-between.  My keys are small (<25 bytes), and my values are empty, but I'll aim for ~1,000 key/values per mutation and see how that works out for me.

Eric,
I was under the impression that the memory.maps setting was not very important when using native maps.  Apparently I'm mistaken there.  What does this setting control when in a native map setting?  And, in general, what's the proper balance between tserver_opts and tserver.memory.maps?

With regards to the "Finished gathering information from 24 servers in 27.45 seconds" message, do you have any recommendations for how to chase down the bottleneck?  I'm pretty sure I'm having GC issues, but I'm not sure what is causing them on the server side.  I'm sending a fairly small number of very large mutation objects, which I'd expect to be a moderate problem for the GC, but not a huge one.

Thanks again to everyone for being so responsive and helpful.

Tejay Cardon



Re: Failing Tablet Servers

Posted by Eric Newton <er...@gmail.com>.
A few items noted from your logs:

tserver.memory.maps.max = 1G


If you are giving your processes 10G, you might want to make the map
larger, say 6G, and then reduce the JVM by 6G.

Write-Ahead Log recovery complete for rz<;zw== (8 mutations applied,
8000000 entries created)


You are creating rows with 1M columns.  This is ok, but you might want to
write them out more incrementally.

WARN : Running low on memory


That's pretty self-explanatory.  I'm guessing that the very large mutations
are causing the tablet servers to run out of memory before they are held
waiting for minor compactions.

Finished gathering information from 24 servers in 27.45 seconds


Something is running slow, probably due to GC thrashing.

WARN : Lost servers [10.1.24.69:9997[139d46130344b98]]


And there's a server crashing, probably due to an OOM condition.

Send smaller mutations.  Maybe keep it to 200K column updates.  You can
still have 1M wide rows, just send 5 mutations.
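A minimal sketch of that batching idea (hypothetical helper names; the real ingest code would build one Accumulo Mutation per batch and hand it to a BatchWriter, or to context.write() in the MapReduce job):

```java
import java.util.function.IntConsumer;

public class WideRowBatcher {
    // Suggested cap on column updates per mutation.
    static final int MAX_UPDATES_PER_MUTATION = 200_000;

    // Splits the column updates for one wide row into batches of at most
    // maxPerBatch. flushBatch is called once per batch with the batch size;
    // in real ingest code each call would correspond to building one
    // Mutation (one mutation.put(...) per column) and sending it.
    static int writeInBatches(int totalColumns, int maxPerBatch, IntConsumer flushBatch) {
        int batches = 0;
        for (int start = 0; start < totalColumns; start += maxPerBatch) {
            int size = Math.min(maxPerBatch, totalColumns - start);
            flushBatch.accept(size);   // e.g. writer.addMutation(m) for this chunk
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        // A 1M-column row becomes 5 mutations of 200K updates each.
        int batches = writeInBatches(1_000_000, MAX_UPDATES_PER_MUTATION, size -> {});
        System.out.println(batches); // 5
    }
}
```

The row stays logically 1M columns wide either way; the difference is that no single Mutation buffers more than 200K updates in client or tserver memory.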

-Eric

On Thu, Sep 20, 2012 at 5:05 PM, Cardon, Tejay E <te...@lmco.com> wrote:

> I'm seeing some strange behavior on a moderate (30 node) cluster.  I've
> got 27 tablet servers on large Dell servers with 30GB of memory each.  I've
> set the TServer_OPTS to give them each 10G of memory.  I'm running an
> ingest process that uses AccumuloInputFormat in a MapReduce job to write
> 1,000 rows with each row containing ~1,000,000 columns in 160,000
> families.  The MapReduce initially runs quite quickly and I can see the
> ingest rate peak on the monitor page.  However, after about 30 seconds of
> high ingest, the ingest falls to 0.  It then stalls out and my map tasks
> are eventually killed.  In the end, the map/reduce fails and I usually end
> up with between 3 and 7 of my Tservers dead.
>
> Inspecting the tserver.err logs shows nothing, even on the nodes that
> fail.  The tserver.out log shows a java OutOfMemoryError, and nothing
> else.  I've included a zip with the logs from one of the failed tservers
> and a second one with the logs from the master.  Other than the out of
> memory, I'm not seeing anything that stands out to me.
>
> If I reduce the data size to only 100,000 columns, rather than 1,000,000,
> the process takes about 4 minutes and completes without incident.
>
> Am I just ingesting too quickly?
>
> Thanks,
>
> Tejay Cardon
>