Posted to user@accumulo.apache.org by Michael Wall <mj...@gmail.com> on 2017/07/10 12:21:59 UTC

Re: HDFS disk usage grows too much

Hi Anton,

What is your interval for ingesting ~7.5M key-value pairs? Are you writing
mutations or bulk ingesting? Assuming you are writing mutations, those
write-ahead logs will all be replicated 3 times by default in HDFS, so
~22.5M. Then the data gets flushed to disk at some point, also replicated
3 times by default, for another ~22.5M.
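
If you want to separate logical data from replication overhead, something
like the following should show it. This is just a sketch assuming the
default /accumulo instance volume and layout; adjust the paths for your
setup.

    # Logical size vs. raw space consumed; on recent Hadoop releases the
    # second column already includes the 3x replication.
    hdfs dfs -du -h /accumulo

    # Write-ahead logs vs. flushed table files (default layout assumed).
    hdfs dfs -du -s -h /accumulo/wal
    hdfs dfs -du -s -h /accumulo/tables

    # Raw DFS usage per DataNode, for a cluster-wide view.
    hdfs dfsadmin -report | grep -E 'Name:|DFS Used'

That should make it obvious whether the growth is WALs, table files, or
something outside /accumulo entirely.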

So a couple more questions.

1 - Do you have HDFS Trash enabled? While I consider that a best practice,
it will keep deleted data around longer.
2 - Is all the HDFS storage in /accumulo? (A couple of quick checks are
sketched below.)
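
To check those two quickly, something along these lines should work. The
trash path is an assumption; it depends on the user the Accumulo GC runs
as.

    # Is the HDFS trash enabled? A value > 0 is the number of minutes
    # deleted files linger in .Trash before they are reclaimed.
    hdfs getconf -confKey fs.trash.interval

    # How much is sitting in the trash for the Accumulo user
    # (assuming the GC runs as "accumulo").
    hdfs dfs -du -s -h /user/accumulo/.Trash

    # Usage per top-level directory, to confirm whether the growth is
    # inside /accumulo at all.
    hdfs dfs -du -h /

If a lot is sitting in the trash, hdfs dfs -expunge (or a shorter
fs.trash.interval) will reclaim it sooner.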

Mike


On Sun, Jul 9, 2017 at 8:18 AM Anton Puzanov <AN...@il.ibm.com> wrote:

> Hi all,
>
> I have a large cluster with 10 DataNodes and 2 writers.
> Currently the HDFS disk usage is 60%. I am ingesting key-value pairs at a
> rate of ~7.5M.
> Each DataNode has ~10T of total disk space.
> I observed a very strange behavior of the disk size, screenshot from
> grafana:
>
> [grafana screenshot not included in the plain-text version]
>
> As you can see, the disk usage increased very fast. This increase could
> not have been caused solely by the ingested data, since previous runs of
> the writers wrote only a few hundred gigabytes per day!
>
> At the peak of the disk usage increase, several tablet servers failed and
> the system froze (CPU, disk, network...)! A screenshot of the CPU usage:
>
> [CPU usage screenshot not included in the plain-text version]
>
> The GC configuration was not changed in this run, so GC should run every
> 5 min; the Accumulo GC heap memory is 8192MB.
> Possibly relevant configurations:
>       "tserver.wal.blocksize": "1G",
>       "tserver.walog.max.size": "2G",
>       "tserver.memory.maps.max": "4G",
>       "tserver.compaction.minor.concurrent.max": "50",
>       "tserver.compaction.major.concurrent.max": "20",
>
> My question is: is this increase in disk consumption normal? Should I
> always keep the disk usage at 50%?
> What can cause such failures, and how can they be avoided?
>
> Thanks,
> Anton P.
>
>