Posted to user@accumulo.apache.org by "Dickson, Matt MR" <ma...@defence.gov.au> on 2014/05/22 03:00:21 UTC

RFiles not referenced in !METADATA [SEC=UNOFFICIAL]

UNOFFICIAL

I've run a scan on HDFS under /accumulo/tables/<table_id> for all rfiles older than the ageoff filter on that table.  When I then scan for these rfiles in the metadata table, most are not listed.

Should all rfiles be referenced in the metadata table?  My goal had been to get the row ID from the metadata and then force a compaction on that range.  E.g., for the row   4n;234234234 file:/fdi-2342/234234.rf    I would run a compaction from 234234234 to 234234234~
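
For anyone wanting to repeat this comparison, here is a rough sketch in Python of the cross-check described above. It is not an Accumulo tool. It assumes the metadata entries have already been dumped as text in the "ROW file:QUALIFIER" form shown in the example, that the qualifier is a path relative to the table directory, and that the HDFS listing is available as a list of full paths. The function names, the tables_root parameter and the 999999.rf file are made up for illustration.

```python
def parse_file_entry(entry):
    """Parse 'ROW file:QUALIFIER' into (table_id, end_row, relative_path).

    Assumes the row is '<table_id>;<end_row>', or '<table_id><' for the
    last tablet of a table, in which case end_row is returned as None.
    Anything after the first two whitespace-separated fields is ignored.
    """
    fields = entry.split()
    if len(fields) < 2:
        raise ValueError("expected 'ROW file:QUALIFIER', got %r" % entry)
    row, column = fields[0], fields[1]
    family, _, qualifier = column.partition(":")
    if family != "file":
        raise ValueError("not a file entry: %r" % entry)
    if ";" in row:
        table_id, _, end_row = row.partition(";")
    else:
        table_id, end_row = row.rstrip("<"), None
    return table_id, end_row, qualifier


def unreferenced_files(hdfs_paths, metadata_entries, tables_root="/accumulo/tables"):
    """Return the HDFS paths that no metadata file entry points at."""
    referenced = set()
    for entry in metadata_entries:
        table_id, _end_row, relative_path = parse_file_entry(entry)
        # Assumption: the qualifier starts with '/' and is relative to the table dir.
        referenced.add("%s/%s%s" % (tables_root, table_id, relative_path))
    return sorted(path for path in hdfs_paths if path not in referenced)


# Example with one real-looking entry from this thread and one hypothetical file.
entries = ["4n;234234234 file:/fdi-2342/234234.rf"]
listing = [
    "/accumulo/tables/4n/fdi-2342/234234.rf",
    "/accumulo/tables/4n/fdi-2342/999999.rf",  # hypothetical, not in metadata
]
print(unreferenced_files(listing, entries))
```

The parsed end_row is the tablet's end row from the metadata row ID, which is the value the compaction range above would be built from.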

Thanks in advance.
Matt



Re: RFiles not referenced in !METADATA [SEC=UNOFFICIAL]

Posted by Mike Drob <md...@mdrob.com>.
The Accumulo gc is a separate role (in the same vein as
tracer/monitor/etc.).

It is fine to run it on the same node as the master.

The "master status" page on the monitor should tell you when the gc last
ran a cycle. Otherwise, you can look for the process or check the logs,
if you want to dive into the terminal.
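
If you go the terminal route, "look for the process" could be scripted roughly as below. This is only a sketch: it assumes the start scripts tag the gc JVM with a "-Dapp=gc" system property on its command line, which you should confirm against your own install, and the function names are made up here. Pass a different marker if your command lines look different.

```python
import subprocess


def matching_processes(ps_lines, marker="-Dapp=gc"):
    """Return the ps output lines whose command line contains the marker.

    The default marker is an assumption about how the gc process is
    started; it is not taken from this thread.
    """
    return [line.strip() for line in ps_lines if marker in line]


def gc_processes_on_this_host(marker="-Dapp=gc"):
    """Run ps locally and return the lines that look like the gc process."""
    output = subprocess.check_output(["ps", "-eo", "pid,args"])
    lines = output.decode("utf-8", "replace").splitlines()
    # Skip the header line printed by ps.
    return matching_processes(lines[1:], marker)


# On the node where the gc is meant to run:
#     for line in gc_processes_on_this_host():
#         print(line)
# An empty result suggests the gc is not running on that host.
```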

Mike



RE: RFiles not referenced in !METADATA [SEC=UNOFFICIAL]

Posted by "Dickson, Matt MR" <ma...@defence.gov.au>.
UNOFFICIAL

How do I check if the Accumulo gc is running?  Can I force it to run?

I'm on version 1.5.


Re: RFiles not referenced in !METADATA [SEC=UNOFFICIAL]

Posted by Mike Drob <md...@mdrob.com>.
Is your GC running? It should be catching the unreferenced files.

I think you are safe to manually delete any files not referenced in the
!METADATA table.

What version of Accumulo are you running?



Re: RFiles not referenced in !METADATA [SEC=UNOFFICIAL]

Posted by Eric Newton <er...@gmail.com>.
If a file is not referenced anywhere in the metadata table, then there's
probably a bug, and I can easily imagine it would go undetected.

There are small windows of time in which a file is ready to be used and the
tablet server dies, which would leave an unreferenced file.

As noted in ACCUMULO-2381 <https://issues.apache.org/jira/browse/ACCUMULO-2381>,
the GC should look for these abandoned files periodically.  Right now the GC
just removes files that have references in the metadata table (delete
markers).

-Eric

