Posted to user@accumulo.apache.org by Aaron <aa...@gmail.com> on 2014/01/06 19:47:36 UTC

"Re-applying" split file to a table?

To set the stage:

We create a table and pre-split it, then start ingesting data. During the
ingest the table may split a few more times, and once the ingest is done the
table balances itself across the tablet servers.

What happens if we apply the split file to the same table again? From
what I can tell, nothing appears to change, but I wanted to double-check
that I'm not missing anything.

Same question, but with a completely different split file containing
different splits: same result, nothing changes?

Re: "Re-applying" split file to a table?

Posted by Billie Rinaldi <bi...@gmail.com>.
On Mon, Jan 6, 2014 at 11:24 AM, Mike Drob <ma...@cloudera.com> wrote:

> Aaron,
>
> If you attempt to apply the same splits file, then you are attempting to
> add already existing splits. Since the data is already split on those
> points, there are no changes, and nothing happens, exactly as you observed.
>
> If you apply a different split file to the existing data (after it already
> had the initial and natural splits), then you will likely get more split
> points. The data might not split immediately, but you can prompt it to do
> so by issuing a major compaction.
>

To clarify: when you add new split points, the tablets split immediately,
but the resulting tablets continue to share the same existing data files
until a major compaction occurs.
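
For illustration, here is a toy Python model of that behavior. It is only a
sketch and not Accumulo code: the tuple layout, the function names
(add_split, major_compact) and the file names are all made up for the
example. It shows a split producing two tablets that reference the same
files, and a later major compaction giving each tablet a file of its own.

```python
# Toy model only, not Accumulo internals. A tablet is a tuple of
# (start_exclusive, end_inclusive, files); None means "unbounded".

def add_split(tablets, split_row):
    """Split the tablet whose range strictly contains split_row.

    Both halves keep references to the same data files. If split_row is
    already a tablet boundary, nothing changes.
    """
    result = []
    for start, end, files in tablets:
        inside = (start is None or split_row > start) and (
            end is None or split_row < end
        )
        if inside:
            result.append((start, split_row, list(files)))
            result.append((split_row, end, list(files)))
        else:
            result.append((start, end, files))
    return result


def major_compact(tablets):
    """Rewrite each tablet's data into one file of its own (made-up names)."""
    return [(s, e, [f"compacted_{i}.rf"]) for i, (s, e, _) in enumerate(tablets)]


# One tablet covering the whole table, backed by two files.
tablets = [(None, None, ["A.rf", "B.rf"])]

after_split = add_split(tablets, "m")       # two tablets, same files
after_compact = major_compact(after_split)  # two tablets, separate files
```

In this model the tablet count changes as soon as the split is added, while
the file references only change at the compaction step.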


> Your underlying data will not change, but you should see more tablets in
> your table via the monitor interface.
>
> Mike

Re: "Re-applying" split file to a table?

Posted by Mike Drob <ma...@cloudera.com>.
Aaron,

If you attempt to apply the same splits file, then you are attempting to
add already existing splits. Since the data is already split on those
points, there are no changes, and nothing happens, exactly as you observed.
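
As a rough illustration (a toy Python model, not Accumulo code), you can
think of a table's split points as a set of row keys, with applying a split
file acting like a set union. The function names and the example rows below
are assumptions made up for this sketch.

```python
# Toy model only: this is not Accumulo code. Split points are a set of
# row keys, and applying a split file is modeled as a set union.

def apply_split_file(existing_splits, split_file_rows):
    """Return (new_split_set, sorted list of rows actually added)."""
    added = set(split_file_rows) - set(existing_splits)
    return set(existing_splits) | added, sorted(added)


def tablet_count(splits):
    # n split points divide the row space into n + 1 tablets
    return len(splits) + 1


# Pre-split on g, n, t; the ingest later caused a natural split at k.
table_splits = {"g", "k", "n", "t"}

# Re-applying the original split file adds nothing.
same, added_same = apply_split_file(table_splits, ["g", "n", "t"])

# A different split file adds only the points that do not exist yet.
different, added_diff = apply_split_file(table_splits, ["d", "n", "w"])
```

Here the first call leaves the set unchanged, and the second adds only "d"
and "w" because "n" is already a split point.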

If you apply a different split file to the existing data (after it already
had the initial and natural splits), then you will likely get more split
points. The data might not split immediately, but you can prompt it to do
so by issuing a major compaction. Your underlying data will not change, but
you should see more tablets in your table via the monitor interface.

Mike

