Posted to issues@carbondata.apache.org by "xuchuanyin (JIRA)" <ji...@apache.org> on 2018/03/28 09:18:00 UTC

[jira] [Created] (CARBONDATA-2288) Compaction should be able to run concurrently with data loading

xuchuanyin created CARBONDATA-2288:
--------------------------------------

             Summary: Compaction should be able to run concurrently with data loading
                 Key: CARBONDATA-2288
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2288
             Project: CarbonData
          Issue Type: Improvement
          Components: data-load
            Reporter: xuchuanyin
            Assignee: xuchuanyin


Currently in CarbonData, compaction can be triggered in two ways:
1. Manually, using an ALTER statement.
2. Automatically, as part of data loading.

In both cases, compaction and data loading cannot run concurrently. With the 1st way, compaction fails if a data load is in progress. With the 2nd way, compaction starts only after the main data load has finished, and the user has to wait until the compaction completes.

In my opinion, data loading works on a new segment, whereas compaction works on existing segments, so the two can run concurrently.
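
To illustrate why the two operations need not conflict, here is a minimal Python sketch of a segment list. Everything in it is hypothetical (the `SegmentTable` class, its method names, the status strings and the `0.1` naming of the merged segment); it is a toy model of the idea, not CarbonData's actual implementation. The point it shows is that a compaction that only picks up already finished segments never touches the segment an in-progress load is writing.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class SegmentTable:
    """Toy stand-in for a table's segment list (hypothetical, not CarbonData's API)."""
    segments: dict = field(default_factory=dict)  # segment id -> status
    _lock: threading.Lock = field(default_factory=threading.Lock)
    _next_id: int = 0

    def start_load(self):
        # A load always allocates a brand-new segment.
        with self._lock:
            seg_id = str(self._next_id)
            self._next_id += 1
            self.segments[seg_id] = "LOADING"
            return seg_id

    def finish_load(self, seg_id):
        with self._lock:
            self.segments[seg_id] = "SUCCESS"

    def pick_for_compaction(self):
        # Compaction only considers segments whose load already finished,
        # so a segment that is still LOADING is never picked.
        with self._lock:
            picked = [s for s, st in self.segments.items() if st == "SUCCESS"]
            for s in picked:
                self.segments[s] = "COMPACTING"
            return picked

    def finish_compaction(self, picked):
        # Mark the inputs as merged and register the merged segment.
        with self._lock:
            for s in picked:
                self.segments[s] = "COMPACTED"
            merged_id = picked[0] + ".1"  # illustrative naming only
            self.segments[merged_id] = "SUCCESS"
            return merged_id


table = SegmentTable()
for _ in range(3):
    table.finish_load(table.start_load())   # three finished segments

new_seg = table.start_load()                # a load is now in progress
picked = table.pick_for_compaction()        # compaction starts meanwhile
merged = table.finish_compaction(picked)
table.finish_load(new_seg)                  # the load finishes unaffected
```

In this model the only shared state is the segment list itself, which is updated under a short lock; the segment data that each operation reads or writes is disjoint.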

For the 1st way, compaction will succeed even while a data load is in progress.
For the 2nd way, compaction will run either concurrently with the data load or after it (this can be made configurable), and the user will not have to wait for the compaction to finish.
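
The proposed behaviour for the 2nd way could look roughly like the following Python sketch. The `mode` setting and the function names are assumptions made up for illustration; the issue does not name a configuration property, and the real load and compaction steps are replaced by stubs that only record that they ran. What the sketch shows is that the load call returns as soon as the load itself is done, while compaction continues in the background.

```python
import threading


def load_data(log):
    # Stub for the real data load.
    log.append("load finished")


def compact(log):
    # Stub for the real compaction.
    log.append("compaction finished")


def load_with_auto_compaction(mode, log):
    """Run a load plus automatic compaction.

    `mode` is a hypothetical setting: "concurrent" starts compaction
    alongside the load, "after_load" starts it once the load is done.
    In both modes the caller gets control back without waiting for
    the compaction thread.
    """
    compaction = threading.Thread(target=compact, args=(log,))
    if mode == "concurrent":
        compaction.start()
        load_data(log)
    elif mode == "after_load":
        load_data(log)
        compaction.start()
    else:
        raise ValueError("unknown mode: " + mode)
    return compaction  # handle the caller may join, or simply ignore


after_log = []
handle = load_with_auto_compaction("after_load", after_log)
handle.join()  # joined here only so the demo can inspect the log

concurrent_log = []
handle = load_with_auto_compaction("concurrent", concurrent_log)
handle.join()
```

In "after_load" mode the log order is fixed (load, then compaction); in "concurrent" mode the two entries may appear in either order.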



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)