Posted to dev@carbondata.apache.org by vikramahuja1001 <vi...@gmail.com> on 2020/07/30 05:59:55 UTC

[DISCUSSION] A property to enable/disable SILoadEventForFailedSegments and separate sql command to trigger segments repair logic

Hi Community!

When a load/insert command is triggered on a main table that has one or more
SI tables, the new segment is first loaded into the main table and all the SI
tables. After that, SILoadEventListenerForFailedSegments compares the segments
in the main table with those in each SI table. If any SI table has mismatched
or missing segments, the listener fires a load for the missing segments in
that SI table. The load/insert command on the main table finishes only after
all the missing segments in all the SI tables have been loaded again.
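
To make the comparison concrete, here is a minimal sketch of the segment check described above. It is plain Python, not CarbonData code: the function name, the table names and the representation of segments as string ids are all assumptions made for illustration.

```python
def find_missing_segments(main_segments, si_segments_by_table):
    """Return, per SI table, the main-table segment ids that it lacks.

    Hypothetical helper: segments are modelled as plain string ids.
    """
    main = set(main_segments)
    missing = {}
    for si_table, si_segments in si_segments_by_table.items():
        diff = sorted(main - set(si_segments))
        if diff:
            # Only SI tables that actually lack segments are reported.
            missing[si_table] = diff
    return missing


# Illustrative data: si_b lacks segments 1 and 3.
main_table_segments = ["0", "1", "2", "3"]
si_tables = {"si_a": ["0", "1", "2", "3"], "si_b": ["0", "2"]}
missing = find_missing_segments(main_table_segments, si_tables)
```

In this model, every id reported for an SI table would have to be reloaded before the load/insert command on the main table can return.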

Consider a scenario where the SI table has 10000 missing segments. In this
case, after the new load is completed on both the main table and the SI table,
SILoadEventListenerForFailedSegments will try to load all 10000 missing
segments back into the SI table. Because so many segments have to be reloaded,
this step will block the next load command for many hours, if not days. To
solve this problem, I propose the following 2-step solution.

Step 1. Add a carbon property that enables/disables the loading of
missing/failed segments. By default it can be kept as true; the functionality
is disabled only when the user sets it to false.
Step 2. Provide a separate SI repair command, making the whole functionality
independent of the load/insert command. We can provide both a table-level and
a segment-level command to repair the missing segments.
Example table-level command: REPAIR INDEX ON TABLE MAIN_TABLE. This will
check all the SI tables of the main table.
Example segment-level command: REPAIR INDEX ON MAIN_TABLE WHERE SEGMENT.ID
IN (0,1,2,3,4). This will check only the given segments in all the SI
tables.
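
The two steps above can be sketched roughly as follows. This is an illustrative Python model only: the flag `repair_on_load` stands in for the proposed carbon property (whose real name is not decided in this mail), and both function names are hypothetical.

```python
def segments_to_repair(main_segments, si_segments, segment_filter=None):
    """Segments an SI table is missing, optionally limited to given ids.

    segment_filter=None models the table-level command; a list of ids
    models the segment-level command (WHERE SEGMENT.ID IN (...)).
    """
    candidates = set(main_segments)
    if segment_filter is not None:
        candidates &= set(segment_filter)
    return sorted(candidates - set(si_segments))


def after_load(main_segments, si_segments, repair_on_load=True):
    """Model of the post-load hook; repair_on_load is a hypothetical flag."""
    if not repair_on_load:
        # Property set to false: the load returns without repairing.
        return []
    return segments_to_repair(main_segments, si_segments)


main_ids = [0, 1, 2, 3, 4, 5]
si_ids = [0, 2, 5]
table_level = segments_to_repair(main_ids, si_ids)               # all missing ids
segment_level = segments_to_repair(main_ids, si_ids, [0, 1, 2])  # only within the filter
skipped = after_load(main_ids, si_ids, repair_on_load=False)
```

With the flag off, the load path does no repair work at all, and the same helper can be driven later by an explicit repair command.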


Please give your input and suggestions for the above solution.

Rgds
Vikram Ahuja 



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [DISCUSSION] A property to enable/disable SILoadEventForFailedSegments and separate sql command to trigger segments repair logic

Posted by David CaiQiang <da...@gmail.com>.
Hi, 
I have some questions, as follows.

Step 1: 
Can we trigger only "REPAIR INDEX ON TABLE MAIN_TABLE" by default? 
The system would repair the index in the background. Incremental load and
compaction could then return immediately, with no need to wait for the index
repair.

Step 2:
What is the lock level of the repair index command? Can we support
concurrently repairing different segments of the same table?
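
As one possible reading of the question, a segment-level lock could let repairs of different segments proceed in parallel while serializing repairs of the same segment. The sketch below is an in-process Python illustration under that assumption. The class `SegmentLocks` is hypothetical and says nothing about how CarbonData's own locking is implemented.

```python
import threading
from collections import defaultdict


class SegmentLocks:
    """Hypothetical registry holding one lock per (table, segment id)."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the registry itself
        self._locks = defaultdict(threading.Lock)   # one lock per segment key

    def try_acquire(self, table, segment_id):
        """Return True if the segment was free and is now held by the caller."""
        with self._guard:
            lock = self._locks[(table, segment_id)]
        return lock.acquire(blocking=False)

    def release(self, table, segment_id):
        """Release a segment previously acquired with try_acquire."""
        with self._guard:
            lock = self._locks[(table, segment_id)]
        lock.release()


locks = SegmentLocks()
first = locks.try_acquire("si_a", 1)    # segment 1 is free
other = locks.try_acquire("si_a", 2)    # a different segment is independent
again = locks.try_acquire("si_a", 1)    # segment 1 is already being repaired
```

A table-level lock would instead use only the table name as the key, which would serialize all repairs on that table.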



-----
Best Regards
David Cai