Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/07/04 12:30:26 UTC

[GitHub] [incubator-doris] ZhangYu0123 opened a new issue #4017: [Proposal] Resolve error "fail to init reader.res=-230" by delayed deletion of rowset

ZhangYu0123 opened a new issue #4017:
URL: https://github.com/apache/incubator-doris/issues/4017


   **Describe the bug**
   
   Because the compaction task on the BE continuously merges Rowset versions, the Rowsets made obsolete by a merge are deleted. If the query version issued by the FE falls within the merged versions at that point, the BE cannot obtain the Rowset version path to be queried and returns the error OLAP_ERR_VERSION_ALREADY_MERGED = -230.
   
   The specific meaning of this error is described in #3270; see also PR #3271 and #3859.
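
The failure mode can be illustrated with a small sketch. This is a deliberately simplified model, not the BE implementation: version ranges are plain `(start, end)` tuples, and `capture_path` is a hypothetical helper that looks for a chain of Rowsets covering versions 0 through the queried version.

```python
# Hypothetical, simplified model: a version range is (start, end), and a
# tablet holds a set of Rowsets identified by their version range.
OLAP_ERR_VERSION_ALREADY_MERGED = -230  # error code quoted in this issue


def capture_path(rowset_versions, query_version):
    """Return the chain of version ranges covering [0, query_version],
    or the -230 error code when no such chain exists."""
    path, next_start = [], 0
    while next_start <= query_version:
        # A usable Rowset must start where the previous one ended and
        # must not extend past the version the query asked for.
        candidates = [v for v in rowset_versions
                      if v[0] == next_start and v[1] <= query_version]
        if not candidates:
            return OLAP_ERR_VERSION_ALREADY_MERGED
        best = max(candidates, key=lambda v: v[1])
        path.append(best)
        next_start = best[1] + 1
    return path


before = {(0, 5), (6, 6), (7, 7)}  # Rowsets before compaction
after = {(0, 7)}                   # inputs merged and deleted immediately
```

In this model a query for version 6 succeeds against `before` but fails against `after`, because the only remaining Rowset also contains version 7.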
   
   
   **Resolution**
   To keep Rowset compaction efficient, still allow queries to read earlier versions, and keep the change low-risk, this design adds delayed deletion of merged Rowsets. The main ideas are as follows:
   
   (1) Data structure changes
   - Add _expired_snapshot_rs_version_map to the Tablet to hold the merged Rowsets.
   - Add _expired_snapshot_rs_metas to TabletMeta to hold the merged RowsetMetas.
   - Rework the RowsetGraph structure in Rowset into VersionedRowsetTracker, with the following responsibilities:
   a) Provide the original RowsetGraph functionality and add path information (pathVersion) to each Vertex. Vertices with the same pathVersion belong to the same merged path; a pathVersion of -1 indicates that the Rowset has not been merged.
   b) Maintain the collection of merged Rowsets, _expired_snapshot_rs_path_map. The key of the map is the pathVersion and the value is the list of Rowsets with that pathVersion.
   c) Maintain the current maximum pathVersion, which is used to assign a pathVersion to the Vertices of the Rowsets merged by the next compaction.
   ![image](https://user-images.githubusercontent.com/67053339/86512214-adb23100-be32-11ea-81af-be059a5ba955.png)
   In the figure, the Rowsets on paths whose pathVersion is not -1 are the ones eligible for delayed deletion.
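
The bookkeeping described in (1) might look roughly like the following. This is only a sketch under assumptions: the graph itself is omitted, the method names `add_rowset` and `mark_merged` are hypothetical, and starting the path counter at 0 so the first merged path gets pathVersion 1 is a guess.

```python
from collections import defaultdict

NOT_MERGED = -1  # pathVersion of a Rowset that has not been merged


class VersionedRowsetTracker:
    """Sketch of the path bookkeeping only; the version graph is omitted."""

    def __init__(self):
        # version range -> pathVersion (the annotation added to each Vertex)
        self.path_version = {}
        # pathVersion -> list of merged version ranges on that path
        self.expired_snapshot_rs_path_map = defaultdict(list)
        # current maximum pathVersion (initial value is an assumption)
        self.max_path_version = 0

    def add_rowset(self, version):
        self.path_version[version] = NOT_MERGED

    def mark_merged(self, merged_versions, output_version):
        """Record that merged_versions were compacted into output_version."""
        self.max_path_version += 1
        for version in merged_versions:
            self.path_version[version] = self.max_path_version
            self.expired_snapshot_rs_path_map[self.max_path_version].append(version)
        self.add_rowset(output_version)
        return self.max_path_version


tracker = VersionedRowsetTracker()
for v in [(0, 5), (6, 6), (7, 7)]:
    tracker.add_rowset(v)
path = tracker.mark_merged([(6, 6), (7, 7)], (6, 7))
```

After the call, `(6, 6)` and `(7, 7)` share one pathVersion and are candidates for delayed deletion, while `(0, 5)` and the new `(6, 7)` keep pathVersion -1.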
   
   (2) Compaction process changes
   - After the compaction merge, enter the modify_rowsets stage. At the end of modify_rowsets, the tablet adds the Rowsets deleted from rs_version_map to _expired_snapshot_rs_version_map; the same applies to the deleted RowsetMetas.
   - In the reconstruct_rowset_graph logic of VersionedRowsetTracker, also include the Rowsets from _expired_snapshot_rs_metas when building VersionedRowsetTracker. Add the list of merged Rowsets to _expired_snapshot_rs_path_map and increment the pathVersion by 1.
   - Remove the GC operation performed at the end of compaction.
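
As a rough sketch of the modify_rowsets change in (2): the inputs of a compaction are moved into the expired map instead of being dropped. The `Tablet` class below is a hypothetical stand-in with dictionaries in place of the real maps, and Rowsets are represented by plain strings.

```python
class Tablet:
    """Hypothetical stand-in for the BE Tablet; Rowsets are plain strings."""

    def __init__(self, rowsets):
        self.rs_version_map = dict(rowsets)
        self._expired_snapshot_rs_version_map = {}

    def modify_rowsets(self, to_add, to_delete):
        # Publish the compaction output first.
        for version, rowset in to_add.items():
            self.rs_version_map[version] = rowset
        # Keep the merged inputs readable instead of deleting them right away.
        for version in to_delete:
            rowset = self.rs_version_map.pop(version)
            self._expired_snapshot_rs_version_map[version] = rowset


tablet = Tablet({(0, 5): "rs_0_5", (6, 6): "rs_6_6", (7, 7): "rs_7_7"})
tablet.modify_rowsets(to_add={(6, 7): "rs_6_7"}, to_delete=[(6, 6), (7, 7)])
```

The merged inputs stay reachable through `_expired_snapshot_rs_version_map` until the GC step removes them.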
   
   (3) GC process changes
   - Add a cleanup task for _expired_snapshot_rs_metas to start_trash_sweep of TabletManager.
   - When cleaning, check all paths in VersionedRowsetTracker whose pathVersion is not -1. When the time elapsed since the creation of the Rowset with the largest version number in a path exceeds config:tablet_rowset_expired_snapshot_sweep_time (a new configuration; the default is 30 minutes), add all Rowsets on that pathVersion path to storage_engine's unused_rowset for cleaning.
   - After cleaning, use _expired_snapshot_rs_metas and _rs_meta to reconstruct VersionedRowsetTracker. At the same time, delete the keys of the cleaned pathVersions from _expired_snapshot_rs_path_map.
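
The expiry check in (3) can be sketched as follows. The function name `sweep_expired_paths`, the dictionary layout, and the use of seconds as the time unit are all assumptions made for illustration; only the 30-minute default comes from the proposal. Each path is assumed to contain at least one Rowset.

```python
# Default of tablet_rowset_expired_snapshot_sweep_time per this proposal,
# expressed in seconds (the unit is an assumption of this sketch).
SWEEP_TIME_SEC = 30 * 60


def sweep_expired_paths(expired_paths, now, sweep_time_sec=SWEEP_TIME_SEC):
    """Collect Rowsets of fully expired paths and drop those paths.

    expired_paths maps pathVersion -> list of dicts with the keys
    "version" (start, end) and "create_time" (seconds).
    """
    unused_rowsets = []
    for path_version in list(expired_paths):
        rowsets = expired_paths[path_version]
        # The Rowset with the largest version number decides for the whole path.
        newest = max(rowsets, key=lambda rs: rs["version"][1])
        if now - newest["create_time"] > sweep_time_sec:
            unused_rowsets.extend(rowsets)
            del expired_paths[path_version]
    return unused_rowsets


paths = {
    1: [{"version": (6, 6), "create_time": 0},
        {"version": (7, 7), "create_time": 100}],
    2: [{"version": (8, 8), "create_time": 1500}],
}
unused = sweep_expired_paths(paths, now=2000)
```

Here path 1 is swept as a whole because its newest Rowset is older than the threshold, while path 2 is kept for a later sweep.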
   
   (4) Find the Rowset to be read
   When reading data, additionally look up the Rowset in _expired_snapshot_rs_version_map.
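
A minimal sketch of the lookup in (4), assuming both maps are keyed by version range; `find_rowset` is a hypothetical helper, not an existing BE function.

```python
def find_rowset(version, rs_version_map, expired_snapshot_rs_version_map):
    """Look in the live map first, then fall back to the expired snapshot map."""
    rowset = rs_version_map.get(version)
    if rowset is None:
        rowset = expired_snapshot_rs_version_map.get(version)
    return rowset


live = {(0, 5): "rs_0_5", (6, 7): "rs_6_7"}
expired = {(6, 6): "rs_6_6", (7, 7): "rs_7_7"}
```

A reader asking for `(6, 6)` is then served from the expired map as long as the GC has not swept that path yet.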


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] ZhangYu0123 closed issue #4017: [Proposal] Resolve error "fail to init reader.res=-230" by delayed deletion of rowset

Posted by GitBox <gi...@apache.org>.
ZhangYu0123 closed issue #4017:
URL: https://github.com/apache/incubator-doris/issues/4017


   




[GitHub] [incubator-doris] morningman commented on issue #4017: [Proposal] Resolve error "fail to init reader.res=-230" by delayed deletion of rowset

Posted by GitBox <gi...@apache.org>.
morningman commented on issue #4017:
URL: https://github.com/apache/incubator-doris/issues/4017#issuecomment-653901201


   Nice job. This proposal looks good to me. Waiting for your PR.

