Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/07/09 12:50:42 UTC

[GitHub] [doris] weizuo93 opened a new issue, #10720: [Bug] Segment files are removed as trash but tablet meta is normal

weizuo93 opened a new issue, #10720:
URL: https://github.com/apache/doris/issues/10720

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   trunk version. commit id : a5efda68829c0873800b62d7e2e2c3b1807d1734
   
   ### What's Wrong?
   
   While restarting a BE node, we mistakenly replaced its original disk with a new one. After discovering the mistake, we added the original disk back. Once the cluster reported no unhealthy replicas, we removed the wrong disk. Queries then failed with error code `-3109`, which means `failed to open segment`. We found that the segment files of some tablets on this BE node had been removed as trash, while the tablet meta on the original disk still looked normal. FE cannot detect or repair these abnormal tablets.
   
   ### What You Expected?
   
   Tablet metadata should be consistent with the tablet's data files. When segment files are removed as trash, the tablet should be dropped on the BE node so that FE can detect and repair the bad replica.
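
The expected behaviour can be illustrated with a minimal sketch. It is not Doris code: `TabletMeta`, `missing_segments`, `tablets_to_drop` and the flat file layout are hypothetical names and assumptions made for illustration only. The idea is to compare the segment files that the metadata claims to own with what is actually on disk, and to flag any tablet with a missing file as a candidate for dropping.

```python
import os
from dataclasses import dataclass, field


@dataclass
class TabletMeta:
    """Hypothetical stand-in for a tablet's metadata record (not a Doris type)."""
    tablet_id: int
    data_dir: str
    # Segment file names the metadata claims exist (assumed flat layout).
    segment_files: list = field(default_factory=list)


def missing_segments(meta):
    """Return the segment files named in the metadata that are absent on disk."""
    return [
        name for name in meta.segment_files
        if not os.path.isfile(os.path.join(meta.data_dir, name))
    ]


def tablets_to_drop(metas):
    """Select ids of tablets whose metadata references a missing segment file."""
    return [m.tablet_id for m in metas if missing_segments(m)]
```

A tablet returned by `tablets_to_drop` has metadata that no longer matches its data files. Dropping it on the BE node would turn a hidden inconsistency into a missing replica, which is a state FE can see.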
   
   ### How to Reproduce?
   
   Cluster: 1 FE + 3 BE (BE01, BE02 and BE03; BE01 has a single disk called `disk-1`.)
   
   STEP 1: Create a table on the cluster and ensure there are 3 replicas of each tablet.
   
   STEP 2: Insert data into the table.
   
   STEP 3: Remove `disk-1` from BE01, add a new disk called `disk-2`, then restart BE01.
               When the daemon starts, there is no replica on BE01 because its only disk, `disk-2`, is empty. The replica repair task then clones some replicas to `disk-2` on BE01.
   
   STEP 4: Once there is no unhealthy replica in the cluster, add `disk-1` back and restart BE01 (it now has two disks, `disk-1` and `disk-2`).
               When the daemon starts, the tablets on `disk-2` are loaded, and tablets on `disk-1` that have the same id as a tablet on `disk-2` are not loaded. Data on different disks is loaded in parallel. If a tablet on `disk-1` is loaded after the tablet with the same id on `disk-2` has already been loaded successfully, the `disk-1` tablet fails to load and its segment files are removed as trash, but its metadata is left intact.
   
   STEP 5: Remove `disk-2` from BE01, keep `disk-1`, then restart BE01.
              When the daemon starts, the tablets on `disk-1` are loaded. These tablets have normal metadata but no segment files.
   
   STEP 6: Query the table. If the query is served by one of these replicas on BE01, an exception occurs with error code `-3109`, which means `failed to open segment`. FE cannot detect or repair these abnormal tablets.
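
The effect of STEP 4 and STEP 5 can be modelled with a small simulation. This is a sketch under stated assumptions, not the BE loading code: `Disk`, `load_disks` and `broken_tablets` are hypothetical names, and the parallel load is simplified to a fixed order in which `disk-2` is loaded before `disk-1`.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical model of one data disk on a BE node."""
    name: str
    meta: set = field(default_factory=set)      # tablet ids with metadata
    segments: set = field(default_factory=set)  # tablet ids with segment files


def load_disks(disks):
    """Load tablets disk by disk; the first disk to register a tablet id wins.

    Assumption: for a duplicate id, the later copy's segment files are moved
    to trash while its metadata is left untouched, as the report describes.
    """
    loaded = {}
    for disk in disks:
        for tablet_id in sorted(disk.meta):
            if tablet_id in loaded:
                disk.segments.discard(tablet_id)  # data trashed, meta kept
            else:
                loaded[tablet_id] = disk.name
    return loaded


def broken_tablets(disk):
    """Tablet ids that still have metadata but no segment files."""
    return sorted(disk.meta - disk.segments)


# STEP 4: both disks present, disk-2 happens to be loaded first.
disk1 = Disk("disk-1", meta={1, 2, 3}, segments={1, 2, 3})
disk2 = Disk("disk-2", meta={1, 2}, segments={1, 2})
step4 = load_disks([disk2, disk1])

# STEP 5: disk-2 removed, only disk-1 is loaded.
step5 = load_disks([disk1])
```

In this model every tablet on `disk-1` loads again in STEP 5 because its metadata is present, yet `broken_tablets(disk1)` lists the ids whose data was trashed in STEP 4. Those correspond to the replicas that fail in STEP 6.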
   
   
   
   ### Anything Else?
   
   NO.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] weizuo93 closed issue #10720: [Bug] Segment files are removed as trash but tablet meta is normal

Posted by GitBox <gi...@apache.org>.
weizuo93 closed issue #10720: [Bug] Segment files are removed as trash but tablet meta is normal
URL: https://github.com/apache/doris/issues/10720

