Posted to commits@doris.apache.org by "chenlinzhong (via GitHub)" <gi...@apache.org> on 2023/04/05 03:14:12 UTC

[GitHub] [doris] chenlinzhong opened a new pull request, #18397: [bugfix] Fix the issue of incorrect disk usage

chenlinzhong opened a new pull request, #18397:
URL: https://github.com/apache/doris/pull/18397

   # Proposed changes
   
   The reported disk used capacity is incorrect:
   DataUsedCapacity + AvailCapacity != TotalCapacity
   ![image](https://user-images.githubusercontent.com/11487604/229971354-081cc12a-d37d-451c-96f8-02cce5e85335.png)
   
   
   ## Problem summary
   
   Some invalid rowsets are never deleted because of this bug:
   ![image](https://user-images.githubusercontent.com/11487604/229971796-6a433c49-934d-4d02-8522-460e26b81120.png)
   A `true` return from `key_may_exist()` only means the key may exist, not that it really exists.
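To illustrate the difference between a "may exist" check and an authoritative lookup, here is a minimal sketch. `ToyMeta`, its stale-bit filter, and `stale_positive_after_remove` are hypothetical names invented for this example; they are not the Doris `OlapMeta` API and do not model how the underlying store implements its check. The only property the sketch relies on is that a "may exist" call is allowed to return `true` for a key that is no longer stored.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy store, not the Doris OlapMeta or RocksDB API.
class ToyMeta {
public:
    void put(const std::string& key, const std::string& value) {
        _data[key] = value;
        _filter[slot(key)] = true;
    }

    // Erases the value but leaves the filter bit set, so the filter goes stale.
    void remove(const std::string& key) { _data.erase(key); }

    // Cheap check: false means "definitely absent", true means only "possibly present".
    bool key_may_exist(const std::string& key) const { return _filter[slot(key)]; }

    // Authoritative lookup: returns the value only if the key is really stored.
    std::optional<std::string> get(const std::string& key) const {
        auto it = _data.find(key);
        if (it == _data.end()) return std::nullopt;
        return it->second;
    }

private:
    static constexpr std::size_t kSlots = 64;
    static std::size_t slot(const std::string& key) {
        return std::hash<std::string>{}(key) % kSlots;
    }

    std::unordered_map<std::string, std::string> _data;
    std::vector<bool> _filter = std::vector<bool>(kSlots, false);
};

// Returns true when the cheap check still says "maybe" for a key that was removed.
bool stale_positive_after_remove() {
    ToyMeta meta;
    meta.put("rowset_1", "meta");
    assert(meta.get("rowset_1").has_value());
    meta.remove("rowset_1");
    return meta.key_may_exist("rowset_1") && !meta.get("rowset_1").has_value();
}
```

In this toy model, a garbage collector that trusts `key_may_exist()` would treat the removed rowset as still referenced and keep its files, which is the kind of leak described above.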
   
   
   ## Checklist(Required)
   
   * [ ] Does it affect the original behavior
   * [ ] Has unit tests been added
   * [ ] Has document been added or modified
   * [ ] Does it need to update dependencies
   * [ ] Does this PR support rollback (If NO, please explain WHY)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498541036

   run p0


[GitHub] [doris] github-actions[bot] commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1500226215

   clang-tidy review says "All clean, LGTM! :+1:"


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498541089

   run beut


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498389424

   run p0


[GitHub] [doris] github-actions[bot] commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1500227448

   clang-tidy review says "All clean, LGTM! :+1:"


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1497574101

   run all


[GitHub] [doris] morningman merged pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "morningman (via GitHub)" <gi...@apache.org>.
morningman merged PR #18397:
URL: https://github.com/apache/doris/pull/18397


[GitHub] [doris] morningman commented on a diff in pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "morningman (via GitHub)" <gi...@apache.org>.
morningman commented on code in PR #18397:
URL: https://github.com/apache/doris/pull/18397#discussion_r1159330559


##########
be/src/olap/rowset/rowset_meta_manager.cpp:
##########
@@ -42,6 +42,17 @@ bool RowsetMetaManager::check_rowset_meta(OlapMeta* meta, TabletUid tablet_uid,
     return meta->key_may_exist(META_COLUMN_FAMILY_INDEX, key, &value);
 }
 
+bool RowsetMetaManager::exists(OlapMeta* meta, TabletUid tablet_uid, const RowsetId& rowset_id) {
+    std::string key = ROWSET_PREFIX + tablet_uid.to_string() + "_" + rowset_id.to_string();
+    std::string value;
+    Status s = meta->get(META_COLUMN_FAMILY_INDEX, key, &value);

Review Comment:
   Better to return `s` directly instead of a `bool`,
   so that the caller can handle any other error.
   Otherwise, if `s` is an InternalError (for example), this will still return `true`, which is wrong.
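A minimal sketch of the pattern suggested in this review comment, under stated assumptions: `ToyStatus`, `ToyStore`, `rowset_exists`, `GcDecision`, and `decide` are hypothetical names for illustration, not the Doris `Status` class or the code that was merged. The sketch shows why returning the status lets a caller distinguish "key is absent" from "the lookup itself failed".

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical minimal status type; the real Doris Status class is richer.
enum class Code { kOk, kNotFound, kInternalError };

struct ToyStatus {
    Code code;
    bool ok() const { return code == Code::kOk; }
};

// Hypothetical store whose get() can fail for reasons other than a missing key.
struct ToyStore {
    std::map<std::string, std::string> data;
    bool broken = false;  // simulates an internal failure

    ToyStatus get(const std::string& key, std::string* value) const {
        assert(value != nullptr);
        if (broken) return {Code::kInternalError};
        auto it = data.find(key);
        if (it == data.end()) return {Code::kNotFound};
        *value = it->second;
        return {Code::kOk};
    }
};

// Returns the status unchanged so the caller can tell "absent" from "lookup failed".
ToyStatus rowset_exists(const ToyStore& store, const std::string& key) {
    std::string value;
    return store.get(key, &value);
}

// Assumed caller policy: only a definite NotFound makes a rowset eligible for deletion.
enum class GcDecision { kKeep, kDelete, kRetryLater };

GcDecision decide(const ToyStore& store, const std::string& key) {
    const ToyStatus s = rowset_exists(store, key);
    if (s.ok()) return GcDecision::kKeep;
    if (s.code == Code::kNotFound) return GcDecision::kDelete;
    return GcDecision::kRetryLater;  // any other error: do not guess
}
```

With a `bool` return, the internal-error case would have to collapse into either "exists" or "does not exist"; keeping the status leaves that choice to the caller.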



##########
be/src/io/fs/file_system.cpp:
##########
@@ -44,6 +46,24 @@ Status FileSystem::delete_file(const Path& file) {
     FILESYSTEM_M(delete_file_impl(path));
 }
 
+Status FileSystem::delete_directory_or_file(const Path& path) {
+    auto real_path = absolute_path(path);

Review Comment:
   I think we should implement this only in `LocalFileSystem`,
   because you use `stat` to check whether the path is a directory, and `stat` is only available for the local fs.
   
   There is also already an `is_directory()` method in the local fs; better to use that.
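As a rough sketch of a local-only "delete directory or file" helper, the following uses the C++17 `std::filesystem` library rather than the Doris `LocalFileSystem` class, whose interface is not shown in this thread. The free function `delete_directory_or_file` below and the `demo_roundtrip` helper are assumptions made for illustration. A missing path is treated as already deleted, which is a design choice of this sketch, not something the review specifies.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical local-only helper: returns true if nothing is left at `path`.
bool delete_directory_or_file(const fs::path& path, std::error_code& ec) {
    const fs::file_status st = fs::status(path, ec);
    if (st.type() == fs::file_type::not_found) {
        ec.clear();  // nothing to delete is not treated as an error here
        return true;
    }
    if (ec) return false;  // could not determine what the path is
    if (fs::is_directory(st)) {
        fs::remove_all(path, ec);  // removes the directory and its contents
    } else {
        fs::remove(path, ec);  // removes a single file
    }
    return !ec;
}

// Creates a small tree under the system temp directory and deletes it again.
bool demo_roundtrip() {
    const fs::path root = fs::temp_directory_path() / "delete_directory_or_file_sketch";
    fs::remove_all(root);
    fs::create_directories(root / "dir" / "nested");
    {
        std::ofstream nested_file(root / "dir" / "nested" / "a.dat");
        nested_file << "x";
    }
    {
        std::ofstream plain_file(root / "file.dat");
        plain_file << "y";
    }
    assert(fs::exists(root / "file.dat"));

    std::error_code ec;
    const bool ok = delete_directory_or_file(root / "dir", ec) &&
                    delete_directory_or_file(root / "file.dat", ec) &&
                    delete_directory_or_file(root / "missing", ec);
    const bool gone = !fs::exists(root / "dir") && !fs::exists(root / "file.dat");
    fs::remove_all(root);
    return ok && gone;
}
```

A remote file system would need its own way to tell directories from files, which is the reason given above for keeping this logic out of the generic `FileSystem` base class.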



##########
be/src/olap/tablet.cpp:
##########
@@ -1160,7 +1160,7 @@ bool Tablet::check_rowset_id(const RowsetId& rowset_id) {
             return true;
         }
     }
-    if (RowsetMetaManager::check_rowset_meta(_data_dir->get_meta(), tablet_uid(), rowset_id)) {
+    if (RowsetMetaManager::exists(_data_dir->get_meta(), tablet_uid(), rowset_id)) {

Review Comment:
   Handle the `Status` here



[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498542356

   ru beut


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498542680

   run clickbench


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498541170

   run feut


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498542263

   run p0


[GitHub] [doris] chenlinzhong commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "chenlinzhong (via GitHub)" <gi...@apache.org>.
chenlinzhong commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1500238273

   run buildall


[GitHub] [doris] github-actions[bot] commented on pull request #18397: [bug] (fix) the issue of incorrect disk usage

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1496865766

   clang-tidy review says "All clean, LGTM! :+1:"


[GitHub] [doris] hello-stephen commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "hello-stephen (via GitHub)" <gi...@apache.org>.
hello-stephen commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1498636095

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 33.23 seconds
    stream load tsv:          463 seconds loaded 74807831229 Bytes, about 154 MB/s
    stream load json:         21 seconds loaded 2358488459 Bytes, about 107 MB/s
    stream load orc:          73 seconds loaded 1101869774 Bytes, about 14 MB/s
    stream load parquet:      30 seconds loaded 861443392 Bytes, about 27 MB/s
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230406075000_clickbench_pr_125369.html


[GitHub] [doris] github-actions[bot] commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1500226731

   clang-tidy review says "All clean, LGTM! :+1:"


[GitHub] [doris] github-actions[bot] commented on pull request #18397: [bug](GC)the issue of incorrect disk usage

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #18397:
URL: https://github.com/apache/doris/pull/18397#issuecomment-1500227916

   clang-tidy review says "All clean, LGTM! :+1:"

