You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@cassandra.apache.org by "David Capwell (Jira)" <ji...@apache.org> on 2019/10/22 01:31:00 UTC
[jira] [Comment Edited] (CASSANDRA-15364) Avoid over scanning data
directories in LogFile.verify()
[ https://issues.apache.org/jira/browse/CASSANDRA-15364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956431#comment-16956431 ]
David Capwell edited comment on CASSANDRA-15364 at 10/22/19 1:30 AM:
---------------------------------------------------------------------
* As a general comment, almost every case I see where absolutePath.ifPresent is called is known to be a ADD or REMOVE so the lambda version could be replaced with .get() (style comment, feel free to ignore)
* [https://github.com/apache/cassandra/compare/trunk...krummas:marcuse/15364#diff-d8721a4dd04f4f35b65e7edeb6c883f6R357]
The makeRecord call above will do a listing (in make(Type, SSTable) method), so this change looks to be because deleteRecordFiles(LogRecord) no longer exists and was replaced by deleteRecordFiles(List<File>); LGTM
* [https://github.com/apache/cassandra/compare/trunk...krummas:marcuse/15364#diff-d8721a4dd04f4f35b65e7edeb6c883f6R386]
Is this to handle failure in the middle of the log (while deleting on commit we hard-stop in the middle, so recovery may see missing files)? Seems fine, was just curious if there was a different reason (there was a comment in the code about people copying tx files to different nodes, hence why I wanted to call out).
* [https://github.com/apache/cassandra/compare/trunk...krummas:marcuse/15364#diff-2e4c884e6aace427bdb6b0a122164ce4R308-R328]
Took a while to parse this change, so I am curious what was the motivation? Was it just to avoid File::getCanonicalPath? As far as I can tell Descriptor::fromFilenameWithComponent is only string parsing with a call to File::getCanonicalFile.
While reading this change, what bothered me was the question "why is the file name a prefix" and I missed the comment in the javadoc which expresses that expectation ("absoluteFilePaths contains full file parts up to the component name", "up to" being the important part); the code before relied on Descriptor:fromFilename so didn't really care.
The new version looks fine, I just wanted to better understand the motive
Overall, looks good to me; I want to do one more pass and test this, ETA EOD.
was (Author: dcapwell):
* As a general comment, almost every case I see where absolutePath.ifPresent is called is known to be a ADD or REMOVE so the lambda version could be replaced with .get() (style comment, feel free to ignore)
* [https://github.com/apache/cassandra/compare/trunk...krummas:marcuse/15364#diff-d8721a4dd04f4f35b65e7edeb6c883f6R357]
The makeRecord call above will do a listing (in make(Type, SSTable) method), so this change looks to be because deleteRecordFiles(LogRecord) no longer exists and was replaced by deleteRecordFiles(List<File>); LGTM
* [https://github.com/apache/cassandra/compare/trunk...krummas:marcuse/15364#diff-d8721a4dd04f4f35b65e7edeb6c883f6R386]
Is this to handle failure in the middle of the log (while deleting on commit we hard-stop in the middle, so recovery may see missing files)? Seems fine, was just curious if there was a different reason (there was a comment in the code about people copying tx files to different nodes, hence why I wanted to call out).
* [https://github.com/apache/cassandra/compare/trunk...krummas:marcuse/15364#diff-2e4c884e6aace427bdb6b0a122164ce4R308-R328]
Took a while to parse this change, so I am curious what was the motivation? Was it just to avoid File::getCanonicalPath? As far as I can tell Descriptor::fromFilenameWithComponent is only string parsing with a call to File::getCanonicalFile.
While reading this change, what bothered me was the question "why is the file name a prefix" and I missed the comment in the javadoc which expresses that expectation ("absoluteFilePaths contains full file parts up to the component name", "up to" being the important part); the code before relied on Descriptor:fromFilename so didn't really care.
Overall, looks good to me; I want to do one more pass and test this, ETA EOD.
> Avoid over scanning data directories in LogFile.verify()
> --------------------------------------------------------
>
> Key: CASSANDRA-15364
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15364
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We currently list the data directory for every {{REMOVE}} record in the file in {{LogFile.verify()}} - this can get very expensive during startup when we call {{LogTransaction.removeUnfinishedLeftovers()}}. In {{LogRecord.getExistingFiles(Set<String> absoluteFilePaths)}} we also fully parse the file name of the sstables found, here we only need to prefix match.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org