Posted to commits@cassandra.apache.org by "Stefania (JIRA)" <ji...@apache.org> on 2015/11/17 04:22:11 UTC

[jira] [Commented] (CASSANDRA-10677) Improve performance of folderSize function

    [ https://issues.apache.org/jira/browse/CASSANDRA-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15007936#comment-15007936 ] 

Stefania commented on CASSANDRA-10677:
--------------------------------------

I've rebased the patch on 3.0 and added a unit test, see [link attached|https://github.com/stef1927/cassandra/tree/10677-3.0]. 

I've started the following CI jobs:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10677-3.0-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10677-testall/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-10677-3.0-windows-utest_win32/

Because the code is only used by unit tests or by _nodetool listsnapshots_, the dtests will not exercise it, so I did not launch them.

I've repeated some basic benchmarking, which confirms the initial observation that the new method is about twice as fast as the old one. It also correctly handles invalid parameters such as files or non-existent folders, whereas the old implementation would have thrown a NullPointerException. Another difference is that it does not follow symbolic links, which I believe is the correct thing to do.

One more observation is that neither the new code nor the old code includes the folder descriptors in the space calculations, so the value returned is slightly less than what's returned by {{du -sb}}. This can easily be rectified with the new implementation provided by this patch, but it is not presently done since I did not want to alter existing behavior.
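
For readers who don't want to open the branch, the behavior described above can be sketched roughly as follows. This is only an illustration and not the code from the patch: the class name {{FolderSizeSketch}} and the choices to return 0 for a missing path and to skip anything that is not a regular file are my own assumptions. It sums file sizes with {{Files.walkFileTree}}, does not pass {{FOLLOW_LINKS}}, and does not count the directories themselves.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the FileUtils code from the patch.
final class FolderSizeSketch
{
    /** Sums the sizes of the regular files found under the given path. */
    static long folderSize(Path folder)
    {
        final AtomicLong total = new AtomicLong(0);
        try
        {
            // No FileVisitOption.FOLLOW_LINKS is passed, so symbolic links are not followed.
            Files.walkFileTree(folder, new SimpleFileVisitor<Path>()
            {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                {
                    // Only regular files are counted; directories are never
                    // passed to visitFile, so their own size is not included.
                    if (attrs.isRegularFile())
                        total.addAndGet(attrs.size());
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFileFailed(Path file, IOException e)
                {
                    // Assumption: unreadable or missing entries (including a
                    // missing start path) are skipped instead of failing.
                    return FileVisitResult.CONTINUE;
                }
            });
        }
        catch (IOException e)
        {
            throw new RuntimeException(e);
        }
        return total.get();
    }
}
```

In this sketch, passing a regular file returns that file's size and passing a non-existent path returns 0, so neither case throws. Counting the directories as well would presumably mean overriding {{preVisitDirectory}} and adding {{attrs.size()}} there, which the sketch deliberately leaves out to match the existing behavior.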

If the unit tests complete without problems, we can commit this. I will post another update then.

> Improve performance of folderSize function
> ------------------------------------------
>
>                 Key: CASSANDRA-10677
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10677
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>         Environment: Ubuntu 14. JDK 7
>            Reporter: Briareus
>            Priority: Minor
>              Labels: patch, performance
>             Fix For: 3.x
>
>         Attachments: Optimized_folderSize_function_to_use_Java_7_nio_walkFileTree_method_.patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The FileUtils.folderSize function recursively traverses the directory tree using the listFiles method. This is no longer efficient, as Java 7 offers the much better Files.walkFileTree method. It makes the method about twice as fast according to my tests.


