Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/06/01 14:12:17 UTC

[jira] [Commented] (FLINK-2121) FileInputFormat.addFilesInDir miscalculates total size

    [ https://issues.apache.org/jira/browse/FLINK-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567223#comment-14567223 ] 

ASF GitHub Bot commented on FLINK-2121:
---------------------------------------

Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/752#issuecomment-107419256
  
    LGTM. Thanks for spotting this one.


> FileInputFormat.addFilesInDir miscalculates total size
> ------------------------------------------------------
>
>                 Key: FLINK-2121
>                 URL: https://issues.apache.org/jira/browse/FLINK-2121
>             Project: Flink
>          Issue Type: Bug
>          Components: Core
>            Reporter: Gabor Gevay
>            Assignee: Gabor Gevay
>            Priority: Minor
>
> In FileInputFormat.addFilesInDir, the length variable should start from 0, because the caller always adds the return value to its own length instead of assigning it. With the current version, the length accumulated before the call is therefore counted twice in the result.
> mvn verify caught this for me just now. It has not been noticed before because testGetStatisticsMultipleNestedFiles catches it only if the listing of the outer directory is returned in a certain order. Concretely, if the inner directory is seen before the other file in the outer directory, then length is 0 at that point, so the bug doesn't show. But if the other file is seen first, then its size is added twice to the total.
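
The order dependence described above can be reproduced with a small self-contained sketch. This is not the Flink source: the Entry class, the buggy and fixed methods, and the file sizes are all hypothetical stand-ins, and an in-memory tree replaces the real file system listing. It only models the pattern the description names, a recursive helper that starts its length at the caller's total while the caller adds the return value on top.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the double-counting pattern; not the actual FileInputFormat code.
final class AddFilesInDirSketch {

    /** Stand-in for a directory listing entry: a file with a size, or a directory with children. */
    static final class Entry {
        final long size;
        final List<Entry> children; // null for plain files

        private Entry(long size, List<Entry> children) {
            this.size = size;
            this.children = children;
        }

        static Entry file(long size) {
            return new Entry(size, null);
        }

        static Entry dir(Entry... children) {
            return new Entry(0, Arrays.asList(children));
        }

        boolean isDir() {
            return children != null;
        }
    }

    /** Buggy variant: length starts from the caller's total, which the caller then adds again. */
    static long buggy(Entry dir, long totalLength) {
        long length = totalLength;
        for (Entry e : dir.children) {
            if (e.isDir()) {
                // The return value already contains 'length', so 'length' is counted twice.
                length += buggy(e, length);
            } else {
                length += e.size;
            }
        }
        return length;
    }

    /** Fixed variant: length starts from 0, so the return value covers only this directory. */
    static long fixed(Entry dir) {
        long length = 0;
        for (Entry e : dir.children) {
            length += e.isDir() ? fixed(e) : e.size;
        }
        return length;
    }

    public static void main(String[] args) {
        Entry inner = Entry.dir(Entry.file(10));
        Entry other = Entry.file(5);

        Entry innerDirFirst = Entry.dir(inner, other);
        Entry otherFileFirst = Entry.dir(other, inner);

        System.out.println(buggy(innerDirFirst, 0));  // 15: length is 0 when the inner directory is visited
        System.out.println(buggy(otherFileFirst, 0)); // 20: the 5-byte file is counted twice
        System.out.println(fixed(otherFileFirst));    // 15
    }
}
```

With the inner directory listed first, both variants agree, which matches the observation that the test only fails for one listing order.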


