You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Koen De Groote (JIRA)" <ji...@apache.org> on 2018/06/06 22:33:00 UTC

[jira] [Created] (KAFKA-7013) Do Infer analysis and investigate results

Koen De Groote created KAFKA-7013:
-------------------------------------

             Summary: Do Infer analysis and investigate results
                 Key: KAFKA-7013
                 URL: https://issues.apache.org/jira/browse/KAFKA-7013
             Project: Kafka
          Issue Type: Bug
            Reporter: Koen De Groote


Infer is a tool by Facebook that performs checks on code in order to find potential flaws.

Specifically, it attemps to track down null dereferences, thread safety violations and resource leaks.

Site: [http://fbinfer.com|http://fbinfer.com/]

Github: [https://github.com/facebook/infer]

The tool supports java. I made a docker container for the kafka code and the infer tool and ran Infer's analysis on the `gradle clean compileJava` process.

Results:
Summary of the reports

THREAD_SAFETY_VIOLATION: 274
RESOURCE_LEAK: 263
NULL_DEREFERENCE: 151

Do take into account: the analysis is not perfect. Most resources leaks reported are simply because the end of a scope was reached and the resource wasn't closed. But there are times you don't actually want to close them.

For instance:

[https://github.com/apache/kafka/blob/2.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/DefaultKafkaClientSupplier.java]

All those byte serializers/deserializers are reported as resource leaks. While I don't know this code, a quick gander tells me it seems like these resources should not be closed and the end of the scope.

688 issues is a lot and probably worth going through, even if it is only to conclude that all the reported instances are false positives.

Another example, actual infer output this time:
{noformat}
687. clients/src/main/java/org/apache/kafka/common/record/CompressionRatioEstimator.java:52: error: THREAD_SAFETY_VIOLATION
       Read/Write race. Non-private method `float CompressionRatioEstimator.updateEstimation(String,CompressionType,float)` reads without synchronization from `compressionRatioForTopic.[_]`. Potentially races with write in method `CompressionRatioEstimator.updateEstimation(...)`.
{noformat}

While technically correct in that the synchronization only happens on write and not on read, one can wonder if adding it would negatively affect performance and if the "correct" ratio would differ so much from the incorrect that it actually warrants adding the synchronization on read.

The results should definitely be taken with a grain of salt.

Reason for not uploading the file with the bugs listed in them: because that file only shows the last step in the code that would potentially trigger the problem. Infer has a secondary command `infer-explore` which lets you pick one of the issues it found and then it outputs the exact code path that leads to the problem, something showing surprising cases.

Because of the scope of this and my relative inexperience with this codebase, I don't really know how to properly fill in this ticket, so I'll be leaving the ticket as is.

Final word: looking into stuff like this is work I usually do outside of working hours. Unless someone is really really interested, I'd ask people not to take time away from what would be their regular tickets. Especially since eventual PR(s) will most likely have a discussion of whether or not a particular change should actually happen or not.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)