You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/03/01 00:53:01 UTC

[GitHub] HeartSaVioR commented on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer

HeartSaVioR commented on issue #22138: [SPARK-25151][SS] Apply Apache Commons Pool to KafkaDataConsumer
URL: https://github.com/apache/spark/pull/22138#issuecomment-468499156

Unfortunately I don't. I might be able to spin some VMs and run for couple of hours if we are uncertain of the code, but I guess it doesn't meet your expectation as `production`. I also personally don't believe "it runs our production so far so no issue" - I have some experiences which others trying to convince their codes based on that claim, and I found multiple critical performance hits from that code.

IMHO, connection pool contributes only very small portion of batch run, unless it is ill-implemented or encounters edge-case. I've covered the functionalities via test cases, and for performance perspective Apache Commons Pool which the patch leverages is widely used (DISCLAIMER: `Jedis` project where I'm committer of also use for years) and actively maintained.

I appreciate your previous concern of extra seeking which could contribute the latency pretty much (though that's edge-case). The issue exists on current code, and the patch fixes it. Like this case, we could still verify which this patch brings improvements via reviewing PR's description and code. That's just a matter of the faith whether reviewers start reading the statements in PR's description without suspicions, and review the codebase to verify the statements are valid.

I would kindly ask you to start weighing the value from statements in PR's description, and provide your feedback before looking into code if you think it doesn't have more value than the complexity brought from the code change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org