You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tinkerpop.apache.org by "dkuppitz (GitHub)" <gi...@apache.org> on 2019/01/23 21:17:02 UTC

[GitHub] [tinkerpop] dkuppitz opened pull request #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

https://issues.apache.org/jira/browse/TINKERPOP-2126

So, apparently, Spark 2.4.0 has some background logging going on, where it takes objects out of memory and dumps their `toString()` into the log. I don't think we can control when Spark is going to do that, so we have to work around it by making our `toString()` methods thread-safe or handle each `ConcurrentModificationException`. Guaranteeing thread-safety was easy for `TraverserSet` but turned out to be tricky (impossible?) for `ObjectWritable` as it's totally unclear where the objects are coming from and/or where they get modified while Spark is trying to log them.

I don't think we can make any arbitrary object thread-safe, but we can easily catch the `ConcurrentModificationException` and retry until we succeed. `while (true)` looks a bit scary, but I don't think that this can ever become an infinite loop.

`docker/build.sh -t -i -n` passed (as well as the external provider tests that consistently triggered the concurrency exceptions).

VOTE +1

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] spmallette commented on issue #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "spmallette (GitHub)" <gi...@apache.org>.
I like what you proposed - a more defensive posture is better imo.  thanks

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] dkuppitz commented on pull request #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "dkuppitz (GitHub)" <gi...@apache.org>.
Ha, you're picky. "::" is the official notation for method references though.

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] spmallette commented on issue #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "spmallette (GitHub)" <gi...@apache.org>.
It's an odd way to deal with `ConcurrentModificationException` but the code that is throwing it is out of our hands so we can't really fix it at its source very well. As i look at this one this morning, I wonder IF this does happen to somehow create some form of infinite loop it's going to be really hard to debug why jobs are just hanging - no one is going to think of the logger. 

Should we not timelimit the loop or something? The concurrent modification should not be going on for hours on end i would think so it could be really short? Maybe set a limit that slowly introduces a `Thread.sleep()` in the catch that incrementally increases and ultimately just gives up and returns the `toString()` as something like "yeah...this isn't happening". then at least the loop would break in the worst possible case that we're not thinking of (or in control of) right now.  

If that's too much for such an off chance thing happening maybe we should log some portion of the retries (every 100th try or something?) in the `catch` so that at least the logs would have some notice of them happening and we'd know why a job was hanging?

thoughts?

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] dkuppitz closed pull request #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "dkuppitz (GitHub)" <gi...@apache.org>.
[ pull request closed by dkuppitz ]

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] dkuppitz commented on issue #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "dkuppitz (GitHub)" <gi...@apache.org>.
I can make it a `for` loop, retry at most 5 times, log every retry attempt and increase the pause between each retry. If they all fail, we just return `ObjectWritable.class.toString()`, as you suggested earlier. No big deal, but realistically, I don't think this can ever happen. Hitting the exception only once is already hard, hitting it twice for the same object should be almost impossible.

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] spmallette commented on issue #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "spmallette (GitHub)" <gi...@apache.org>.
VOTE +1

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] spmallette commented on pull request #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "spmallette (GitHub)" <gi...@apache.org>.
we tend to just use the "." rather than the "::" when referencing methods on specific classes.

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org

[GitHub] [tinkerpop] spmallette commented on pull request #1044: TINKERPOP-2126 Fixed concurrency issue in `ObjectWritable::toString()`.

Posted by "spmallette (GitHub)" <gi...@apache.org>.
sorry - just trying to be consistent.

[ Full content available at: https://github.com/apache/tinkerpop/pull/1044 ]
This message was relayed via gitbox.apache.org for dev@tinkerpop.apache.org