You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by andralungu <gi...@git.apache.org> on 2015/08/26 13:28:44 UTC

[GitHub] flink pull request: [FLINK-2512] [bugfix] Used a concurrent list i...

GitHub user andralungu opened a pull request:

    https://github.com/apache/flink/pull/1058

    [FLINK-2512] [bugfix] Used a concurrent list in the open method

    This PR fixes the concurrency issue raised by the elements in the broadcast set. I instantiated the list to be a CopyOnWriteArrayList. Hope that fixes the issue! 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/andralungu/flink zipWithIndexBugFix

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1058.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1058
    
----
commit 8e81256d6d10140d180762c118ef4bd575300dd3
Author: Andra Lungu <lu...@gmail.com>
Date:   2015-08-26T11:26:32Z

    [FLINK-2512] [bugfix] Used a concurrent list in the open method

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135041846
  
    With the correct issue it makes sense :) Do you know why the concurrent mod exception is occurring in the first place btw?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135396805
  
    Yup that's what I thought as well. 
    However, I only tested `zipWithUniqueId` on a cluster. 
    A student came to me and said that `zipWithIndex` did not work in a cluster environment (I believed him) and claimed some days after via the Jira issue that this was the fix that worked for him. 
    
    That's all the input I have. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-138692441
  
    It seems the issue was already fixed via #1075 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135042510
  
    I guess because more parallel instances are trying to access that broadcast variable. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135035150
  
    Yes, I am sorry about that. It was a propagated typo. Problem solved! 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135398622
  
    Okay, looking at the "zipWithIndex" code, here is what really is the problem:
    
    Each function actually modifies the list, by sorting it. The here proposes solution solves it, by making sure everyone has its own copy of the list. That, btw, would have worked with any ArrayList as well. CopyOnWriteList seems a bit overkill.
    
    A nicer way to solve this is IMHO to use a broadcast variable initializer, which would guarantee that the list is sorted once (by the first one that accesses it) and then everyone shares the same sorted list.
      - Less memory consumption (not super critical, as we are talking about small lists)
      - Less work, since only one sort happens per TaskManager, rather than one sort per task.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu closed the pull request at:

    https://github.com/apache/flink/pull/1058


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-138677060
  
    I think if we fix it, let's fix it the best way we can.
    We can use broadcast variable initializers (I gave a pointer to that), which makes the solution more efficient.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2512] [bugfix] Used a concurrent list i...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-134980990
  
    I don't get it. You are not instantiating a CopyOnWriteArrayList here, but just cast the list returned by the runtime context. Either this works and the returned list is already a CopyOnWriteArrayList, in which case this does nothing. Or it's not a CopyOnWriteArrayList and the cast fails at runtime.
    
    What are you trying to fix here? In general, fixes should come with a test to verify them as well. Otherwise, it's too easy to loose fixes and very hard to understand what is being fixed when reviewing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-138631715
  
    Is anything happening here, or can we close this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135395583
  
    I don't understand this issue.
    
    Is any user code modifying this list? Concurrent read access should be well possible with any regular list.
    The system never modifies the list after it has been created once.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135407258
  
    so you mean something like
    
    `List<Tuple2<Integer, Long>> offsets = getRuntimeContext().getBroadcastVariableWithInitializer("counts", new ArrayList());`?
    



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-138654987
  
    I asked whether I should change the list initialization. The current commit fixes the bug reported on the Jira issue, but I understood that you'd like it to be different. 
    
    I can also close it, but then we should mark the Jira issue as resolved or say it's not a bug or whatever. I saw it reopened and wanted to fix it... 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2512] [bugfix] Used a concurrent list i...

Posted by andralungu <gi...@git.apache.org>.
Github user andralungu commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135027923
  
    excerpt from the JIRA 
    "This can be fixed by wrapping a concurrent list around the counts variable
    e.g. CopyOnWriteArrayList from java.util.concurrrent (in the open method)"
    
    Stood there open for a while so I thought I fixed it. Don't really know what test should be written. The use case is the same, the environment just goes from local to cluster. 



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2152] [bugfix] Used a concurrent list i...

Posted by mjsax <gi...@git.apache.org>.
Github user mjsax commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135034643
  
    I agree with @rmetzger and @uce https://issues.apache.org/jira/browse/FLINK-2512 is already closed via #1009 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2512] [bugfix] Used a concurrent list i...

Posted by uce <gi...@git.apache.org>.
Github user uce commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-135033376
  
    Thanks! I think you are referencing a wrong issue. Can you double check? The new commit looks better. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2512] [bugfix] Used a concurrent list i...

Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/1058#issuecomment-134980282
  
    Are you sure this is a fix for FLINK-2512 ?
    I don't see how this change is fixing any issue.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---