You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zeppelin.apache.org by fireboy1919 <gi...@git.apache.org> on 2016/03/16 16:12:37 UTC

[GitHub] incubator-zeppelin pull request: Persist pool

GitHub user fireboy1919 opened a pull request:

    https://github.com/apache/incubator-zeppelin/pull/782

    Persist pool

    ### What is this PR for?
    This is a step towards creating a REST endpoint from resource pools  https://issues.apache.org/jira/browse/ZEPPELIN-699
    
    It allows resources to be fully serializable so that the object that they contain can be passed around using thrift.
    
    ### What type of PR is it?
    Improvement
    
    ### Todos
    * [X] - Add serialization for rest endpoint
      [] - Create a REST endpoint
      [] - Create a pluggable API that accepts different kinds of resource pools.
      [] - Create a type of resource pool that persists.
    
    ### What is the Jira issue?
    Partially implements this:  https://issues.apache.org/jira/browse/ZEPPELIN-699
    
    Note, that this pull request is dependent upon this one:  
    https://github.com/apache/incubator-zeppelin/pull/781
    
    Before that change is made, this will not compile.
    
    ### How should this be tested?
    There is a test class "org.apache.zeppelin.ResourceSerializerTest", but with no front-end, there isn't an easy manual way to test this.
    
    However, these steps will work:
    
    1. Start Zeppelin
    2. Run a paragraph.
    3.  Inspect "org.apache.zeppelin.resource.ResourcePoolUtils.getAllResources()".  You will see that this paragraph's results are included in this resource pool.
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Does the licenses files need update? No
    * Is there breaking changes for older versions? No
    * Does this needs documentation?  No.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fireboy1919/incubator-zeppelin persist_pool

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-zeppelin/pull/782.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #782
    
----
commit cbdb22d309948eaa2d7562c66c3b8780a6581647
Author: Rusty Phillips <ru...@cloudspace.com>
Date:   2016-03-15T23:11:20Z

    Added serialization to resourcepoolutils.

commit ffd092b6e7df495ecd1f4e753e3e88678bae2ead
Author: Rusty Phillips <ru...@cloudspace.com>
Date:   2016-03-15T23:06:35Z

    Added pool persistance.

commit 84cabe2d02a7cf49a82f56d8ae8cbbecb6699aaf
Author: Rusty Phillips <ru...@cloudspace.com>
Date:   2016-03-16T14:59:29Z

    Added test.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-198111341
  
    @fireboy1919 are the ToDos list of items completed?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-198610147
  
    It might help to break this up into smaller PR to help review and getting it merged.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-199269648
  
    That's why I only have the first TODO done:  it's the smallest case I could think of.  
    
    It's only the serializer and the place where it gets used and nothing else.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-204451849
  
    @Leemoonsoo:  does this address everything?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-200179469
  
    looks good, @Leemoonsoo what do you think?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-199814378
  
    I have finished work on the other three tasks on the list that I mentioned above - which are not included to make the request easier to review.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-202459832
  
    Thanks for the understanding.
    
    I think term 'resource', 'remote resource' are bit confusing in this conversation. if i define, 
    
    Resource - Represents a single resource. [Resource.java](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/Resource.java) implements local resource that user object exists in current interpreter process,  [RemoteResource.java](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/RemoteResource.java) implements remote resource that user object exists in the other interpreter process.
    User object - user object that Resource holds. Internally field [Object r](https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/Resource.java#L26)
    
    Then, can i understand your last sentence as 
    
    Modify "ResourcePoolUtils.getAllResourcesExcept" to return set of User objects instead of set of Resources?
    
    or
    
    Modify "ResourcePoolUtils.getAllResourceExcept" to return set of RemoteResource instead of set of Resources?
    
    which one is correct?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by Leemoonsoo <gi...@git.apache.org>.
Github user Leemoonsoo commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-201719098
  
    ResourcePool supposed to allow any resource. From not serializable object such as SparkContext, to very large object. `Object r` field in `class Resource` holds user object.
    
    This PR makes `Object r` serialized when `class Resource` is serialized. I think it will brings following problems,
    
    - DistributedResourcePool.getAll() method returns list of `Resource` of all remote interpreters. And this change will serialize all user object pointed by `Object r`, regardless of user want to access or not.
    - It'll make some error on DistributedResourcePool.getAll() when user put some unserializable object.
    
    Current implementation supposed to read user object from remote process only when Resource.get() method is called explicitly. please see https://github.com/apache/incubator-zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/resource/DistributedResourcePool.java#L72
    
    I think field `Object r` in the class `Resource` remains as transitive and shouldn't be serialized with `class Resource` for efficiency.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-202411857
  
    That makes sense.  The issue is that "ResourcePoolUtils" currently doesn't use that enhanced method, and if you're trying to get a specific resource, that looks like the proper path to do it.
    
    How about I create a pull request that modified "ResourcePoolUtils.getAllResourcesExcept" so that the returned resources from the remote interpreter process are actually remote resources?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-202581884
  
    The second one.
    
    And I have a solution that does that in this pull request now.
    
    I didn't remove the interpreter serializer from this pull request because we may still want something like that to save resources to persistent storage.  Since User objects are to be transient, we'd need something like that to explicitly make them *not* transient for when we want to store them.
    
    I therefore modified the serializer to that purpose.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-198351846
  
    I was going to make a different request for each of those.  I have one ready for the second item, too, but i figured that smaller was better.
    
    Should I change the PR or the description of it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-zeppelin pull request: Enhanced resource pool serializat...

Posted by fireboy1919 <gi...@git.apache.org>.
Github user fireboy1919 commented on the pull request:

    https://github.com/apache/incubator-zeppelin/pull/782#issuecomment-198048450
  
    @Leemoonsoo:  This is ready to go, and is based upon your previous work.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---