Posted to dev@zeppelin.apache.org by oxygen311 <gi...@git.apache.org> on 2018/07/11 12:12:45 UTC

[GitHub] zeppelin pull request #3064: [ZEPPELIN-3596] Saving resources from pool to S...

GitHub user oxygen311 opened a pull request:

    https://github.com/apache/zeppelin/pull/3064

    [ZEPPELIN-3596] Saving resources from pool to SQL

    ### What is this PR for?
    > There is information in the ResourcePool, but it is only available for viewing. It would be nice if we could save this data to SQL via the JDBC driver.
    Now you can access `ResourcePool` data from a SQL query. For example, `SELECT * FROM {ResourcePool.note_id=SOME_NOTE_ID.paragraph_id=SOME_PARAGRAPH_ID};` is now a valid SQL query.
    How it works:
    1. A table named after the `PARAGRAPH_ID` is created and filled;
    2. The `{*}` expression is replaced with the actual table name;
    3. The updated SQL query is run.
    You can also omit the note_id, in which case the note_id of the current notebook is used. For example, `SELECT * FROM {ResourcePool.paragraph_id=SOME_PARAGRAPH_ID};`.
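
The placeholder rewrite described in steps 2 and 3 can be sketched roughly as follows. This is only an illustration in Python, not the PR's code (which is not shown in this thread): the `PLACEHOLDER` pattern, the `rewrite_query` function, and the `rp_` table-name prefix are all hypothetical names chosen for the example.

```python
import re

# Hypothetical pattern for the placeholder syntax shown in the description;
# the PR's actual parser may accept a different grammar.
PLACEHOLDER = re.compile(
    r"\{ResourcePool\.(?:note_id=(?P<note>\w+)\.)?paragraph_id=(?P<para>[\w-]+)\}"
)


def rewrite_query(sql, current_note_id):
    """Replace each placeholder with a table name and list the resources used."""
    resources = []

    def substitute(match):
        # An omitted note_id falls back to the id of the current note.
        note_id = match.group("note") or current_note_id
        paragraph_id = match.group("para")
        resources.append((note_id, paragraph_id))
        # Assumption: the table name is derived from the paragraph id. The
        # "rp_" prefix and the character replacement are only there to keep
        # the result a valid unquoted SQL identifier.
        return "rp_" + re.sub(r"\W", "_", paragraph_id)

    return PLACEHOLDER.sub(substitute, sql), resources
```

The returned list of `(note_id, paragraph_id)` pairs tells the caller which resources would have to be loaded into tables before the rewritten query is run.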
    
    ### What type of PR is it?
    Improvement
    
    ### What is the Jira issue?
    [Zeppelin 3596](https://issues.apache.org/jira/projects/ZEPPELIN/issues/ZEPPELIN-3596?filter=allopenissues)
    
    ### Questions:
    * Do the license files need an update?
    * Are there breaking changes for older versions?
    * Does this need documentation?


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/TinkoffCreditSystems/zeppelin ZEPPELIN-3596

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/3064.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3064
    
----
commit 6033d8fddb4b7f21f268a79143baa5903bf6c759
Author: oxygen311 <al...@...>
Date:   2018-07-10T10:18:37Z

    Add `preparePoolData` function

commit 882a4ee9aa128d2d06d85841a4ea50e4d01f4aaa
Author: oxygen311 <al...@...>
Date:   2018-07-10T16:06:45Z

    Code refactoring

commit af8747b6c19e7ab94e23352ab8162d72c9a19fe0
Author: oxygen311 <al...@...>
Date:   2018-07-11T09:41:28Z

    Refactoring

commit 1a32689b101d5974f5e31d1d883befe9d0142eec
Author: oxygen311 <al...@...>
Date:   2018-07-11T11:41:03Z

    Add tests

commit d5e57ad5cc5e19b77aded00efcc2a567283f1b1b
Author: oxygen311 <al...@...>
Date:   2018-07-11T11:55:47Z

    Remove assignment operation from cycle

----


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by oxygen311 <gi...@git.apache.org>.
Github user oxygen311 commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    Scenario: get a list of client IDs from one source (flat file, database, SAP), send the list to the database, and perform complex analytics with it.
    The data in the database is large, so there is no way to pull everything into Python and run the analytics there.


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by oxygen311 <gi...@git.apache.org>.
Github user oxygen311 commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    @zjffdu 
    The user can transfer data not only from one database to another, but also from flat files and from interpreters that are not available in Python (e.g. SAP).
    Sending data through `ResourcePool` is not as efficient as native methods, but it can cover up to 80% of user needs.



---

[GitHub] zeppelin pull request #3064: [ZEPPELIN-3596] Saving resources from pool to S...

Posted by oxygen311 <gi...@git.apache.org>.
GitHub user oxygen311 reopened a pull request:

    https://github.com/apache/zeppelin/pull/3064

    [ZEPPELIN-3596] Saving resources from pool to SQL

    ### What is this PR for?
    > There is information in the ResourcePool, but it is only available for viewing. It would be nice if we could save this data to SQL via the JDBC driver.
    
    Now you can access `ResourcePool` data from a SQL query. For example, `SELECT * FROM {ResourcePool.note_id=SOME_NOTE_ID.paragraph_id=SOME_PARAGRAPH_ID};` is now a valid SQL query.
    How it works:
    1. A table named after the `PARAGRAPH_ID` is created and filled;
    2. The `{*}` expression is replaced with the actual table name;
    3. The updated SQL query is run.
    
    You can also omit the note_id, in which case the note_id of the current notebook is used. For example, `SELECT * FROM {ResourcePool.paragraph_id=SOME_PARAGRAPH_ID};`.
    
    ### What type of PR is it?
    Improvement
    
    ### What is the Jira issue?
    [Zeppelin 3596](https://issues.apache.org/jira/projects/ZEPPELIN/issues/ZEPPELIN-3596?filter=allopenissues)
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? No



---

[GitHub] zeppelin pull request #3064: [ZEPPELIN-3596] Saving resources from pool to S...

Posted by oxygen311 <gi...@git.apache.org>.
Github user oxygen311 closed the pull request at:

    https://github.com/apache/zeppelin/pull/3064


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    @oxygen311 I am not sure how useful and important this scenario is for users. It looks a little weird to me that you have to insert the paragraph result into the database and then execute the query.
    Can you share more details about your scenario?
    
    BTW, I started a thread on the mailing list to discuss sharing data in Zeppelin; any comments on that are welcome.
    https://lists.apache.org/thread.html/36d5c74bc946df2ed2ab456227251e94c1d88635c6acbc69d9a8c4f1@%3Cdev.zeppelin.apache.org%3E



---

[GitHub] zeppelin pull request #3064: [ZEPPELIN-3596] Saving resources from pool to S...

Posted by oxygen311 <gi...@git.apache.org>.
Github user oxygen311 closed the pull request at:

    https://github.com/apache/zeppelin/pull/3064


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by sanjaydasgupta <gi...@git.apache.org>.
Github user sanjaydasgupta commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    There are similarities between the two features that all users will notice and wonder about, so good documentation is needed to avoid confusion. The substitution syntax {...} used in this feature is the same as in ZEPPELIN-3377, but there is almost no chance that the wrong feature will be invoked. More specifically, even if z-variable interpolation is enabled (zeppelin.jdbc.interpolation = true), the contents of {...} will be passed through safely, since it is virtually impossible for a user to produce the exact string that activates ZEPPELIN-3377's code.


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    Thanks @oxygen311 for the contribution, but I think this is already covered by ZEPPELIN-3377 via ZeppelinContext variable interpolation.


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    @oxygen311 Could you share more details about your scenario? Which interpreter produces your source data (ResourcePool.note_id=SOME_NOTE_ID.paragraph_id=SOME_PARAGRAPH_ID)?


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by oxygen311 <gi...@git.apache.org>.
Github user oxygen311 commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    @zjffdu variable interpolation only covers text variables. With this feature you can turn `%table` text from the ResourcePool into a SQL table.
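
As a rough illustration of what turning `%table` text into a SQL table could involve, here is a minimal Python sketch. It relies on Zeppelin's `%table` display format (a tab-separated header line followed by tab-separated rows) and uses the standard-library `sqlite3` module as a stand-in for the JDBC target database. The `load_table_text` and `quote_ident` helpers are hypothetical, every column is stored as TEXT, and the PR's actual Java implementation may differ in all of these respects.

```python
import sqlite3


def quote_ident(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def load_table_text(conn, table_name, table_text):
    """Create table_name from %table text and insert its rows.

    Assumes well-formed input: every row has as many cells as the header.
    Returns the number of rows inserted.
    """
    body = table_text.strip("\n")
    if body.startswith("%table"):
        # Drop the display directive that precedes the header.
        body = body[len("%table"):].lstrip(" \n")
    lines = body.split("\n")
    header = lines[0].split("\t")
    rows = [line.split("\t") for line in lines[1:] if line]

    # Assumption for this sketch: all columns are plain TEXT.
    columns = ", ".join(quote_ident(col) + " TEXT" for col in header)
    conn.execute(f"CREATE TABLE {quote_ident(table_name)} ({columns})")

    marks = ", ".join("?" for _ in header)
    conn.executemany(
        f"INSERT INTO {quote_ident(table_name)} VALUES ({marks})", rows
    )
    return len(rows)
```

Once the table exists, an ordinary query (for example a join against a large table that already lives in the database) can reference it by name, which matches the client-ID scenario described in this thread.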


---

[GitHub] zeppelin issue #3064: [ZEPPELIN-3596] Saving resources from pool to SQL

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/3064
  
    ping @oxygen311 


---