Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2021/10/26 22:27:38 UTC

[GitHub] [nifi] patricker opened a new pull request #3511: NIFI-6175 Spark Livy - Improving Livy

patricker opened a new pull request #3511:
URL: https://github.com/apache/nifi/pull/3511


   #### Description of PR
   
   The Livy Session Controller is missing many of the options Livy makes available, and I feel several of them are critical for this service to be useful (queue? conf? number of executors?).
   
   Also adds functionality to shut down open sessions when the service is disabled.
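
For readers unfamiliar with the options mentioned above, here is a minimal sketch of what a session-creation request and the matching cleanup might look like. It is not the PR's Java code. The field names `kind`, `queue`, `numExecutors`, and `conf` follow the Livy REST API for `POST /sessions` as I understand it; the helper names `build_session_request` and `session_delete_paths` are hypothetical.

```python
def build_session_request(kind="pyspark", queue=None, num_executors=None, conf=None):
    """Build a JSON-serializable body for Livy's POST /sessions.

    Options left as None are omitted so that Livy falls back to its defaults.
    """
    body = {"kind": kind}
    if queue is not None:
        body["queue"] = queue                # YARN queue to submit to
    if num_executors is not None:
        body["numExecutors"] = num_executors
    if conf:
        body["conf"] = dict(conf)            # arbitrary Spark configuration
    return body


def session_delete_paths(session_ids):
    """Return one DELETE path per open session (assumed: DELETE /sessions/{id})."""
    return [f"/sessions/{sid}" for sid in session_ids]


if __name__ == "__main__":
    print(build_session_request(queue="etl", num_executors=4,
                                conf={"spark.executor.memory": "2g"}))
    print(session_delete_paths([3, 7]))
```

On disable, a service following this sketch would issue an HTTP DELETE for each returned path; the HTTP client and error handling are left out here.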
   
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [x] Has your PR been rebased against the latest commit within the target branch (typically `master`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [ ] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [x] Have you written or updated unit tests to verify your changes?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [ ] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@nifi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] patricker commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
patricker commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-952173247


   Hey @pvillard31. I've got some bad news: I changed companies and no longer have access to a Hadoop environment, let alone a Livy environment :(. I'd still love to see this PR get rebased and merged, but I don't think I'm the one to do it anymore.





[GitHub] [nifi] naru014 edited a comment on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
naru014 edited a comment on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-644564279


   > @naru014 Yes, it can. It also includes a batch processing controller service/processor. The one that comes with NiFi currently has severe limitations. I keep forgetting to get this one merged. I just need to update it for the latest version of NiFi.
   
   @patricker Thank you. We have built this and included it in NiFi. But when using one Livy session controller service per NiFi flow, the flows in some cases don't use the session created by the controller for that flow, but any other available session. Is this expected? So currently, instead of creating a Livy session controller per flow, I have created one Livy session controller on the parent NiFi flow, and it is inherited by all the child flows. That way, the NiFi cluster has, say, a minimum of 2 sessions available, and any running flow uses whichever one is free.
   Please let me know if there is a better way to use it.





[GitHub] [nifi] mattyb149 closed pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
mattyb149 closed pull request #3511:
URL: https://github.com/apache/nifi/pull/3511


   





[GitHub] [nifi] naru014 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
naru014 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-639404592


   @patricker Can the contents of this PR be used to build the Livy session controller and Spark interactive processor? I have a use case to submit PySpark jobs from NiFi, but the controller service and processor that come with NiFi have limitations.





[GitHub] [nifi] github-actions[bot] closed pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #3511:
URL: https://github.com/apache/nifi/pull/3511


   





[GitHub] [nifi] naru014 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
naru014 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-644564279


   > @naru014 Yes, it can. It also includes a batch processing controller service/processor. The one that comes with NiFi currently has severe limitations. I keep forgetting to get this one merged. I just need to update it for the latest version of NiFi.
   
   @patricker Thank you. We have built this and included it in NiFi. But when using one Livy session controller service per NiFi flow, the flows in some cases don't use the session created by the controller for that flow, but any other available session. Is this expected? So currently, instead of creating a Livy session controller per flow, I have created one Livy session controller on the parent NiFi flow, and it is inherited by all the child flows. That way, the NiFi cluster has, say, a minimum of 2 sessions available, and any running flow uses whichever one is free.
   Please let me know if there is a better way to use it.





[GitHub] [nifi] naru014 edited a comment on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
naru014 edited a comment on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-647270637


   > @naru014 Yes, this is expected. This is somewhat a limitation of Livy 0.5.0. Livy will not let you find sessions by name until v0.6.0. This version is available, but I don't have it in my environment.
   > 
   > There are two workarounds I've used.
   > 
   > * Use different accounts to run the session controllers.  The code checks whether the username matches between the session it grabs from Livy and the configuration.
   > * Use multiple Livy servers; we used two different name nodes and set up an instance on each.
   > 
   > If you have Livy 0.6.0 you could update the code to work with session names. This would easily work around your issue, each session in a session controller could be named `session-1`, `session-2`, etc...
   > 
   > Note: Livy never updated the documentation when they added this feature ([apache/incubator-livy#48](https://github.com/apache/incubator-livy/pull/48)). But it looks like you just replace the sessionId with the name when making calls.
   
   @patricker Thank you. 





[GitHub] [nifi] pvillard31 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-951759990


   @patricker - I know this PR has been automatically closed but I'm currently playing with NiFi and Livy and if you want to re-open it and rebase it, happy to review/test this one.





[GitHub] [nifi] github-actions[bot] closed pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #3511:
URL: https://github.com/apache/nifi/pull/3511


   





[GitHub] [nifi] naru014 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
naru014 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-647270637


   > @naru014 Yes, this is expected. This is somewhat a limitation of Livy 0.5.0. Livy will not let you find sessions by name until v0.6.0. This version is available, but I don't have it in my environment.
   > 
   > There are two workarounds I've used.
   > 
   > * Use different accounts to run the session controllers.  The code checks whether the username matches between the session it grabs from Livy and the configuration.
   > * Use multiple Livy servers; we used two different name nodes and set up an instance on each.
   > 
   > If you have Livy 0.6.0 you could update the code to work with session names. This would easily work around your issue, each session in a session controller could be named `session-1`, `session-2`, etc...
   > 
   > Note: Livy never updated the documentation when they added this feature ([apache/incubator-livy#48](https://github.com/apache/incubator-livy/pull/48)). But it looks like you just replace the sessionId with the name when making calls.
   
   Thank you. 





[GitHub] [nifi] pvillard31 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-952177293


   Thanks for the feedback @patricker - will try to have a look if I get the chance.
   Good luck / have fun at your new job - there is always some room for NiFi :)





[GitHub] [nifi] patricker commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
patricker commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-642694758


   @naru014 Yes, it can. It also includes a batch processing controller service/processor.  The one that comes with NiFi currently has severe limitations.  I keep forgetting to get this one merged. I just need to update it for the latest version of NiFi.
   
   





[GitHub] [nifi] github-actions[bot] commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-826171173


   We're marking this PR as stale due to lack of updates in the past few months. If after another couple of weeks the stale label has not been removed this PR will be closed. This stale marker and eventual auto close does not indicate a judgement of the PR just lack of reviewer bandwidth and helps us keep the PR queue more manageable.  If you would like this PR re-opened you can do so and a committer can remove the stale tag.  Or you can open a new PR.  Try to help review other PRs to increase PR review bandwidth which in turn helps yours.





[GitHub] [nifi] github-actions[bot] closed pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #3511:
URL: https://github.com/apache/nifi/pull/3511


   





[GitHub] [nifi] github-actions[bot] commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-917507519


   We're marking this PR as stale due to lack of updates in the past few months. If after another couple of weeks the stale label has not been removed this PR will be closed. This stale marker and eventual auto close does not indicate a judgement of the PR just lack of reviewer bandwidth and helps us keep the PR queue more manageable.  If you would like this PR re-opened you can do so and a committer can remove the stale tag.  Or you can open a new PR.  Try to help review other PRs to increase PR review bandwidth which in turn helps yours.





[GitHub] [nifi] patricker commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
patricker commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-644782068


   @naru014 Yes, this is expected.  This is somewhat a limitation of Livy 0.5.0.  Livy will not let you find sessions by name until v0.6.0.  This version is available, but I don't have it in my environment.
   
   There are two workarounds I've used.
    - Use different accounts to run the session controllers.  The code checks whether the username matches between the session it grabs from Livy and the configuration.
    - Use multiple Livy servers; we used two different name nodes and set up an instance on each.
   
   If you have Livy 0.6.0 you could update the code to work with session names.  This would easily work around your issue, each session in a session controller could be named `session-1`, `session-2`, etc...
   
   Note: Livy never updated the documentation when they added this feature (https://github.com/apache/incubator-livy/pull/48).  But it looks like you just replace the sessionId with the name when making calls.
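
The two selection strategies described above (filtering by username today, looking up by name on Livy 0.6.0+) can be sketched as follows. This is only an illustration, not the controller service's actual Java code. It assumes the session listing has the shape `{"sessions": [...]}` and that each session carries `owner` and, on 0.6.0+, `name` fields; the sample data and the helper names `sessions_for_owner` and `session_by_name` are hypothetical.

```python
def sessions_for_owner(listing, owner):
    """Keep only sessions whose owner matches the configured username."""
    return [s for s in listing.get("sessions", []) if s.get("owner") == owner]


def session_by_name(listing, name):
    """Return the session with the given name, or None if it is absent.

    Sessions created without a name (or listed by a pre-0.6.0 server) have
    no usable "name" value, so they never match.
    """
    for session in listing.get("sessions", []):
        if session.get("name") == name:
            return session
    return None


if __name__ == "__main__":
    # Hand-written sample standing in for a session listing from Livy.
    listing = {
        "sessions": [
            {"id": 0, "owner": "flow_a", "name": "session-1", "state": "idle"},
            {"id": 1, "owner": "flow_b", "name": "session-2", "state": "idle"},
            {"id": 2, "owner": "flow_a", "state": "busy"},
        ]
    }
    print([s["id"] for s in sessions_for_owner(listing, "flow_a")])
    print(session_by_name(listing, "session-2"))
```

Filtering by owner only isolates controllers that run under different accounts, which is why the first workaround needs separate users; a name lookup removes that constraint.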





[GitHub] [nifi] pvillard31 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
pvillard31 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-841119994


   @patricker - any chance you can rebase this against main/latest?





[GitHub] [nifi] mattyb149 commented on pull request #3511: NIFI-6175 Spark Livy - Improving Livy

Posted by GitBox <gi...@apache.org>.
mattyb149 commented on pull request #3511:
URL: https://github.com/apache/nifi/pull/3511#issuecomment-992959769


   Closing in favor of #5601 as it incorporates these changes and continues forward

