Posted to issues@lucene.apache.org by "Jason Gerlowski (Jira)" <ji...@apache.org> on 2021/02/26 18:09:00 UTC

[jira] [Comment Edited] (SOLR-15080) Apache Zeppelin Sandbox Integration

    [ https://issues.apache.org/jira/browse/SOLR-15080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291849#comment-17291849 ] 

Jason Gerlowski edited comment on SOLR-15080 at 2/26/21, 6:08 PM:
------------------------------------------------------------------

I've attached an updated version of this patch.  This version massages the CLI syntax to more closely fit other {{bin/solr}} commands.  It also cleans up some issues around starting/stopping Zeppelin and puts better help text in place (which in turn dragged in a few small SolrCLI refactors).  

bq. We drag in some extra interpreters like kotlin and influxdb, in a "perfect world" we wouldn't worry about them.

I took a look at this.  Zeppelin offers two downloads - one that includes *all* interpreters, and one that only includes a minimal set.  I assumed I'd accidentally used the former instead of the latter, but it turns out that the patch *does* use the minimal download (it's just not all that minimal).  I'm going to open a Zeppelin ticket to discuss slimming down the minimal distribution further, but we're stuck with it for the current Zeppelin release at least.

Still definitely on my list are testing on Windows and a fix for the {{update_interpreter}} subcommand.  If I can clear those away soon, I'll be looking to merge in the next week or so, so I'd love any testing help that people could offer on their own systems.  Eric, I think I addressed most of your feedback (other than creating additional zeppelin-solr commands, which we can handle independently of the integration here), but if I missed something or you've got more suggestions, let me know!

----

To get back to the question around making the nyc311 dataset available: I definitely agree that we should allow that, but I'm unsure about the approach, so I'd rather tackle it in a separate ticket.

I think I mentioned earlier potentially exposing this using bin/solr's {{-e example}} mechanism, but on second thought I'm less sure of this approach.  Currently, Solr "examples" couple together the node/core topology with the dataset (e.g. {{-e techproducts}} can only be used with Solr standalone), which is less than ideal.  Ideally you could run something like {{bin/solr example}} to set up a particular topology or deployment config, and then have a command like {{bin/solr exampledata}} capable of loading datasets into any of the example topologies.
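
To make the decoupling idea a bit more concrete, here's a rough sketch (Python, purely illustrative).  Nothing in it exists in Solr today: the {{example}} and {{exampledata}} subcommands are only the hypothetical ones floated above, and the topology names, dataset names, and the argument syntax in the generated command strings are all assumptions made up for the illustration.

```python
from dataclasses import dataclass
from itertools import product

# Everything below is hypothetical: it only models the *idea* of keeping
# topologies and datasets in separate registries so that any dataset can be
# loaded into any topology.


@dataclass(frozen=True)
class Topology:
    name: str   # assumed label, e.g. "standalone" or "cloud"
    nodes: int  # number of Solr nodes the topology would start


@dataclass(frozen=True)
class Dataset:
    name: str        # assumed label, e.g. "techproducts" or "nyc311"
    collection: str  # collection/core the data would be loaded into


TOPOLOGIES = {
    "standalone": Topology("standalone", nodes=1),
    "cloud": Topology("cloud", nodes=2),
}

DATASETS = {
    "techproducts": Dataset("techproducts", collection="techproducts"),
    "nyc311": Dataset("nyc311", collection="nyc311"),
}


def plan(topology_name: str, dataset_name: str) -> list[str]:
    """Return the two independent steps a user would run.

    The command strings are illustrative only; the argument syntax is an
    assumption, not an existing bin/solr interface.
    """
    topology = TOPOLOGIES[topology_name]
    dataset = DATASETS[dataset_name]
    return [
        f"bin/solr example {topology.name}",
        f"bin/solr exampledata {dataset.name}",
    ]


def all_combinations() -> list[tuple[str, str]]:
    """Every (topology, dataset) pairing is valid once the two are decoupled."""
    return list(product(TOPOLOGIES, DATASETS))
```

The point of the sketch is just that, with two independent registries, adding a dataset like nyc311 wouldn't require defining a new "example" per topology.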

Anyway, I'm going to punt on this for now to avoid any sort of rush on sorting that out.



> Apache Zeppelin Sandbox Integration  
> -------------------------------------
>
>                 Key: SOLR-15080
>                 URL: https://issues.apache.org/jira/browse/SOLR-15080
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: SOLR-15080.patch, SOLR-15080.patch
>
>
> With the steady expansion of Solr's "Math Expression" and "Streaming Expression" libraries, Solr has a lot of analytics and data exploration capabilities to show off in a "notebook" environment.  Case in point - the "Visual Guide to Math Expressions" being worked on in SOLR-13105.  These docs make heavy use of screenshots taken from Zeppelin, a popular notebook project run by the ASF.  Interested readers are going to want to try their own hand at replicating the specific visualizations shown off in those docs, and at using Solr's analytics capabilities more broadly.
> Zeppelin isn't hard to set up and run, but there are a few steps that might deter or thwart unfamiliar users.  I'd love to see Solr make this easier by offering some sort of integration point with Zeppelin to get users up and running.
> I'm still up in the air on what form would be best for such an integration.  But as a strawman I've attached a patch that creates a "zeppelin" tool for "bin/solr".
> This tool is in the same spirit as our Solr "examples" in that it sets a user up to play with a particular use case without any fuss or configuration on their part.  It will install Zeppelin, the Zeppelin "interpreter" needed to talk to Solr, and the Zeppelin configs necessary to talk to a local Solr.  It contains other commands to start/stop Zeppelin and clean out the Zeppelin sandbox, but draws the line there in terms of exposing Zeppelin functionality more broadly.


