Posted to solr-user@lucene.apache.org by "Chaushu, Shani" <sh...@intel.com> on 2015/03/31 08:31:29 UTC
Spark-Solr in python
Hi,
I saw there is a tool for reading Solr into a Spark RDD in Java.
I want to do something like this in Python. Is there any Python package for reading Solr into a Spark RDD?
Thanks,
Shani
---------------------------------------------------------------------
Intel Electronics Ltd.
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
RE: Spark-Solr in python
Posted by "Chaushu, Shani" <sh...@intel.com>.
There is a Python package for SolrCloud:
https://pypi.python.org/pypi/solrcloudpy
but I don't know whether it is possible to connect it to Spark.
-----Original Message-----
From: Timothy Potter [mailto:thelabdude@gmail.com]
Sent: Tuesday, March 31, 2015 23:15
To: solr-user@lucene.apache.org
Subject: Re: Spark-Solr in python
You'll need a Python lib that uses a Python ZooKeeper client to be SolrCloud-aware so that you can do RDD-like things, such as reading from all shards in a collection in parallel. I'm not aware of any Solr Python libs that are cloud-aware yet, but it would be a good contribution to upgrade https://github.com/toastdriven/pysolr to be SolrCloud-aware.
RE: Spark-Solr in python
Posted by "Davis, Daniel (NIH/NLM) [C]" <da...@nih.gov>.
There is a pull request for that: https://github.com/toastdriven/pysolr/pull/138. Depending on how you install Python modules, you could grab the code for that feature and run that version.
Re: Spark-Solr in python
Posted by Timothy Potter <th...@gmail.com>.
You'll need a Python lib that uses a Python ZooKeeper client to be
SolrCloud-aware so that you can do RDD-like things, such as reading
from all shards in a collection in parallel. I'm not aware of any Solr
Python libs that are cloud-aware yet, but it would be a good contribution
to upgrade https://github.com/toastdriven/pysolr to be SolrCloud-aware.
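
To make "reading from all shards in parallel" concrete, here is a minimal sketch, not a tested integration. It assumes the cluster state has already been fetched from ZooKeeper as JSON in the older single-file /clusterstate.json layout (newer Solr versions may keep state per collection instead). The function name shard_query_urls, the sample hosts, and the "books" collection are hypothetical. It picks one active replica per shard and builds a select URL with distrib=false, so each request reads only that shard's local index.

```python
import json
from urllib.parse import urlencode


def shard_query_urls(cluster_state, collection, query="*:*", rows=10):
    """Return one select URL per shard, each pinned to a single active replica.

    cluster_state is assumed to be the parsed cluster state JSON
    (hypothetical helper, not part of pysolr or solrcloudpy).
    """
    urls = []
    shards = cluster_state[collection]["shards"]
    for shard_name in sorted(shards):
        replicas = shards[shard_name]["replicas"].values()
        active = [r for r in replicas if r.get("state") == "active"]
        if not active:
            raise RuntimeError("no active replica for " + shard_name)
        # Put the shard leader first if it is among the active replicas.
        active.sort(key=lambda r: r.get("leader") != "true")
        replica = active[0]
        # distrib=false keeps the query on this core instead of fanning out.
        params = urlencode(
            {"q": query, "rows": rows, "wt": "json", "distrib": "false"}
        )
        urls.append(
            "{}/{}/select?{}".format(replica["base_url"], replica["core"], params)
        )
    return urls


# Hypothetical cluster state for a two-shard collection named "books".
SAMPLE_STATE = json.loads("""
{"books": {"shards": {
  "shard1": {"replicas": {
    "core_node1": {"state": "active", "leader": "true",
                   "base_url": "http://host1:8983/solr",
                   "core": "books_shard1_replica1"},
    "core_node2": {"state": "down",
                   "base_url": "http://host2:8983/solr",
                   "core": "books_shard1_replica2"}}},
  "shard2": {"replicas": {
    "core_node3": {"state": "active", "leader": "true",
                   "base_url": "http://host2:8983/solr",
                   "core": "books_shard2_replica1"}}}
}}}
""")

if __name__ == "__main__":
    for url in shard_query_urls(SAMPLE_STATE, "books"):
        print(url)
```

In a Spark job, one plausible use is to hand the list to sc.parallelize(urls, len(urls)) and fetch each URL inside a flatMap, so that every partition reads a different shard. Fetching the state itself would need a ZooKeeper client such as kazoo. Both steps are assumptions here and are left out so the sketch runs with the standard library alone.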