Posted to users@kafka.apache.org by karan alang <ka...@gmail.com> on 2022/02/18 07:44:29 UTC
GCP Dataproc - getting error in importing KafkaProducer
Hello All,
I have a GCP Dataproc cluster, and I'm running a Spark Structured Streaming
job on it.
I'm trying to use KafkaProducer to push aggregated data into a Kafka
topic. However, when I import KafkaProducer (from kafka import
KafkaProducer), I get the following error:
```
Traceback (most recent call last):
  File "/tmp/7e27e272e64b461dbdc2e5083dc23202/StructuredStreaming_GCP_Versa_Sase_gcloud.py", line 14, in <module>
    from kafka.producer import KafkaProducer
  File "/opt/conda/default/lib/python3.8/site-packages/kafka/__init__.py", line 23, in <module>
    from kafka.producer import KafkaProducer
  File "/opt/conda/default/lib/python3.8/site-packages/kafka/producer/__init__.py", line 4, in <module>
    from .simple import SimpleProducer
  File "/opt/conda/default/lib/python3.8/site-packages/kafka/producer/simple.py", line 54
    return '<SimpleProducer batch=%s>' % self.async
```
As part of the initialization actions, I'm installing the following:
---
pip install pypi
pip install kafka-python
pip install google-cloud-storage
pip install pandas
---
Additional details on Stack Overflow:
https://stackoverflow.com/questions/71169869/gcp-dataproc-getting-error-in-importing-kafkaproducer
Any ideas on what needs to be done to fix this?
tia!
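A possible reading of the traceback above, not confirmed anywhere in this thread: the last frame stops at `self.async`, and `async` has been a reserved keyword since Python 3.7, so a module that uses it as an attribute name cannot be parsed by the Python 3.8 interpreter shown in the paths. The sketch below only checks that claim against whatever interpreter runs it; `obj` and `async_send` are illustrative names, not taken from the thread.

```python
import sys


def compiles(source: str) -> bool:
    """Return True if `source` parses as valid Python on this interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Attribute access named `async`, mirroring the expression on line 54 of
# kafka/producer/simple.py in the traceback.
old_style = "value = obj.async"
# The same access with a non-reserved name (illustrative, for contrast).
new_style = "value = obj.async_send"

print("python", sys.version_info[:2])
print("obj.async parses:     ", compiles(old_style))
print("obj.async_send parses:", compiles(new_style))
```

If the first check reports False on the cluster's interpreter, the `kafka` package found under site-packages is probably older than that interpreter can parse, whatever `pip install kafka-python` reported.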
Re: GCP Dataproc - getting error in importing KafkaProducer
Posted by Mich Talebzadeh <mi...@gmail.com>.
On Dataproc, the kafka-python package is not installed as standard.
Use sudo su - to become root and install it as above, as root:
root@ctpcluster-m:~# pip list|grep kafka
root@ctpcluster-m:~# pip install kafka-python
Collecting kafka-python
Downloading kafka_python-2.0.2-py2.py3-none-any.whl (246 kB)
|████████████████████████████████| 246 kB 22.0 MB/s
Installing collected packages: kafka-python
Successfully installed kafka-python-2.0.2
hduser@ctpcluster-m: /home/hduser> pip list|grep kafka
kafka-python 2.0.2
HTH
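The `pip list|grep kafka` check above can also be run from Python, which helps when the import itself fails and you want to know what is installed for the interpreter the job uses. This is a minimal sketch using the standard-library `importlib.metadata` (Python 3.8+); including a distribution named `kafka` in the list is an assumption about what else might provide the same import name, not something stated in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


# "kafka-python" is the package installed in this thread. "kafka" is listed
# only as an assumption: a differently named distribution that could own the
# same `kafka` import name.
print(installed_versions(["kafka-python", "kafka"]))
```

Running it with the same Python that launches the Spark driver (for example the conda interpreter under /opt/conda/default seen in the traceback) shows what that environment has, which may differ from what root's `pip` reports.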
view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
https://en.everybodywiki.com/Mich_Talebzadeh
*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.
Re: GCP Dataproc - getting error in importing KafkaProducer
Posted by Mich Talebzadeh <mi...@gmail.com>.
Have you installed the correct package kafka-python?
pip install kafka-python
Collecting kafka-python
Downloading kafka_python-2.0.2-py2.py3-none-any.whl (246 kB)
|████████████████████████████████| 246 kB 1.9 MB/s
Installing collected packages: kafka-python
Successfully installed kafka-python-2.0.2
pip list|grep kafka
kafka-python 2.0.2
python3
Python 3.7.3 (default, Apr 3 2021, 20:42:31)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from kafka import KafkaProducer
>>>
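The session above imports cleanly under Python 3.7.3, while the failing job in the original post loads `kafka` from a Python 3.8 conda path, so the two may not be the same environment. As a hedged sketch, the snippet below reports which interpreter is running and where a top-level module would be loaded from, without importing it (so it works even when the import would fail). The helper name `locate` is made up for illustration.

```python
import sys
from importlib import util


def locate(module_name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = util.find_spec(module_name)
    return None if spec is None else spec.origin


print("interpreter:     ", sys.executable)
print("kafka loads from:", locate("kafka"))
```

Comparing this output between an interactive shell and the submitted job would show whether both resolve `kafka` to the same site-packages directory.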