Posted to users@kafka.apache.org by shikha sharma <sh...@gmail.com> on 2022/11/14 11:01:51 UTC

Connection issue

Hello,

I am trying to connect to Kafka using this code:
orderRawData = spark.readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", "18.211.252.152:9092") \
    .option("startingOffsets","earliest") \
    .option("failOnDataLoss", "false") \
    .option("subscribe", "real-time-project") \
    .load()

It fails with the following error:

'Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;'
Traceback (most recent call last):
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 400, in load
    return self._df(self._jreader.load())
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
    raise AnalysisException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.AnalysisException: 'Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;'




Could you please help me with this?

Re: Connection issue

Posted by Luke Chen <sh...@gmail.com>.
Hi shikha,

I think you should ask the Spark community.
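
That said, the error message itself points at a likely cause: "Failed to find data source: kafka" generally means the Spark Kafka connector (spark-sql-kafka-0-10) is not on the application's classpath, which is what the deployment section of the integration guide covers. Below is a minimal sketch of how the connector could be pulled in. The version numbers "2.4.8" and "2.11" and the helper names are assumptions for illustration only; they must be replaced with the Spark and Scala versions actually installed on the cluster.

```python
def kafka_connector_coordinate(spark_version, scala_version):
    """Build the Maven coordinate of the Spark Structured Streaming Kafka connector.

    Hypothetical helper: both versions must match the installed Spark build.
    """
    return "org.apache.spark:spark-sql-kafka-0-10_{}:{}".format(
        scala_version, spark_version
    )


def build_session(spark_version, scala_version, app_name="kafka-reader"):
    """Create a SparkSession that asks Spark to fetch the Kafka connector.

    PySpark is imported lazily so the helper above works without it installed.
    The setting only takes effect if no Spark JVM is already running.
    """
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config(
            "spark.jars.packages",
            kafka_connector_coordinate(spark_version, scala_version),
        )
        .getOrCreate()
    )


# Assumed versions, for illustration only.
PACKAGE = kafka_connector_coordinate("2.4.8", "2.11")

# Equivalent command-line form; your_script.py is a placeholder name.
print("spark-submit --packages {} your_script.py".format(PACKAGE))
```

With the connector available, the readStream code from the original question should at least get past the "Failed to find data source" stage; whether the broker is then reachable is a separate matter.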

Thanks
Luke
