Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/05/14 10:59:52 UTC
[GitHub] [incubator-seatunnel] chenhu opened a new issue, #1875: [Bug] [seatunnel-connector-spark-tidb] The dependency "tispark-assembly" should cause "Multiple sources found for parquet xxxxx"
chenhu opened a new issue, #1875:
URL: https://github.com/apache/incubator-seatunnel/issues/1875
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
### What happened
Reading a Hudi data source in Spark 3.x fails with the exception below:
Caused by: org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2, org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat), please specify the fully qualified class name.
I inspected the dependencies with mvn dependency:tree and found that the dependency "tispark-assembly" in the seatunnel-connector-spark-tidb module also pulls in spark-sql as an indirect dependency. This puts multiple versions of spark-sql on the classpath and may cause other unexpected issues in the future.
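One way to confirm which jars register a parquet source is to list every `DataSourceRegister` service file visible on the classpath. The sketch below is only an illustration and not part of the original report: the class name `DataSourceRegisterScan` and its helper methods are hypothetical, and it assumes the conflict comes from more than one jar shipping that service file (Spark discovers data sources through `java.util.ServiceLoader`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical diagnostic helper; the name is an assumption for illustration.
public class DataSourceRegisterScan {
    // Service file that java.util.ServiceLoader consults for Spark data sources.
    static final String SERVICE_FILE =
        "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    // Returns the provider class names in one service file,
    // skipping blank lines and '#' comments.
    static List<String> parseProviders(List<String> lines) {
        List<String> providers = new ArrayList<>();
        for (String line : lines) {
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                providers.add(name);
            }
        }
        return providers;
    }

    // Reads all lines of the resource behind the given URL.
    static List<String> readLines(URL url) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = DataSourceRegisterScan.class.getClassLoader();
        // Each URL is one copy of the service file, i.e. one jar or directory.
        for (URL url : Collections.list(loader.getResources(SERVICE_FILE))) {
            for (String provider : parseProviders(readLines(url))) {
                if (provider.toLowerCase(Locale.ROOT).contains("parquet")) {
                    System.out.println(url + " -> " + provider);
                }
            }
        }
    }
}
```

Run it with the same classpath as the failing job. If both `ParquetDataSourceV2` and `ParquetFileFormat` are printed from different jars, the jar that does not belong to the Spark distribution in use is a likely candidate for exclusion or shading.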
### SeaTunnel Version
2.1.1
### SeaTunnel Config
```conf
nothing
```
### Running Command
```shell
nothing
```
### Error Exception
```log
Caused by: org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2, org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat), please specify the fully qualified class name.
```
### Flink or Spark Version
spark3.1.1 (self-compiled version)
### Java or Scala Version
java8
### Screenshots
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-seatunnel] yezhengli-Mr9 commented on issue #1875: [Bug] [seatunnel-connector-spark-tidb] The dependency "tispark-assembly" should cause "Multiple sources found for parquet xxxxx"
Posted by "yezhengli-Mr9 (via GitHub)" <gi...@apache.org>.
yezhengli-Mr9 commented on issue #1875:
URL: https://github.com/apache/incubator-seatunnel/issues/1875#issuecomment-1467320877
How was this resolved? I am reading `hudi` and hit the same error:
```log
Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.v2.parquet.ParquetDataSourceV2, org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat),
```
I am using Spark 3.3 and Scala 2.12.