Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/03/17 09:36:52 UTC

[GitHub] [spark] xy1024xiangyu commented on pull request #17916: [SPARK-20590][SQL] Use Spark internal datasource if multiples are found for the same shorten name

xy1024xiangyu commented on pull request #17916:
URL: https://github.com/apache/spark/pull/17916#issuecomment-800938475


   @HyukjinKwon @cloud-fan, according to the discussion, it seems that the "Multiple sources found for csv" issue has been solved. However, when I run my Java jar, an error occurs.
   The Java code is as follows:
   `DataFrameReader read = spark.read();`
   `JavaRDD<String> stringJavaRDD = read.textFile(inputPath).javaRDD();` 
   
   When I run the Java code in the IDE, the program works well. However, when I use `spark-submit`, I get the following error:
   
   `org.apache.spark.sql.AnalysisException: Multiple sources found for text (org.apache.spark.sql.execution.datasources.v2.text.TextDataSourceV2, org.apache.spark.sql.execution.datasources.text.TextFileFormat), please specify the fully qualified class name.;
       at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:707)
       at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:733)
       at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:248)
       at org.apache.spark.sql.DataFrameReader.text(DataFrameReader.scala:843)
       at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:880)
       at org.apache.spark.sql.DataFrameReader.textFile(DataFrameReader.scala:852)
       at com.three2three.bigfoot.vola.NormalizeSnapshotSigmaAxisImpliedVola.main(NormalizeSnapshotSigmaAxisImpliedVola.java:306)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
       at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:928)
       at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
       at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
       at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
       at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)`
   
   
   
   Changing my code to the following does not help either:
   `DataFrameReader read = spark.read();`
   `JavaRDD<String> stringJavaRDD = read.format("org.apache.spark.sql.execution.datasources.text.TextFileFormat").textFile(inputPath).javaRDD();`
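
To narrow this down, it may help to check whether the classpath used by `spark-submit` contains more than one data source registration file. The following is a minimal sketch, not a fix. It assumes that the short name `text` is resolved through `java.util.ServiceLoader` entries under `META-INF/services/org.apache.spark.sql.sources.DataSourceRegister`. The class name `ServiceFileLister` and its method are hypothetical helpers, not part of Spark.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; not part of Spark.
public class ServiceFileLister {
    // Assumption: short names such as "text" are registered through this service file.
    public static final String REGISTER_FILE =
            "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    // Returns every copy of the named resource visible to the current class loader.
    public static List<URL> findServiceFiles(String resourceName) {
        try {
            ClassLoader loader = Thread.currentThread().getContextClassLoader();
            if (loader == null) {
                loader = ClassLoader.getSystemClassLoader();
            }
            return new ArrayList<>(Collections.list(loader.getResources(resourceName)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Print one line per jar (or directory) that ships a registration file.
        for (URL url : findServiceFiles(REGISTER_FILE)) {
            System.out.println(url);
        }
    }
}
```

Running `main` with the same classpath as the submitted job prints the location of each registration file. If files show up from more than one Spark jar, or from an application jar that bundles its own copy, that could be one reason the lookup sees both `TextDataSourceV2` and `TextFileFormat`.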
   
   Detailed description here: https://stackoverflow.com/questions/66664181/spark-multiple-sources-found-for-text
   
   Any idea how to solve this problem?

