
Posted to reviews@spark.apache.org by "amaliujia (via GitHub)" <gi...@apache.org> on 2023/02/16 00:22:37 UTC

[GitHub] [spark] amaliujia commented on a diff in pull request #40025: [SPARK-42457][CONNECT] Adding SparkSession#read

amaliujia commented on code in PR #40025:
URL: https://github.com/apache/spark/pull/40025#discussion_r1107880879


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -184,6 +194,38 @@ class PlanGenerationTestSuite extends ConnectFunSuite with BeforeAndAfterAll wit
     session.range(1, 10, 1, 2)
   }
 
+  test("read") {
+    session.read.format("text")
+      .schema(StructType(StructField("name", StringType) :: StructField("age", IntegerType) :: Nil))
+      .option("op1", "op1")
+      .options(Map("op2" -> "op2"))
+      .load(testDataPath.resolve("people.txt").toString)

Review Comment:
   I might be wrong, but I think that if you provide a schema in the proto, the server side might not need to load the data: when no schema is set, it loads the data because it has to infer the schema by reading the data directly. I am not sure, though, whether there are other cases besides this one where the server side has to read the data directly.
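
   To make the reasoning above concrete, here is a minimal toy sketch of the idea, not the actual Spark Connect server logic. Every name in it (`SchemaResolutionSketch`, `Field`, `CountingSource`, `resolveSchema`) is hypothetical and exists only for illustration. It assumes the only reason to touch the data at planning time is schema inference, which is exactly the point the comment is unsure about.

```scala
// Toy model only: none of these names are Spark or Spark Connect APIs.
object SchemaResolutionSketch {
  // Hypothetical, simplified stand-in for a schema field.
  final case class Field(name: String, dataType: String)

  // Hypothetical data source that counts how many times it is scanned.
  final class CountingSource(lines: Seq[String]) {
    var scans: Int = 0
    def scan(): Seq[String] = { scans += 1; lines }
  }

  // Use the caller-supplied schema when present; otherwise infer one
  // by reading the first line of the data. getOrElse takes its argument
  // by name, so the source is not scanned when a schema is supplied.
  def resolveSchema(userSchema: Option[Seq[Field]], source: CountingSource): Seq[Field] =
    userSchema.getOrElse {
      val columns = source.scan().headOption.map(_.split(",").toSeq).getOrElse(Seq.empty)
      columns.indices.map(i => Field(s"_c$i", "string"))
    }
}
```

   In this sketch, calling `resolveSchema(Some(schema), source)` leaves `source.scans` at 0, while `resolveSchema(None, source)` scans the source once. Whether the real server has other reasons to read the data is left open here, as in the comment.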



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

