Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/27 06:51:44 UTC

[GitHub] [spark] LuciferYang opened a new pull request, #39233: [SPARK-41676][CORE][SQL] Protobuf serializer for `StreamingQueryData`

LuciferYang opened a new pull request, #39233:
URL: https://github.com/apache/spark/pull/39233

   ### What changes were proposed in this pull request?
   Add a Protobuf serializer for `StreamingQueryData`.
   
   
   
   ### Why are the changes needed?
   Support fast and compact serialization/deserialization for `StreamingQueryData` over RocksDB.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Added a new unit test.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS][UI] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059228544


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/ui/StreamingQueryStatusListener.scala:
##########
@@ -115,7 +115,7 @@ private[sql] class StreamingQueryStatusListener(
   }
 }
 
-private[sql] class StreamingQueryData(
+private[spark] class StreamingQueryData(

Review Comment:
   [15c0cf9](https://github.com/apache/spark/pull/39233/commits/15c0cf996ce4b0eff7d0168915fb9a9ccb6938b6) changes `private[sql]` to `private[spark]` and moves `StreamingQueryDataSerializer` to the package `org.apache.spark.status.protobuf.sql`.





[GitHub] [spark] LuciferYang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059219926


##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/protobuf/KVStoreProtobufSerializerSuite.scala:
##########
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf
+
+import java.util.UUID
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.streaming.ui.StreamingQueryData
+import org.apache.spark.status.protobuf.KVStoreProtobufSerializer
+
+class KVStoreProtobufSerializerSuite extends SparkFunSuite {
+
+  private val serializer = new KVStoreProtobufSerializer()
+
+  test("StreamingQueryData") {
+    val id = UUID.randomUUID()
+    val input = new StreamingQueryData(
+      name = "some-query",
+      id = id,
+      runId = id.toString,

Review Comment:
   done





[GitHub] [spark] LuciferYang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1057478429


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/protobuf/StreamingQueryDataSerializer.scala:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf

Review Comment:
   @gengliangwang Should it be in this package? Please correct me if not. Thanks!
   
   





[GitHub] [spark] gengliangwang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059198133


##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/protobuf/KVStoreProtobufSerializerSuite.scala:
##########
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf
+
+import java.util.UUID
+
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.sql.streaming.ui.StreamingQueryData
+import org.apache.spark.status.protobuf.KVStoreProtobufSerializer
+
+class KVStoreProtobufSerializerSuite extends SparkFunSuite {
+
+  private val serializer = new KVStoreProtobufSerializer()
+
+  test("StreamingQueryData") {
+    val id = UUID.randomUUID()
+    val input = new StreamingQueryData(
+      name = "some-query",
+      id = id,
+      runId = id.toString,

Review Comment:
   nit: let's use an input for `runId` that differs from `id`, in case there is a mistake in the serializer.
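
To illustrate the nit above, here is a minimal sketch of why a round-trip test should give `id` and `runId` different values. Everything in it is hypothetical: `QueryRecord` stands in for `StreamingQueryData` (only the three fields visible in the quoted test are modelled), and a plain `java.io` encoding is used in place of Protobuf so the example has no dependencies. It is not the serializer from this PR.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
import java.util.UUID

// Hypothetical stand-in for StreamingQueryData; only the fields seen in the quoted test.
final case class QueryRecord(name: String, id: UUID, runId: String)

object QueryRecordCodec {
  // Assumption: a simple java.io encoding replaces Protobuf to keep the sketch self-contained.
  def serialize(r: QueryRecord): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new DataOutputStream(bytes)
    out.writeUTF(r.name)
    out.writeUTF(r.id.toString)
    out.writeUTF(r.runId)
    out.flush()
    bytes.toByteArray
  }

  def deserialize(data: Array[Byte]): QueryRecord = {
    val in = new DataInputStream(new ByteArrayInputStream(data))
    val name = in.readUTF()
    val id = UUID.fromString(in.readUTF())
    val runId = in.readUTF()
    QueryRecord(name, id, runId)
  }

  // Deliberately buggy variant: fills runId from the id field.
  def deserializeBuggy(data: Array[Byte]): QueryRecord = {
    val decoded = deserialize(data)
    decoded.copy(runId = decoded.id.toString)
  }
}

object RoundTripDemo {
  def main(args: Array[String]): Unit = {
    import QueryRecordCodec._
    val id = UUID.randomUUID()
    val sameInput = QueryRecord("some-query", id, id.toString)
    val distinctInput = QueryRecord("some-query", id, UUID.randomUUID().toString)

    // The bug goes unnoticed when runId happens to equal id.toString ...
    assert(deserializeBuggy(serialize(sameInput)) == sameInput)
    // ... and is exposed as soon as the two fields carry different values.
    assert(deserializeBuggy(serialize(distinctInput)) != distinctInput)
    // The correct decoder round-trips either input.
    assert(deserialize(serialize(distinctInput)) == distinctInput)
  }
}
```

With identical values, a serializer that mixes up the two fields still passes the equality check, so the test proves less than it appears to.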





[GitHub] [spark] LuciferYang commented on pull request #39233: [SPARK-41676][CORE][SQL][SS][UI] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #39233:
URL: https://github.com/apache/spark/pull/39233#issuecomment-1368417539

   Thanks @gengliangwang 




[GitHub] [spark] LuciferYang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059220189


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/protobuf/StreamingQueryDataSerializer.scala:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf

Review Comment:
   `StreamingQueryData` has `private[sql]` access scope, so it can't be accessed from the `org.apache.spark.status.protobuf.sql` package. It would need to change to `private[spark]`, like `JobDataWrapper`. Is that change OK?
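
The scoping question above can be sketched with a small stand-alone file. The `demo.spark` packages and the classes `SqlScoped`, `SparkScoped`, and `Reader` are hypothetical and only mirror the shape of the discussion; they are not Spark's real sources.

```scala
// Hypothetical packages that mirror the shape of the discussion; not Spark's real sources.
package demo.spark {

  package sql {
    // Visible only to code inside demo.spark.sql and its sub-packages.
    private[sql] class SqlScoped(val label: String)

    // Visible to any code inside demo.spark, including demo.spark.status.
    private[spark] class SparkScoped(val label: String)
  }

  package status {
    object Reader {
      // Compiles: SparkScoped is accessible throughout demo.spark.
      def read(): String = new demo.spark.sql.SparkScoped("ok").label

      // Does not compile if uncommented: SqlScoped cannot be accessed from demo.spark.status.
      // def bad(): String = new demo.spark.sql.SqlScoped("no").label
    }
  }
}
```

A qualified modifier such as `private[spark]` widens visibility to everything under the named enclosing package, which is why a class that lives under `sql` can then be used from a sibling package such as `status`.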
   
   





[GitHub] [spark] gengliangwang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059227209


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/protobuf/StreamingQueryDataSerializer.scala:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf

Review Comment:
   Yeah it's ok to have such changes





[GitHub] [spark] gengliangwang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059198448


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/protobuf/StreamingQueryDataSerializer.scala:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf

Review Comment:
   Shall we put it in the same directory as `SQLExecutionUIDataSerializer`?





[GitHub] [spark] gengliangwang commented on pull request #39233: [SPARK-41676][CORE][SQL][SS][UI] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on PR #39233:
URL: https://github.com/apache/spark/pull/39233#issuecomment-1368294199

   Thanks, merging to master




[GitHub] [spark] LuciferYang commented on pull request #39233: [SPARK-41676][CORE][SQL][SS][UI] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on PR #39233:
URL: https://github.com/apache/spark/pull/39233#issuecomment-1368242387

   https://github.com/LuciferYang/spark/actions/runs/3809234138
   
   <img width="1037" alt="image" src="https://user-images.githubusercontent.com/1475305/210141342-90557bcd-3e02-4e0d-af54-d4ae73fcd520.png">
   
   GitHub Actions passed; the status report just has not updated.
   
   
   




[GitHub] [spark] LuciferYang commented on a diff in pull request #39233: [SPARK-41676][CORE][SQL][SS] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
LuciferYang commented on code in PR #39233:
URL: https://github.com/apache/spark/pull/39233#discussion_r1059220189


##########
sql/core/src/main/scala/org/apache/spark/sql/streaming/protobuf/StreamingQueryDataSerializer.scala:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.streaming.protobuf

Review Comment:
   `StreamingQueryData` is `private[sql]`, so it can't be accessed from the `org.apache.spark.status.protobuf.sql` package. It would need to change to `private[spark]`, like `JobDataWrapper`. Is that change OK?
   
   





[GitHub] [spark] gengliangwang closed pull request #39233: [SPARK-41676][CORE][SQL][SS][UI] Protobuf serializer for `StreamingQueryData`

Posted by GitBox <gi...@apache.org>.
gengliangwang closed pull request #39233: [SPARK-41676][CORE][SQL][SS][UI] Protobuf serializer for `StreamingQueryData`
URL: https://github.com/apache/spark/pull/39233

