You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "ericm-db (via GitHub)" <gi...@apache.org> on 2024/01/25 18:33:31 UTC

[PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

ericm-db opened a new pull request, #44883:
URL: https://github.com/apache/spark/pull/44883

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   This PR proposes to apply error class framework to the new data source, State API V2.
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   Error class framework is a standard to represent all exceptions in Spark.
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   No
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   No test changes needed here.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   <!--
   If generative AI tooling has been used in the process of authoring this patch, please include the
   phrase: 'Generated-by: ' followed by the name of the tool and its version.
   If no, write 'No'.
   Please refer to the [ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for details.
   -->
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470378892


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS.IMPLICIT_KEY_NOT_FOUND"
+    )
+  }
+
+  def multipleColumnFamiliesNotSupported(stateStoreProvider: String):

Review Comment:
   Can we rephrase this to use StateStore prefix as well ?
   
   `new StateStoreMultipleColumnFamiliesNotSupportedException` ?
   
   Same for one below ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470249805


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ValueStateImpl.scala:
##########
@@ -48,8 +48,7 @@ class ValueStateImpl[K, S](
   private def encodeKey(): UnsafeRow = {
     val keyOption = ImplicitGroupingKeyTracker.getImplicitKeyOption
     if (!keyOption.isDefined) {
-      throw new UnsupportedOperationException("Implicit key not found for operation on" +
-        s"stateName=$stateName")
+      throw StateStoreErrors.implicitKeyNotFound(stateName = stateName)

Review Comment:
   We can just pass `stateName` directly ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "ericm-db (via GitHub)" <gi...@apache.org>.
ericm-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1473464608


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3230,6 +3236,18 @@
     ],
     "sqlState" : "0A000"
   },
+  "STATE_STORE_MULTIPLE_VALUES_PER_KEY" : {
+    "message" : [
+      "Store does not support multiple values per key"
+    ],
+    "sqlState" : "42802"
+  },
+  "STATE_STORE_UNSUPPORTED_OPERATION" : {
+    "message" : [
+      "<operationType> operation not supported with <entity>"
+    ],
+    "sqlState" : "XXKST"

Review Comment:
   No, a user cannot get here. This can stay as an internal error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470035025


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def multipleValuesPerKey(): TransformWithStateMultipleValuesPerKey = {
+    new TransformWithStateMultipleValuesPerKey()
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {
+    new TransformWithStateUnsupportedOperation(operationName, entity)
+  }
+  def valueShouldBeNonNull(typeOfState: String): TransformWithStateValueShouldBeNonNull = {
+    new TransformWithStateValueShouldBeNonNull(typeOfState)
+  }
+}
+
+class TransformWithStateMultipleColumnFamilies(stateStoreProvider: String)
+  extends SparkUnsupportedOperationException(
+    errorClass = "STAT_STORE_MULTIPLE_COLUMN_FAMILIES",
+    messageParameters = Map("stateStoreProvider" -> stateStoreProvider)
+  )
+
+// Used for ListState
+class TransformWithStateMultipleValuesPerKey()
+  extends SparkRuntimeException(
+    errorClass = "STATE_STORE_STORE_MULTIPLE_VALUES_PER_KEY",
+    messageParameters = Map.empty
+  )
+
+class TransformWithStateUnsupportedOperation(operationType: String, entity: String)

Review Comment:
   Where is this used ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470025065


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def multipleValuesPerKey(): TransformWithStateMultipleValuesPerKey = {
+    new TransformWithStateMultipleValuesPerKey()
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {
+    new TransformWithStateUnsupportedOperation(operationName, entity)
+  }
+  def valueShouldBeNonNull(typeOfState: String): TransformWithStateValueShouldBeNonNull = {
+    new TransformWithStateValueShouldBeNonNull(typeOfState)
+  }
+}
+
+class TransformWithStateMultipleColumnFamilies(stateStoreProvider: String)
+  extends SparkUnsupportedOperationException(
+    errorClass = "STAT_STORE_MULTIPLE_COLUMN_FAMILIES",

Review Comment:
   typo here ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "srielau (via GitHub)" <gi...@apache.org>.
srielau commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1473357694


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3230,6 +3236,18 @@
     ],
     "sqlState" : "0A000"
   },
+  "STATE_STORE_MULTIPLE_VALUES_PER_KEY" : {
+    "message" : [
+      "Store does not support multiple values per key"
+    ],
+    "sqlState" : "42802"
+  },
+  "STATE_STORE_UNSUPPORTED_OPERATION" : {
+    "message" : [
+      "<operationType> operation not supported with <entity>"
+    ],
+    "sqlState" : "XXKST"

Review Comment:
   Can a user get here? If so that would imply a 0A*** sqlstate rather than an internal error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "ericm-db (via GitHub)" <gi...@apache.org>.
ericm-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1467924064


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3383,6 +3401,18 @@
     ],
     "sqlState" : "428EK"
   },
+  "TWS_IMPLICIT_KEY_NOT_FOUND" : {

Review Comment:
   So do we still want to define this in `error-classes.json`, or just throw an InternalError from `StateStoreErrors`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1467204895


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3383,6 +3401,18 @@
     ],
     "sqlState" : "428EK"
   },
+  "TWS_IMPLICIT_KEY_NOT_FOUND" : {

Review Comment:
   This looks like an internal error. 
   @MaxGekk What would be the expectation? Do we always use internalError for all internal errors in Spark regardless of which feature triggers it?



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "STV2_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+                           stateName: String): TransformWithStateImplicitKeyNotFound = {

Review Comment:
   nit: indentation is broken, 4 spaces



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,81 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "STV2_" representing State API V2.

Review Comment:
   Let's call it as TransformWithState rather than State API v2. We now have a finalize method name.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470377844


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS.IMPLICIT_KEY_NOT_FOUND"
+    )
+  }
+
+  def multipleColumnFamiliesNotSupported(stateStoreProvider: String):
+    TransformWithStateMultipleColumnFamilies = {
+      new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+    }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+    TransformWithStateUnsupportedOperation = {
+      new TransformWithStateUnsupportedOperation(operationName, entity)
+    }
+

Review Comment:
   nit: extra newline ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on PR #44883:
URL: https://github.com/apache/spark/pull/44883#issuecomment-1915336370

   @ericm-db - test failures also seem relevant. could you please take a look ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on PR #44883:
URL: https://github.com/apache/spark/pull/44883#issuecomment-1919987223

   I'll leave this for today to see @MaxGekk can chime in - I'll merge this by tomorrow if he has no more comment by today. He can also approve and merge by today.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "ericm-db (via GitHub)" <gi...@apache.org>.
ericm-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470040177


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def multipleValuesPerKey(): TransformWithStateMultipleValuesPerKey = {
+    new TransformWithStateMultipleValuesPerKey()
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {
+    new TransformWithStateUnsupportedOperation(operationName, entity)
+  }
+  def valueShouldBeNonNull(typeOfState: String): TransformWithStateValueShouldBeNonNull = {
+    new TransformWithStateValueShouldBeNonNull(typeOfState)
+  }
+}
+
+class TransformWithStateMultipleColumnFamilies(stateStoreProvider: String)
+  extends SparkUnsupportedOperationException(
+    errorClass = "STAT_STORE_MULTIPLE_COLUMN_FAMILIES",
+    messageParameters = Map("stateStoreProvider" -> stateStoreProvider)
+  )
+
+// Used for ListState
+class TransformWithStateMultipleValuesPerKey()
+  extends SparkRuntimeException(
+    errorClass = "STATE_STORE_STORE_MULTIPLE_VALUES_PER_KEY",
+    messageParameters = Map.empty
+  )
+
+class TransformWithStateUnsupportedOperation(operationType: String, entity: String)

Review Comment:
   I thought it would be good to have a general unsupportedoperation exception



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "ericm-db (via GitHub)" <gi...@apache.org>.
ericm-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470060539


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def multipleValuesPerKey(): TransformWithStateMultipleValuesPerKey = {
+    new TransformWithStateMultipleValuesPerKey()
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {
+    new TransformWithStateUnsupportedOperation(operationName, entity)
+  }
+  def valueShouldBeNonNull(typeOfState: String): TransformWithStateValueShouldBeNonNull = {
+    new TransformWithStateValueShouldBeNonNull(typeOfState)
+  }
+}
+
+class TransformWithStateMultipleColumnFamilies(stateStoreProvider: String)
+  extends SparkUnsupportedOperationException(
+    errorClass = "STAT_STORE_MULTIPLE_COLUMN_FAMILIES",
+    messageParameters = Map("stateStoreProvider" -> stateStoreProvider)
+  )
+
+// Used for ListState
+class TransformWithStateMultipleValuesPerKey()
+  extends SparkRuntimeException(
+    errorClass = "STATE_STORE_STORE_MULTIPLE_VALUES_PER_KEY",
+    messageParameters = Map.empty
+  )
+
+class TransformWithStateUnsupportedOperation(operationType: String, entity: String)

Review Comment:
   Just realized I use it here: https://github.com/apache/spark/pull/44883/files#diff-b7ff06fb7d0cd23d338c498a7119948e36377eff967a95c03bd449a66c815414R140



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470253315


##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/TransformWithStateSuite.scala:
##########
@@ -167,4 +167,14 @@ class TransformWithStateValidationSuite extends StateStoreMetricsTest {
       }
     )
   }
+
+  test("transformWithState - creating colFamily with MemoryStateStore should fail") {
+    val stateStore = new MemoryStateStore()

Review Comment:
   Could we just use HDFSBackedStateStoreProvider instead ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470023373


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/MemoryStateStore.scala:
##########
@@ -30,7 +30,7 @@ class MemoryStateStore extends StateStore() {
   }
 
   override def createColFamilyIfAbsent(colFamilyName: String): Unit = {
-    throw new UnsupportedOperationException("Creating multiple column families is not supported")
+    throw StateStoreErrors.multipleColumnFamilies("MemoryStateStoreProvider")

Review Comment:
   Is it possible to augment more tests to ensure that the NERF is being used correctly ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR closed pull request #44883: [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework
URL: https://github.com/apache/spark/pull/44883


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "ericm-db (via GitHub)" <gi...@apache.org>.
ericm-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470396488


##########
docs/sql-error-conditions-internal-error-tws-error-class.md:
##########
@@ -0,0 +1,37 @@
+---
+layout: global
+title: INTERNAL_ERROR_TWS error class
+displayTitle: INTERNAL_ERROR_TWS error class
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+<!--
+  DO NOT EDIT THIS FILE.
+  It was generated automatically by `org.apache.spark.SparkThrowableSuite`.
+-->
+
+[SQLSTATE: XX000](sql-error-conditions-sqlstates.html#class-XX-internal-error)
+
+Internal error using TransformWithStateOperator: 
+
+This error class has the following derived error classes:
+
+## IMPLICIT_KEY_NOT_FOUND
+
+`<message>`
+

Review Comment:
   This is a generated file



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470379938


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/ValueStateSuite.scala:
##########
@@ -184,4 +187,19 @@ class ValueStateSuite extends SharedSparkSession
       assert(testState2.get() === null)
     }
   }
+
+
+  test("Creating colFamily with HDFS should fail") {

Review Comment:
   nit: `colFamily with HDFSBackedStateStoreProvider should fail`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470026099


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.

Review Comment:
   This needs to be updated ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "HeartSaVioR (via GitHub)" <gi...@apache.org>.
HeartSaVioR commented on PR #44883:
URL: https://github.com/apache/spark/pull/44883#issuecomment-1922203998

   Thanks! Merging to master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470403998


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/ValueStateSuite.scala:
##########
@@ -184,4 +187,19 @@ class ValueStateSuite extends SharedSparkSession
       assert(testState2.get() === null)
     }
   }
+

Review Comment:
   nit: extra newline ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470041828


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def multipleValuesPerKey(): TransformWithStateMultipleValuesPerKey = {
+    new TransformWithStateMultipleValuesPerKey()
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {
+    new TransformWithStateUnsupportedOperation(operationName, entity)
+  }
+  def valueShouldBeNonNull(typeOfState: String): TransformWithStateValueShouldBeNonNull = {
+    new TransformWithStateValueShouldBeNonNull(typeOfState)
+  }
+}
+
+class TransformWithStateMultipleColumnFamilies(stateStoreProvider: String)
+  extends SparkUnsupportedOperationException(
+    errorClass = "STAT_STORE_MULTIPLE_COLUMN_FAMILIES",
+    messageParameters = Map("stateStoreProvider" -> stateStoreProvider)
+  )
+
+// Used for ListState
+class TransformWithStateMultipleValuesPerKey()
+  extends SparkRuntimeException(
+    errorClass = "STATE_STORE_STORE_MULTIPLE_VALUES_PER_KEY",
+    messageParameters = Map.empty
+  )
+
+class TransformWithStateUnsupportedOperation(operationType: String, entity: String)

Review Comment:
   ok - maybe we can just add as needed ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470251048


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala:
##########
@@ -256,8 +256,7 @@ private[sql] class HDFSBackedStateStoreProvider extends StateStoreProvider with
 
     // TODO: add support for multiple col families with HDFSBackedStateStoreProvider
     if (useColumnFamilies) {
-      throw new UnsupportedOperationException("Multiple column families are not supported with " +
-        s"HDFSBackedStateStoreProvider")
+      throw StateStoreErrors.multipleColumnFamilies("HDFSStateStoreProvider")

Review Comment:
   Maybe rephrase as `multipleColumnFamiliesNotSupported` ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1467323511


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3383,6 +3401,18 @@
     ],
     "sqlState" : "428EK"
   },
+  "TWS_IMPLICIT_KEY_NOT_FOUND" : {

Review Comment:
   Yep. Also we have categories of internal errors, see
   https://github.com/apache/spark/blob/f9f413e5ff6abe00a664e2dc75fb0ade2ff2986a/common/utils/src/main/scala/org/apache/spark/SparkException.scala#L99
   You should add a suffix to your feature specific internal errors.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470251848


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS.IMPLICIT_KEY_NOT_FOUND"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {

Review Comment:
   Indent seems off ?



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS.IMPLICIT_KEY_NOT_FOUND"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {

Review Comment:
   same here ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470023703


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkRuntimeException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" representing State API V2.
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS"
+    )
+  }
+
+  def multipleColumnFamilies(stateStoreProvider: String):
+  TransformWithStateMultipleColumnFamilies = {
+    new TransformWithStateMultipleColumnFamilies(stateStoreProvider)
+  }
+
+  def multipleValuesPerKey(): TransformWithStateMultipleValuesPerKey = {
+    new TransformWithStateMultipleValuesPerKey()
+  }
+
+  def unsupportedOperationException(operationName: String, entity: String):
+  TransformWithStateUnsupportedOperation = {
+    new TransformWithStateUnsupportedOperation(operationName, entity)
+  }
+  def valueShouldBeNonNull(typeOfState: String): TransformWithStateValueShouldBeNonNull = {
+    new TransformWithStateValueShouldBeNonNull(typeOfState)
+  }
+}
+
+class TransformWithStateMultipleColumnFamilies(stateStoreProvider: String)
+  extends SparkUnsupportedOperationException(
+    errorClass = "STAT_STORE_MULTIPLE_COLUMN_FAMILIES",
+    messageParameters = Map("stateStoreProvider" -> stateStoreProvider)
+  )
+
+// Used for ListState

Review Comment:
   Lets remove these for now ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470376655


##########
docs/sql-error-conditions-internal-error-tws-error-class.md:
##########
@@ -0,0 +1,37 @@
+---
+layout: global
+title: INTERNAL_ERROR_TWS error class
+displayTitle: INTERNAL_ERROR_TWS error class
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+<!--
+  DO NOT EDIT THIS FILE.
+  It was generated automatically by `org.apache.spark.SparkThrowableSuite`.
+-->
+
+[SQLSTATE: XX000](sql-error-conditions-sqlstates.html#class-XX-internal-error)
+
+Internal error using TransformWithStateOperator: 
+
+This error class has the following derived error classes:
+
+## IMPLICIT_KEY_NOT_FOUND
+
+`<message>`
+

Review Comment:
   Nit: extra newline(s) ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "anishshri-db (via GitHub)" <gi...@apache.org>.
anishshri-db commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470379623


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/RocksDBSuite.scala:
##########
@@ -689,6 +689,20 @@ class RocksDBSuite extends AlsoTestWithChangelogCheckpointingEnabled with Shared
     }
   }
 
+  testWithChangelogCheckpointingEnabled("RocksDB: Unsupported Operations" +
+    " with Changelog Checkpointing") {
+    val dfsRootDir = new File(Utils.createTempDir().getAbsolutePath + "/state/1/1")
+    val fileManager = new RocksDBFileManager(
+      dfsRootDir.getAbsolutePath, Utils.createTempDir(), new Configuration)
+    val changelogWriter = fileManager.getChangeLogWriter(1)
+
+    val ex = intercept[UnsupportedOperationException] {
+      changelogWriter.put("a", "1", "testColFamily")

Review Comment:
   lets add a check for `delete` too ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46864][SS] Onboard Arbitrary StateV2 onto New Error Class Framework [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #44883:
URL: https://github.com/apache/spark/pull/44883#discussion_r1470709842


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -1656,6 +1656,19 @@
     ],
     "sqlState" : "XX000"
   },
+  "INTERNAL_ERROR_TWS" : {
+    "message" : [
+      "Internal error using TransformWithStateOperator: "
+    ],
+    "subClass" : {
+      "IMPLICIT_KEY_NOT_FOUND" : {

Review Comment:
   Internal errors should be not visible to users, so, don't need to introduce any sub-classes. Please, remove it and move `<message>` to `"Internal error using TransformWithStateOperator: "`



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from

Review Comment:
   Update the comment.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {

Review Comment:
   ```suggestion
     def implicitKeyNotFound(stateName: String): SparkException = {
   ```



##########
sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/state/ValueStateSuite.scala:
##########
@@ -184,4 +187,18 @@ class ValueStateSuite extends SharedSparkSession
       assert(testState2.get() === null)
     }
   }
+
+  test("colFamily with HDFSBackedStateStoreProvider should fail") {
+    val storeId = StateStoreId(newDir(), Random.nextInt(), 0)
+    val provider = new HDFSBackedStateStoreProvider()
+    val storeConf = new StateStoreConf(new SQLConf())
+    val ex = intercept[StateStoreMultipleColumnFamiliesNotSupportedException] {
+      provider.init(
+        storeId, keySchema, valueSchema, 0, useColumnFamilies = true,
+        storeConf, new Configuration)
+    }
+    assert(ex.getMessage.contains("Creating multiple column families" +
+      " with HDFSStateStoreProvider is not supported"))
+    assert(ex.getMessage.contains("[STATE_STORE_MULTIPLE_COLUMN_FAMILIES]"))

Review Comment:
   Please, use `checkError` here and other places.



##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3230,6 +3243,24 @@
     ],
     "sqlState" : "0A000"
   },
+  "STATE_STORE_MULTIPLE_COLUMN_FAMILIES" : {
+    "message" : [
+      "Creating multiple column families with <stateStoreProvider> is not supported"

Review Comment:
   Cannot it be a sub-class of `UNSUPPORTED_FEATURE`?



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStoreErrors.scala:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.streaming.state
+
+import org.apache.spark.{SparkException, SparkUnsupportedOperationException}
+
+/**
+ * Object for grouping error messages from (most) exceptions thrown from State API V2
+ *
+ * ERROR_CLASS has a prefix of "TWS_" or "STATE_STORE_" to indicate where the error is from
+ */
+object StateStoreErrors {
+  def implicitKeyNotFound(
+    stateName: String): SparkException = {
+    SparkException.internalError(
+      msg = s"Implicit key not found in state store for stateName=$stateName",
+      category = "TWS.IMPLICIT_KEY_NOT_FOUND"

Review Comment:
   Please, don't mix `category` and error sub-class.



##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -3230,6 +3243,24 @@
     ],
     "sqlState" : "0A000"
   },
+  "STATE_STORE_MULTIPLE_COLUMN_FAMILIES" : {
+    "message" : [
+      "Creating multiple column families with <stateStoreProvider> is not supported"

Review Comment:
   ```suggestion
         "Creating multiple column families with <stateStoreProvider> is not supported."
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org