Posted to issues@spark.apache.org by "Jungtaek Lim (Jira)" <ji...@apache.org> on 2023/10/12 07:28:00 UTC

[jira] [Updated] (SPARK-45511) SPIP: State Data Source - Reader

     [ https://issues.apache.org/jira/browse/SPARK-45511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim updated SPARK-45511:
---------------------------------
    Description: 
The state store has been a black box since stateful operators were introduced. Its data has been “internal” to the streaming query, and Spark does not expose it outside of the streaming query. There is no feature or tool that lets users read or modify the content of state stores.

For reading state specifically, the lack of such a feature imposes limitations such as the following:
 * Users cannot see the content of the state store, which leaves them unable to debug.
 * Users have to verify the content of the state store indirectly in unit tests. Their only option is to rely on the output of the query.

Given that, we propose introducing a feature that enables users to read the state from outside the streaming query.

SPIP: [https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing]

  was:
The state store has been a black box since stateful operators were introduced. Its data has been “internal” to the streaming query, and Spark does not expose it outside of the streaming query. There is no feature or tool that lets users read or modify the content of state stores.

For reading state specifically, the lack of such a feature imposes limitations such as the following:
 * Users cannot see the content of the state store, which leaves them unable to debug.
 * Users have to verify the content of the state store indirectly in unit tests. Their only option is to rely on the output of the query.

Given that, we propose introducing a feature that enables users to read the state from outside the streaming query.

SPIP: [https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing]

> SPIP: State Data Source - Reader
> --------------------------------
>
>                 Key: SPARK-45511
>                 URL: https://issues.apache.org/jira/browse/SPARK-45511
>             Project: Spark
>          Issue Type: New Feature
>          Components: Structured Streaming
>    Affects Versions: 4.0.0
>            Reporter: Jungtaek Lim
>            Priority: Major
>              Labels: SPIP
>
> The state store has been a black box since stateful operators were introduced. Its data has been “internal” to the streaming query, and Spark does not expose it outside of the streaming query. There is no feature or tool that lets users read or modify the content of state stores.
> For reading state specifically, the lack of such a feature imposes limitations such as the following:
>  * Users cannot see the content of the state store, which leaves them unable to debug.
>  * Users have to verify the content of the state store indirectly in unit tests. Their only option is to rely on the output of the query.
> Given that, we propose introducing a feature that enables users to read the state from outside the streaming query.
> SPIP: [https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing]


