Posted to user@flink.apache.org by Zhongpu Chen <ch...@gmail.com> on 2023/02/24 02:29:12 UTC

Should we always mark ValueState as "transient" for RichFunctions

Hi,

While reading the code in the flink-training repo [1], I noticed the
following code:

```java
public static class EnrichmentFunction
        extends RichCoFlatMapFunction<TaxiRide, TaxiFare, RideAndFare> {

    private ValueState<TaxiRide> rideState;
    private ValueState<TaxiFare> fareState;
    ...
}
```

As I understand it, since the ValueState variables here are scoped to
each instance, they should not be serialized, for performance's sake.
Thus, we should always mark them as "transient". A similar discussion
can be found here [2].

Should we always mark ValueState as "transient", and why? Please help me
figure this out.

[1] 
https://github.com/apache/flink-training/blob/master/rides-and-fares/src/solution/java/org/apache/flink/training/solutions/ridesandfares/RidesAndFaresSolution.java

[2] 
https://stackoverflow.com/questions/72556202/flink-managed-state-as-transient

Re: Should we always mark ValueState as "transient" for RichFunctions

Posted by Gen Luo <lu...@gmail.com>.
Hi,

ValueState is a handle rather than an actual value, so it should never be
serialized. In fact, ValueState itself is not Serializable. It is fine to
always mark it as transient.

In this case, I suppose it works because the ValueState field has not yet
been set (that happens at runtime) when the function is serialized (during
deployment). Still, relying on that is not good practice.
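
To illustrate the point about timing, here is a minimal sketch in plain Java
with no Flink dependency. All names in it (TransientHandleDemo, Handle,
PlainFunction, TransientFunction, canSerialize) are hypothetical stand-ins,
not Flink classes: Handle plays the role of a non-Serializable state handle,
and open() plays the role of the runtime initialization step. It only shows
how Java serialization treats such a field and is not a description of how
Flink ships functions internally.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientHandleDemo {

    /** Hypothetical stand-in for a state handle; deliberately NOT Serializable. */
    static class Handle {}

    /** Serializable function object whose handle field is NOT transient. */
    static class PlainFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        Handle handle; // stays null until open() is called

        void open() {
            handle = new Handle();
        }
    }

    /** Same function object, but the handle field is marked transient. */
    static class TransientFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Handle handle; // always skipped by Java serialization

        void open() {
            handle = new Handle();
        }
    }

    /** Returns true if Java serialization of obj succeeds. */
    static boolean canSerialize(Object obj) {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PlainFunction plain = new PlainFunction();
        // The field is still null, so there is nothing non-serializable to write.
        System.out.println("plain, before open():    " + canSerialize(plain));

        plain.open();
        // The field now references a non-Serializable object, so writing fails.
        System.out.println("plain, after open():     " + canSerialize(plain));

        TransientFunction safe = new TransientFunction();
        safe.open();
        // The transient field is skipped regardless of its value.
        System.out.println("transient, after open(): " + canSerialize(safe));
    }
}
```

The non-transient version only serializes while the field is still null,
which is the situation described above. With transient, serialization no
longer depends on when the field gets assigned.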
