Posted to dev@beam.apache.org by Alan Zhang via dev <de...@beam.apache.org> on 2023/01/25 06:35:14 UTC

MapState/SetState (aka MultimapUserState) are not fully supported in the Beam portability framework?

Hi everyone,

Why doesn’t the class StateRequestHandlers<https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L69> define interfaces (e.g. MultimapUserStateHandler and MultimapUserStateHandlerFactory) for supporting MultimapUserState? Is this support planned but not yet implemented, or were there concerns that led us not to support it? Or is this class not the right place to define the MultimapUserState-related handler interfaces?

For example, to support BagUserState, this class defines two related interfaces, BagUserStateHandler<https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L192> and BagUserStateHandlerFactory<https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/state/StateRequestHandlers.java#L215>, and the runners (Samza/Flink/Spark) can provide their own implementations (e.g. Samza’s SamzaStateRequestHandlers<https://github.com/apache/beam/blob/master/runners/samza/src/main/java/org/apache/beam/runners/samza/runtime/SamzaStateRequestHandlers.java#L123>) of these interfaces to support ValueState, BagState and CombiningState.

I see that the existing Fn Harness implementation is able to handle MapState and SetState by using FnApiStateAccessor<https://github.com/apache/beam/blob/master/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/state/FnApiStateAccessor.java#L444>, which builds the right StateRequest<https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/org/apache/beam/model/fn_execution/v1/beam_fn_api.proto#L662>/StateKey for them. So if the Beam Fn APIs provided these interfaces for each runner to integrate with, I would consider MultimapUserState fully supported in the Beam portability framework.
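To make the question concrete, here is a rough sketch of what such a handler interface might look like if it followed the same pattern as BagUserStateHandler. Everything in it is hypothetical: the interface name matches the one proposed above, but the method names, the type parameters and the in-memory implementation are my own assumptions for illustration and do not exist in Beam. It has no Beam dependencies so it can be run on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in Beam.
class MultimapStateSketch {

  /**
   * Assumed shape of a runner-side handler for multimap user state.
   * K = user key, W = window, MK = map key, V = value.
   */
  interface MultimapUserStateHandler<K, W, MK, V> {
    Iterable<MK> getKeys(K key, W window);

    Iterable<V> get(K key, W window, MK mapKey);

    void append(K key, W window, MK mapKey, Iterable<V> values);

    void clear(K key, W window, MK mapKey);

    void clearAll(K key, W window);
  }

  /** Toy in-memory implementation; a real runner would back this with its own state store. */
  static final class InMemoryHandler<K, W, MK, V>
      implements MultimapUserStateHandler<K, W, MK, V> {
    // One multimap per (user key, window) pair.
    private final Map<List<Object>, Map<MK, List<V>>> cells = new HashMap<>();

    private Map<MK, List<V>> cell(K key, W window) {
      return cells.computeIfAbsent(
          Arrays.<Object>asList(key, window), unused -> new LinkedHashMap<>());
    }

    @Override
    public List<MK> getKeys(K key, W window) {
      return new ArrayList<>(cell(key, window).keySet());
    }

    @Override
    public List<V> get(K key, W window, MK mapKey) {
      List<V> values = cell(key, window).get(mapKey);
      // Return a copy so callers cannot mutate the stored state.
      return values == null ? new ArrayList<>() : new ArrayList<>(values);
    }

    @Override
    public void append(K key, W window, MK mapKey, Iterable<V> values) {
      List<V> target = cell(key, window).computeIfAbsent(mapKey, unused -> new ArrayList<>());
      for (V value : values) {
        target.add(value);
      }
    }

    @Override
    public void clear(K key, W window, MK mapKey) {
      cell(key, window).remove(mapKey);
    }

    @Override
    public void clearAll(K key, W window) {
      cells.remove(Arrays.<Object>asList(key, window));
    }
  }

  public static void main(String[] args) {
    InMemoryHandler<String, String, String, Integer> handler = new InMemoryHandler<>();
    handler.append("user-1", "global", "clicks", Arrays.asList(1, 2));
    handler.append("user-1", "global", "views", Arrays.asList(7));
    System.out.println(handler.getKeys("user-1", "global"));
    System.out.println(handler.get("user-1", "global", "clicks"));
  }
}
```

With something along these lines, each runner (Samza/Flink/Spark) could plug in its own store the same way it does for BagUserStateHandler today. Whether SetState would reuse the same handler (for example by storing map keys only) is also an assumption on my side.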

----------------
A little introduction about myself:

This is Alan from LinkedIn. We are building a new managed platform powered by the Samza runner and the Beam portability framework, and we want all LinkedIn Beam use cases to eventually benefit from this new portable architecture.
However, there are a few feature gaps between the classic Samza runner and the portable Samza runner, and user state support is one of them. The classic Samza runner supports five major user state types: ValueState, BagState, CombiningState, MapState and SetState, while the existing portable Samza runner supports only ValueState, BagState and CombiningState. I’m trying to address this state feature gap now.



--
Best,
Alan Zhang