You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Ismaël Mejía (Jira)" <ji...@apache.org> on 2019/10/07 10:01:00 UTC

[jira] [Updated] (BEAM-8271) StateGetRequest/Response continuation_token should be string

     [ https://issues.apache.org/jira/browse/BEAM-8271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismaël Mejía updated BEAM-8271:
-------------------------------
    Status: Open  (was: Triage Needed)

> StateGetRequest/Response continuation_token should be string
> ------------------------------------------------------------
>
>                 Key: BEAM-8271
>                 URL: https://issues.apache.org/jira/browse/BEAM-8271
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model
>            Reporter: Chad Dombrova
>            Assignee: Chad Dombrova
>            Priority: Major
>
> I've been working on adding typing to the python code and I came across a discrepancy between regarding the type of the continuation token.  The .proto defines it as bytes, but the code treats it as a string (i.e. unicode):
>  
> {code:java}
> // A request to get state.
> message StateGetRequest {
>   // (Optional) If specified, signals to the runner that the response
>   // should resume from the following continuation token.
>   //
>   // If unspecified, signals to the runner that the response should start
>   // from the beginning of the logical continuable stream.
>   bytes continuation_token = 1;
> }
> // A response to get state representing a logical byte stream which can be
> // continued using the state API.
> message StateGetResponse {
>   // (Optional) If specified, represents a token which can be used with the
>   // state API to get the next chunk of this logical byte stream. The end of
>   // the logical byte stream is signalled by this field being unset.
>   bytes continuation_token = 1;
>   // Represents a part of a logical byte stream. Elements within
>   // the logical byte stream are encoded in the nested context and
>   // concatenated together.
>   bytes data = 2;
> } 
> {code}
> From FnApiRunner.StateServicer:
> {code:python}
>     def blocking_get(self, state_key, continuation_token=None):
>       with self._lock:
>         full_state = self._state[self._to_key(state_key)]
>         if self._use_continuation_tokens:
>           # The token is "nonce:index".
>           if not continuation_token:
>             token_base = 'token_%x' % len(self._continuations)
>             self._continuations[token_base] = tuple(full_state)
>             return b'', '%s:0' % token_base
>           else:
>             token_base, index = continuation_token.split(':')
>             ix = int(index)
>             full_state = self._continuations[token_base]
>             if ix == len(full_state):
>               return b'', None
>             else:
>               return full_state[ix], '%s:%d' % (token_base, ix + 1)
>         else:
>           assert not continuation_token
>           return b''.join(full_state), None
> {code}
> This could be a problem in python3.  
> All other id values are string, whereas bytes is reserved for data, so I think that the proto should be changed to string. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)