You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Chad Dombrova (Jira)" <ji...@apache.org> on 2019/09/18 03:49:00 UTC

[jira] [Created] (BEAM-8271) StateGetRequest/Response continuation_token should be string

Chad Dombrova created BEAM-8271:
-----------------------------------

             Summary: StateGetRequest/Response continuation_token should be string
                 Key: BEAM-8271
                 URL: https://issues.apache.org/jira/browse/BEAM-8271
             Project: Beam
          Issue Type: Improvement
          Components: beam-model
            Reporter: Chad Dombrova
            Assignee: Chad Dombrova


I've been working on adding typing to the python code and I came across a discrepancy between regarding the type of the continuation token.  The .proto defines it as bytes, but the code treats it as a string (i.e. unicode):

 
{code:java}
// A request to get state.
message StateGetRequest {
  // (Optional) If specified, signals to the runner that the response
  // should resume from the following continuation token.
  //
  // If unspecified, signals to the runner that the response should start
  // from the beginning of the logical continuable stream.
  bytes continuation_token = 1;
}

// A response to get state representing a logical byte stream which can be
// continued using the state API.
message StateGetResponse {
  // (Optional) If specified, represents a token which can be used with the
  // state API to get the next chunk of this logical byte stream. The end of
  // the logical byte stream is signalled by this field being unset.
  bytes continuation_token = 1;

  // Represents a part of a logical byte stream. Elements within
  // the logical byte stream are encoded in the nested context and
  // concatenated together.
  bytes data = 2;
} 
{code}
From FnApiRunner.StateServicer:
{code:python}
    def blocking_get(self, state_key, continuation_token=None):
      with self._lock:
        full_state = self._state[self._to_key(state_key)]
        if self._use_continuation_tokens:
          # The token is "nonce:index".
          if not continuation_token:
            token_base = 'token_%x' % len(self._continuations)
            self._continuations[token_base] = tuple(full_state)
            return b'', '%s:0' % token_base
          else:
            token_base, index = continuation_token.split(':')
            ix = int(index)
            full_state = self._continuations[token_base]
            if ix == len(full_state):
              return b'', None
            else:
              return full_state[ix], '%s:%d' % (token_base, ix + 1)
        else:
          assert not continuation_token
          return b''.join(full_state), None
{code}

This could be a problem in python3.  

All other id values are string, whereas bytes is reserved for data, so I think that the proto should be changed to string. 





--
This message was sent by Atlassian Jira
(v8.3.4#803005)