Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/03/09 20:32:24 UTC

[GitHub] [beam] abergmeier edited a comment on pull request #14174: [BEAM-XXX] Port join extensions to Python

abergmeier edited a comment on pull request #14174:
URL: https://github.com/apache/beam/pull/14174#issuecomment-794157903


   This is a first, naive attempt at the code. I could really use some guidance on the following questions:
   1. How to handle Tags
      ~~In Java, the code seems to rely on internal knowledge to create what looks like a class identity. Python also uses some Tags, but they don't seem to have the same power.
      It would be great if someone could shed some light on why Tags are there in the first place and what the strategy is in Python.~~
      Since this produces a simpler `dict` in Python, I will hardcode the strings.
   
   2. How to handle KV
      I have a hard time finding examples in the code that use KV. Since there is a type hint for it, I used that as far as possible in the Python code. For the actual KV values I then used tuples. Is this fine?
   
   3. Coders
      There seems to be some Coder support, but apparently no equivalent to `PCollection.setCoder`. How is this supposed to work in Python?
   
   4. CoGbk*
      There seem to be no further utility classes around `CoGbk*`. I assume I will have to implement most of these myself.
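
To make questions 1, 2 and 4 more concrete, here is a rough plain-Python sketch of the shape being discussed. It deliberately does not use Beam. It only models what a co-group keyed by hardcoded string tags could look like, with plain 2-tuples standing in for KV, and an inner join built on top of it. The names `LEFT_TAG`, `RIGHT_TAG`, `co_group` and `inner_join` are made up for illustration and are not taken from the PR.

```python
from collections import defaultdict
from itertools import product

# Hardcoded string tags (hypothetical names, chosen for this sketch only).
LEFT_TAG = "left"
RIGHT_TAG = "right"


def co_group(left, right):
    """Group two lists of (key, value) tuples by key under the string tags."""
    grouped = defaultdict(lambda: {LEFT_TAG: [], RIGHT_TAG: []})
    for key, value in left:
        grouped[key][LEFT_TAG].append(value)
    for key, value in right:
        grouped[key][RIGHT_TAG].append(value)
    return dict(grouped)


def inner_join(left, right):
    """Return (key, (left_value, right_value)) for keys present on both sides."""
    joined = []
    for key, tagged in co_group(left, right).items():
        # The cross product is empty when either side has no values for the key.
        for left_value, right_value in product(tagged[LEFT_TAG], tagged[RIGHT_TAG]):
            joined.append((key, (left_value, right_value)))
    return joined


# Plain tuples play the role of KV elements.
orders = [("alice", "book"), ("bob", "pen"), ("alice", "lamp")]
emails = [("alice", "alice@example.com"), ("carol", "carol@example.com")]
result = inner_join(orders, emails)
```

Here `result` contains only the two `alice` pairs, since `bob` and `carol` each appear on one side only. An outer join would presumably differ only in substituting a null value when one of the two tagged lists is empty.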

