You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/08/18 14:36:56 UTC

[GitHub] [beam] LemonU commented on issue #20979: Portable runners should be able to issue checkpoints to Splittable DoFn

LemonU commented on issue #20979:
URL: https://github.com/apache/beam/issues/20979#issuecomment-1219575981

   I have managed to pull off the workaround of adding `--experiments=use_deprecated_read` mentioned in the original Jira ticket.
   
   First, you'll need to start the Java expansion service on your own. If you are deploying the expansion service on docker, you can simply pull the flink job service image (`apache/beam_flink1.14_job_server:2.40.0` is what I used) and override the entrypoint with the following commands
   ```
   java -cp /opt/apache/beam/jars/* org.apache.beam.sdk.expansion.service.ExpansionService 8097  \
   --javaClassLookupAllowlistFile="*" \
   --defaultEnvironmentType=<your environment type here> \
   --defaultEnvironmentConfig=<your environment config here> \
   --experiments=use_deprecated_read
   ```
   
   Explanation on each of the flags:
   
   `-cp /opt/apache/beam/jars/*`: this is where the expansion service jars is located in the container
   
   `8097`: this specifies the port the expansion service should be opened on
   
   `--javaClassLookupAllowlistFile="*"`: this is so that all transforms registered under the expansion service can be requested for external expansion
   
   `--defaultEnvironmentType=<your environment type here>` and `--defaultEnvironmentConfig=<your environment config here>`: this specifies the `Environment` that the Java transforms you requested from this expansion service should be executed in. Be advised, your pipeline's environment configs will not affect this value, and the values set here for the expansion service will override that of your pipeline's. 
   That is, let's say you are running a python pipeline with `--environment_type=EXTERNAL --environment_config=localhost:50000` and the expansion service is started with `--defaultEnvironmentType=DOCKER`, and you are requesting the expansion for the kafka IO transforms from the expansion service, the resulting pipeline Protobuf payload will have all stages' environment being set to the `EXTERNAL` environment but the Kafka IO transforms that you requested from the expansion service, which will be set to the `DOCKER` environment.
   
   `--experiments=use_deprecated_read`: this is so that the legacy `Read` transform will replace the new SDF-based Kafka Read transform when the expansion service is expanding the kafka IO stage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org