You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Eddie Wang (Jira)" <ji...@apache.org> on 2021/10/04 02:17:00 UTC
[jira] [Commented] (BEAM-9449) Consider passing pipeline options
for expansion service.
[ https://issues.apache.org/jira/browse/BEAM-9449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17423748#comment-17423748 ]
Eddie Wang commented on BEAM-9449:
----------------------------------
not stale.
i've spent the past few weekends trying to compile a custom beam container to support a k8s deployment of the java harness. the main use-case here being to leverage the Pubsub/KafkaIO libraries from the Java sdk inside of the Python environment.
i was able to customize the default environment in the ExpansionService.java file to use an `EXTERNAL` environment by default, but was unsure how to setup the java sdk worker pool... python workers have a worker pool flag, but there is not such option for the java beam sdk.
after that, i modified the default environment to `EMBEDDED` in a final attempt to get the PubsubIO library working on Flink. Unfortunately, this is leading to class not found issues: `Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.beam.runners.core.construction.CoderTranslation` from the Flink taskmanager.
IO is a such a fundamental primitive of any data-pipeline, and unfortunately... most of the transforms in Python require the Java expansion service in order to work.
> Consider passing pipeline options for expansion service.
> --------------------------------------------------------
>
> Key: BEAM-9449
> URL: https://issues.apache.org/jira/browse/BEAM-9449
> Project: Beam
> Issue Type: New Feature
> Components: beam-model, cross-language
> Reporter: Robert Bradshaw
> Priority: P2
> Labels: stale-P2
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)