Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/06/22 11:00:06 UTC

[GitHub] [beam] piotr-szuberski commented on a change in pull request #12023: [BEAM-10135] Add Python wrapper for Jdbc Write external transform

piotr-szuberski commented on a change in pull request #12023:
URL: https://github.com/apache/beam/pull/12023#discussion_r443479475



##########
File path: sdks/python/apache_beam/io/external/jdbc.py
##########
@@ -0,0 +1,116 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""
+  PTransforms for supporting Jdbc in Python pipelines. These transforms do not
+  run a Jdbc client in Python. Instead, they expand to ExternalTransforms
+  which the Expansion Service resolves to the Java SDK's JdbcIO.
+
+  Note: To use these transforms, you need to start a Java Expansion Service.
+  Please refer to the portability documentation on how to do that. Flink Users
+  can use the built-in Expansion Service of the Flink Runner's Job Server. The
+  expansion service address has to be provided when instantiating the
+  transforms.
+
+  If you start Flink's Job Server, the expansion service will be started on
+  port 8097. This is also the configured default for this transform. For a

Review comment:
       There are two approaches:
   1. Start the Flink job server using `sdks:java:io:expansion-service:runShadow`, which runs the expansion service at localhost:8097.
   2. Build shadow jars for the expansion service and the Flink job server via the Gradle tasks `sdks:java:io:expansion-service:shadowJar` and `runners:flink:1.10:job-server:shadowJar`, and leave `expansion_service` as `None`; those jars will then be used with random ports.
   I didn't dig into it, but in 2) it first tries to run the expansion service and, when that fails, runs the Flink job server.
   
   The main examples for the code I wrote were the kafka.py and generate_sequence external transforms.
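   For reference, here is a minimal sketch of how the wrapper from this PR might be driven with either approach. It assumes the module exposes a `WriteToJdbc` transform with roughly the parameters of the later `apache_beam.io.jdbc` module; the driver, JDBC URL, credentials, table, and statement are placeholders, and the exact parameter names may differ in this revision.

   ```python
   # Hypothetical usage sketch; the import path and parameter names are
   # assumptions based on the file added in this PR and later Beam releases.
   import typing

   import apache_beam as beam
   from apache_beam import coders
   from apache_beam.io.external.jdbc import WriteToJdbc

   # Rows sent to the cross-language transform need a schema, so register a
   # RowCoder for the element type.
   ExampleRow = typing.NamedTuple('ExampleRow', [('id', int), ('name', str)])
   coders.registry.register_coder(ExampleRow, coders.RowCoder)

   with beam.Pipeline() as p:
     _ = (
         p
         | beam.Create([ExampleRow(1, 'abc')]).with_output_types(ExampleRow)
         | 'WriteToJdbc' >> WriteToJdbc(
             driver_class_name='org.postgresql.Driver',
             jdbc_url='jdbc:postgresql://localhost:5432/example',
             username='postgres',
             password='postgres',
             statement='INSERT INTO example_table VALUES(?, ?)',
             # Approach 1: point at the expansion service started by
             # `sdks:java:io:expansion-service:runShadow`.
             # Approach 2: pass expansion_service=None and let the SDK start
             # one from the locally built shadow jars on a random port.
             expansion_service='localhost:8097',
         ))
   ```

   Either way, the Python side only describes the write; the actual JDBC client runs in the Java expansion of JdbcIO, as the docstring above notes.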




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org