Posted to issues@beam.apache.org by "Eugene Huang (JIRA)" <ji...@apache.org> on 2018/11/21 21:19:00 UTC
[jira] [Created] (BEAM-6107) Logging on Dataflow for Python's Beam 2.7.0 and 2.8.0 not working
Eugene Huang created BEAM-6107:
----------------------------------
Summary: Logging on Dataflow for Python's Beam 2.7.0 and 2.8.0 not working
Key: BEAM-6107
URL: https://issues.apache.org/jira/browse/BEAM-6107
Project: Beam
Issue Type: Bug
Components: runner-dataflow
Affects Versions: 2.8.0, 2.7.0
Reporter: Eugene Huang
Assignee: Tyler Akidau
Hello, my name is Eugene. This is the first bug I am submitting to the Apache Beam project!
I'm using the Python SDK on Google Cloud Platform to prototype with Beam locally inside Datalab and then submit jobs to Dataflow. On Dataflow, I use logging essentially as a print statement to check the state of objects. For example, after a GroupByKey, the grouped elements are in a list when I run Beam locally, but on Dataflow they are sometimes a generator. Hence, logging is very important to me.
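To illustrate the kind of diagnostic logging described above, here is a minimal sketch (not taken from the actual pipeline; the function and element names are hypothetical) of logging the shape of a GroupByKey result with Python's standard logging module, which is what a DoFn on Dataflow would typically call:

```python
import logging

logging.basicConfig(level=logging.INFO)

def inspect_grouped(element):
    # 'element' is a (key, values) pair as produced by GroupByKey.
    # On the local DirectRunner 'values' is typically a list; on Dataflow
    # it can be a lazy iterable/generator, so we log its type to check.
    key, values = element
    logging.info("key=%s values_type=%s", key, type(values).__name__)
    # Materialize so the result is the same either way.
    return key, list(values)

# Simulated grouped elements: a list (as seen locally) and a
# generator (as sometimes seen on Dataflow).
local_result = inspect_grouped(("a", [1, 2, 3]))
remote_result = inspect_grouped(("a", (v for v in [1, 2, 3])))
```

On Beam 2.6.0 these `logging.info` calls show up in the Dataflow UI when a step is clicked; the bug is that on 2.7.0/2.8.0 they do not.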
When submitting a job to Dataflow using Beam 2.6.0, logging works correctly: when I click on a box in the Dataflow job's user interface, I can see a bunch of logs. However, when submitting a job to Dataflow using Beam 2.7.0 or 2.8.0, the logs do not appear at all; it is as if logging never happened.
My guess is that this logging problem is specific to Dataflow/GCP and might not apply to other runners. In any case, I'm filing the bug here since I don't know where else to raise the issue.
If there are any questions, I can provide screenshots or anything you like.
Thank you,
Eugene
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)