Posted to issues@beam.apache.org by "Niels van Eck (Jira)" <ji...@apache.org> on 2020/08/04 00:59:00 UTC

[jira] [Commented] (BEAM-7934) Dataflow Python SDK logging: step_id is always empty string

    [ https://issues.apache.org/jira/browse/BEAM-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170485#comment-17170485 ] 

Niels van Eck commented on BEAM-7934:
-------------------------------------

I have been testing some streaming Python Dataflow pipelines, and I can confirm that the issue is still present.

FWIW, the labels seem to have become a bit more elaborate, but {{step_id}} is still missing:

  
{code:java}
labels: {
  dataflow.googleapis.com/job_name: "my-job"
  dataflow.googleapis.com/job_id: "2020-08-03_17_37_40-345678etc"
  dataflow.googleapis.com/region: "us-central1"
  compute.googleapis.com/resource_type: "instance"
  compute.googleapis.com/resource_id: "1234567890"
  compute.googleapis.com/resource_name: "my-job-08031737-djfiosiafjias"
}{code}
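
A minimal Python sketch of how one might check exported log entries for a usable {{step_id}}. It works on plain dicts that mirror the label shapes shown in this issue and makes no Cloud Logging client calls. The helper name {{has_step_id}} and the sample dicts are hypothetical, and accepting a prefixed key such as ".../step_id" is an assumption, not something observed in the output above.

```python
def has_step_id(labels):
    """Return True if a step_id label exists and is non-empty."""
    for key, value in labels.items():
        # Match the bare key ("step_id") as well as a prefixed form
        # (".../step_id"); the prefixed form is an assumption.
        if key.split("/")[-1] == "step_id" and value:
            return True
    return False


# Labels as shown in the comment above: no step_id key at all.
comment_labels = {
    "dataflow.googleapis.com/job_name": "my-job",
    "dataflow.googleapis.com/job_id": "2020-08-03_17_37_40-345678etc",
    "dataflow.googleapis.com/region": "us-central1",
    "compute.googleapis.com/resource_type": "instance",
    "compute.googleapis.com/resource_id": "1234567890",
    "compute.googleapis.com/resource_name": "my-job-08031737-djfiosiafjias",
}

# Resource labels as shown in the original report: step_id present but empty.
report_labels = {
    "job_id": "<job id>",
    "job_name": "<job name>",
    "project_id": "<project id>",
    "region": "<region>",
    "step_id": "",
}

print(has_step_id(comment_labels))
print(has_step_id(report_labels))
```

Both shapes fail the check, one because the key is absent and the other because its value is empty, so a per-step filter matches nothing in either case.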
 
  

> Dataflow Python SDK logging: step_id is always empty string
> -----------------------------------------------------------
>
>                 Key: BEAM-7934
>                 URL: https://issues.apache.org/jira/browse/BEAM-7934
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow, sdk-py-core
>    Affects Versions: 2.13.0
>            Reporter: James Hutchison
>            Priority: P2
>
> Using the Dataflow runner, log messages always show up in Stackdriver with step_id set to the empty string, so filtering log messages by step doesn't work.
> {code:java}
> resource: {
>   labels: {
>     job_id: "<job id>" 
>     job_name: "<job name>" 
>     project_id: "<project id>" 
>     region: "<region>" 
>     step_id: "" 
>   }
>   type: "dataflow_step" 
> }{code}
> Another user seems to have posted in the old GitHub repo and, based on their output, appears to be seeing the same problem:
> [https://github.com/GoogleCloudPlatform/DataflowPythonSDK/issues/62]
> From what I can tell, this only affects streaming pipelines.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)