You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2022/11/28 19:02:08 UTC

[GitHub] [gobblin] Will-Lo opened a new pull request, #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Will-Lo opened a new pull request, #3610:
URL: https://github.com/apache/gobblin/pull/3610

   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
       - https://issues.apache.org/jira/browse/GOBBLIN-1750
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if applicable):
   
   Users of GaaS have a difficult time monitoring their flows. Regular metrics do not have the cardinality needed to effectively determine the status of a pipeline, and GTE are too broad and are hard for users to create meaningful queries from.
   
   This PR sets up the schema of an observability event which would be used as a monitoring baseline in GaaS
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
       1. Subject is separated from body by a blank line
       2. Subject is limited to 50 characters
       3. Subject does not end with a period
       4. Subject uses the imperative mood ("add", not "adding")
       5. Body wraps at 72 characters
       6. Body explains "what" and "why", not "how"
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041836329


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure!  probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows* that previously compiled--failure at submission time should be an interactive error only, no event.  nonetheless, as we need them, perhaps we're asking `jobStatus.FAILURE` to do too much:  it's already execution failure as well as failure to send to executor--now compilation failure as well?  it becomes a status *category*, a catch-all.  better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (I actually dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.)  other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1040570689


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",

Review Comment:
   good wisdom for us to iterate after achieving e2e functionality, so I suggest a top-level "doc" string to clarify the clear intent to supersede this schema.
   
   an alt. suffix that just occurred to me is `Experimental`



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",

Review Comment:
   minor temptation to call this "flowGraphEdgeId" for clarity



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",

Review Comment:
   nit: where the job *ran*?



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",
+      "doc": "The ID of the spec executor that ran the job",
+      "compliance": "NONE"
+    },
+    {
+      "name": "issues",
+      "type": {
+        "type": "array",
+        "items": {
+          "type": "record",
+          "name": "Issue",
+          "doc": "Issue describes a specific unique problem in the job or application.\n\nIssue can be generated from log entries, health checks, and other places.",
+          "fields": [
+            {
+              "name": "time",
+              "type": {
+                "type": "typeref",
+                "name": "Timestamp",
+                "doc": "Epoch/UNIX time in milliseconds\n\nRepresents the number of milliseconds since the epoch of 1970-01-01T00:00:00Z",
+                "ref": "long"
+              },
+              "doc": "Time when the issue have occurred"

Review Comment:
   nit: strike -have- (perhaps carry over from `Issue.pdl`...)



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   would this and the executorUrl still be set in cases where the job was not orchestrated / sent to an executor, or would it then need to be `null`?  if you wish always to indicate the executor that *would* be used, since it's more where would the job execute, even if it never got submitted, then just update the doc string to clarify



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",

Review Comment:
   nit: `executorUrl`



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],

Review Comment:
   I'm not convinced on the modeling, so suggest omitting during our experimentation phase.  we'll refine the event later with what we've learned about processing, when graduating beyond that phase.



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."

Review Comment:
   this boolean is suspicious and perhaps could be derived from richer, more informative other fields.  do you see it as a sub-category of `FAILED` status?
   
   maybe a `jobOrchestratedTime`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041817919


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."

Review Comment:
   I wondered the same about the `jobEndTime`... that won't have a meaningful value in the absence of execution, will it?  in avro, I generally prefer a "non-complex" union w/ `null`, rather than a sentinel value like `-1`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1040297695


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEvent.avsc:
##########
@@ -0,0 +1,110 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEvent",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "default": 0,
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "string",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "flowCompiled",
+      "type": "boolean",
+      "default": false,
+      "doc": "Whether or not this flow was compiled successfully."
+    }, {
+      "name": "lastFlowModifiedTimestamp",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        },
+        "default": "GOBBLIN"
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis"
+    }, {
+      "name": "jobRunTime",
+      "type": "long",
+      "doc": "Time taken for the job pipeline to run in millis",
+      "default": 0
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "default": "NA"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban"
+    }, {
+      "name": "executorId",
+      "type": "string",
+      "doc": "The ID of the spec executor that ran the job"
+    }, {
+      "name": "copyMetadata",
+      "type": ["null","org.apache.gobblin.metadata.GobblinDistcpJobEvent"],

Review Comment:
   Talked with @phet and I think I'll remove this for now actually, I'll add this later and evolve the event once we get a e2e spike going.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041842571


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure! probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows that previously compiled*--failure at submission time should be an interactive error only, no event. nonetheless, given we need them, perhaps we're asking `jobStatus.FAILURE` to do too much: it's already execution failure as well as failure to send to executor--now compilation failure as well? it becomes a status *category*, a catch-all. better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek also to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (actually I dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.) other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...  but overall, whatever the names, three distinct statuses to parallel the kinds of failure
   
   I realize that would depart from exclusively passing along only status names already defined in the gaas job state .pdl, but faced w/ the alternative imprecision, I believe that justifiable
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1049174660


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventExperimental.avsc:
##########
@@ -0,0 +1,167 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace": "org.apache.gobblin.metrics",
+  "doc": "An experimental format for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name": "flowGroup",
+      "type": "string",
+      "doc": "Flow group for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowName",
+      "type": "string",
+      "doc": "Flow name for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowExecutionId",
+      "type": "long",
+      "doc": "Flow execution id for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobOrchestratedTime",
+      "type": [
+        "null",
+        "long"
+      ],
+      "doc": "Timestamp when the job was successfully sent to the job executor, null if it was unable to be sent."
+    },

Review Comment:
   I recommend moving down near `jobStartTime` and `jobEndTime`, since it's conceptually related, no?



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "flowGraphEdgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",

Review Comment:
   what of the proposal to generalize this to `executionUserUrn`?



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventExperimental.avsc:
##########
@@ -0,0 +1,167 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace": "org.apache.gobblin.metrics",
+  "doc": "An experimental format for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name": "flowGroup",
+      "type": "string",
+      "doc": "Flow group for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowName",
+      "type": "string",
+      "doc": "Flow name for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowExecutionId",
+      "type": "long",
+      "doc": "Flow execution id for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobOrchestratedTime",
+      "type": [
+        "null",
+        "long"
+      ],
+      "doc": "Timestamp when the job was successfully sent to the job executor, null if it was unable to be sent."
+    },
+    {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis since Epoch when the flow config was last modified"
+    },
+    {
+      "name": "flowGraphEdgeId",
+      "type": "string",
+      "doc": "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "COMPILATION_FAILURE",
+          "SUBMISSION_FAILURE",
+          "EXEUCTION_FAILURE",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    },
+    {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis since Epoch",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis since Epoch",
+      "compliance": "NONE"
+    },
+    {
+      "name": "userIdentity",

Review Comment:
   sorry, I just left a comment when I didn't see this... must have looked at the wrong commit.
   
   looks great.  should we name (and document) to indicate it's meant to be an urn?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1060789952


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventExperimental.avsc:
##########
@@ -0,0 +1,167 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace": "org.apache.gobblin.metrics",
+  "doc": "An experimental format for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name": "flowGroup",
+      "type": "string",
+      "doc": "Flow group for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowName",
+      "type": "string",
+      "doc": "Flow name for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowExecutionId",
+      "type": "long",
+      "doc": "Flow execution id for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis since Epoch when the flow config was last modified"
+    },
+    {
+      "name": "flowGraphEdgeId",
+      "type": "string",
+      "doc": "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "COMPILATION_FAILURE",
+          "SUBMISSION_FAILURE",
+          "EXEUCTION_FAILURE",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    },
+    {
+      "name": "jobOrchestratedTime",
+      "type": [
+        "null",
+        "long"
+      ],
+      "doc": "Timestamp when the job was successfully sent to the job executor, null if it was unable to be sent."
+    },
+    {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis since Epoch",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis since Epoch",
+      "compliance": "NONE"
+    },
+    {
+      "name": "executionUserUrn",
+      "type": [
+        "null",
+        "string"
+      ],
+      "doc": "User URN (if applicable) that runs the underlying Gobblin job",
+      "compliance": "NONE"
+    },
+    {
+      "name": "executorUrl",
+      "type": [
+        "null",

Review Comment:
   At least according to the LinkedIn DMRC they much prefer null defaults and values over non-null defaults. Null types and defaults makes the contracts more explicit



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041836329


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure!  probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows* that previously compiled--failure at submission time should be an interactive error only, no event.  nonetheless, as we need them, perhaps we're asking `jobStatus.FAILURE` to do too much:  it's already execution failure as well as failure to send to executor--now compilation failure as well?  it becomes a status *category*, a catch-all.  better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (I actually dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.)  other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...
   
   I realize that is a departure from the status as defined in the gaas job state .pdl, but faced w/ the alternative, I believe it would be for the right reason



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1040297508


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEvent.avsc:
##########
@@ -0,0 +1,110 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEvent",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "default": 0,
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "string",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "flowCompiled",
+      "type": "boolean",
+      "default": false,
+      "doc": "Whether or not this flow was compiled successfully."
+    }, {
+      "name": "lastFlowModifiedTimestamp",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        },
+        "default": "GOBBLIN"
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis"
+    }, {
+      "name": "jobRunTime",
+      "type": "long",
+      "doc": "Time taken for the job pipeline to run in millis",
+      "default": 0
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",

Review Comment:
   I believe proxy users are only needed for jobs sent to Azkaban, I can remove the default for now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo closed pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo closed pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS
URL: https://github.com/apache/gobblin/pull/3610


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041819858


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."

Review Comment:
   I prefer this to the boolean.  and since now conceptually similar, I suggest moving adjacent to the `jobStartTime` field.
   
   also, as mentioned in reply to other comment, I prefer `null` to `-1`, so unless you have strong pref, I recommend that.  reason is that gives notice when the null check has been forgotten, e.g. when read in scala, where `["null", "Foo"]` maps to `Option[Foo]`



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure!  probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows* that previously compiled--failure at submission time should be an interactive error only, no event.  nonetheless, as we need them, perhaps we're asking `jobStatus.FAILURE` to do too much:  it's already execution failure as well as failure to send to executor--now compilation failure as well?  it becomes a status *category*, a catch-all.  better might be another status, like `INVALID` or more specific `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (I dislike the overloaded terminology, since the user 'submits' a flow to gaas and then gaas 'submits' the job to the executor.)  other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1042556772


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "flowGraphEdgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",

Review Comment:
   What about Fliptop GaaS where there's no proxy user?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041842571


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure! probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows that previously compiled*--failure at submission time should be an interactive error only, no event. nonetheless, given we need them, perhaps we're asking `jobStatus.FAILURE` to do too much: it's already execution failure as well as failure to send to executor--now compilation failure as well? it becomes a status *category*, a catch-all. better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek also to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (actually I dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.) other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...
   
   I realize that would depart from borrowing a status previously defined in the gaas job state .pdl, but faced w/ the alternative imprecision, I believe that justifiable
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1043930441


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   I'll break up `FAILURE` to `COMPILATION_FAILURE`, `SUBMISSION_FAILURE`, and `EXECUTION_FAILURE`.
   Thanks for this feedback :) We can probably update GaaS to have more statuses too but that would come in a different PR and would be a fairly major API change for users.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041541078


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."

Review Comment:
   Yes I can repurpose this, what should the value be when the job is unable to be sent? -1?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041836329


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure!  probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows* that previously compiled--failure at submission time should be an interactive error only, no event.  nonetheless, as we need them, perhaps we're asking `jobStatus.FAILURE` to do too much:  it's already execution failure as well as failure to send to executor--now compilation failure as well?  it becomes a status *category*, a catch-all.  better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (I actually dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.)  other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...
   
   I realize that would departure from using a status previously defined in the gaas job state .pdl, but faced w/ the alternative, I believe it would be for the right reason



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo closed pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo closed pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS
URL: https://github.com/apache/gobblin/pull/3610


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo merged pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo merged PR #3610:
URL: https://github.com/apache/gobblin/pull/3610


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041842094


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "flowGraphEdgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",

Review Comment:
   what does "(if applicable)" mean?  I'd suggest always setting this to the effective user executing the job, so non-optional, rather than exploring making it nullable...



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",

Review Comment:
   nit: yes, the *feature* is experimental, but the most experimental thing about this schema is it's *format*



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   
   great Q on compilation failure! probably both ought to be null.
   
   compilation error events should be purely for scheduled flows that previously compiled--failure at submission time should be an interactive error only, no event. nonetheless, given we need them, perhaps we're asking `jobStatus.FAILURE` to do too much: it's already execution failure as well as failure to send to executor--now compilation failure as well? it becomes a status *category*, a catch-all. better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek also to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (actually I dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.) other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...
   
   I realize that would depart from borrowing a status previously defined in the gaas job state .pdl, but faced w/ the alternative imprecision, I believe that justifiable
   



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "flowGraphEdgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",

Review Comment:
   I believe you mean millis since The Epoch, but best to specify



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041842571


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure! probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows that previously compiled*--failure at submission time should be an interactive error only, no event. nonetheless, given we need them, perhaps we're asking `jobStatus.FAILURE` to do too much: it's already execution failure as well as failure to send to executor--now compilation failure as well? it becomes a status *category*, a catch-all. better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek also to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (actually I dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.) other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...  but overall, whatever the names, three distinct statuses for each kind of failure
   
   I realize that would depart from borrowing a status previously defined in the gaas job state .pdl, but faced w/ the alternative imprecision, I believe that justifiable
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041819858


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."

Review Comment:
   I prefer this to the boolean.  and since now conceptually similar, I suggest moving adjacent to the `jobStartTime` field.
   
   also, as mentioned in reply to other comment, I prefer `null` to `-1`, so unless you have strong pref, I recommend for the reason it gives notice when the null check has been forgotten, e.g. when read in scala, where `["null", "Foo"]` maps to `Option[Foo]`



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."

Review Comment:
   I prefer this to the boolean.  and since now conceptually similar, I suggest moving adjacent to the `jobStartTime` field.
   
   also, as mentioned in reply to other comment, I prefer `null` to `-1`, so unless you have strong pref, I recommend for the reason it gives notice when the null check has been forgotten, e.g. upon access in scala, where `["null", "Foo"]` maps to `Option[Foo]`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] umustafi commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
umustafi commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1036526961


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEvent.avsc:
##########
@@ -0,0 +1,110 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEvent",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "default": 0,
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "string",

Review Comment:
   this is usually a long, do we want string version?



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEvent.avsc:
##########
@@ -0,0 +1,110 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEvent",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "default": 0,
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "string",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "flowCompiled",
+      "type": "boolean",
+      "default": false,
+      "doc": "Whether or not this flow was compiled successfully."
+    }, {
+      "name": "lastFlowModifiedTimestamp",

Review Comment:
   something like `lastFlowModificationTime` reads better



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEvent.avsc:
##########
@@ -0,0 +1,110 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEvent",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "default": 0,
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "string",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "flowCompiled",
+      "type": "boolean",
+      "default": false,
+      "doc": "Whether or not this flow was compiled successfully."
+    }, {
+      "name": "lastFlowModifiedTimestamp",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        },
+        "default": "GOBBLIN"
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis"
+    }, {
+      "name": "jobRunTime",
+      "type": "long",
+      "doc": "Time taken for the job pipeline to run in millis",
+      "default": 0
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",

Review Comment:
   for gobblinJob are there no proxyUsers or users? what case is this



##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEvent.avsc:
##########
@@ -0,0 +1,110 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEvent",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "default": 0,
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "string",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "flowCompiled",
+      "type": "boolean",
+      "default": false,
+      "doc": "Whether or not this flow was compiled successfully."
+    }, {
+      "name": "lastFlowModifiedTimestamp",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        },
+        "default": "GOBBLIN"
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis"
+    }, {
+      "name": "jobRunTime",
+      "type": "long",
+      "doc": "Time taken for the job pipeline to run in millis",
+      "default": 0
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "default": "NA"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban"
+    }, {
+      "name": "executorId",
+      "type": "string",
+      "doc": "The ID of the spec executor that ran the job"
+    }, {
+      "name": "copyMetadata",
+      "type": ["null","org.apache.gobblin.metadata.GobblinDistcpJobEvent"],

Review Comment:
   I like this way of adding configurability for future event information. Do you want to add one for Retention at this point as well?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] umustafi commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
umustafi commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1054899586


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventExperimental.avsc:
##########
@@ -0,0 +1,167 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace": "org.apache.gobblin.metrics",
+  "doc": "An experimental format for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    },
+    {
+      "name": "flowGroup",
+      "type": "string",
+      "doc": "Flow group for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowName",
+      "type": "string",
+      "doc": "Flow name for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "flowExecutionId",
+      "type": "long",
+      "doc": "Flow execution id for the GaaS flow",
+      "compliance": "NONE"
+    },
+    {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis since Epoch when the flow config was last modified"
+    },
+    {
+      "name": "flowGraphEdgeId",
+      "type": "string",
+      "doc": "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "COMPILATION_FAILURE",
+          "SUBMISSION_FAILURE",
+          "EXEUCTION_FAILURE",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    },
+    {
+      "name": "jobOrchestratedTime",
+      "type": [
+        "null",
+        "long"
+      ],
+      "doc": "Timestamp when the job was successfully sent to the job executor, null if it was unable to be sent."
+    },
+    {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis since Epoch",
+      "compliance": "NONE"
+    },
+    {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis since Epoch",
+      "compliance": "NONE"
+    },
+    {
+      "name": "executionUserUrn",
+      "type": [
+        "null",
+        "string"
+      ],
+      "doc": "User URN (if applicable) that runs the underlying Gobblin job",
+      "compliance": "NONE"
+    },
+    {
+      "name": "executorUrl",
+      "type": [
+        "null",

Review Comment:
   having a null type check is sometimes extra layer of confusion. Do we prefer this over case of empty string for the value if no urn or executor url?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041545539


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   I'll set the executor url as a union of null, and the executor ID to be the executor that it would have sent to, to detect edges if they are failing to be sent to an executor.
   
   I also wonder what to default values to if a scheduled flow fails to compile due to a flowgraph failure. It would have no executors associated with it



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] codecov-commenter commented on pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#issuecomment-1340130841

   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3610?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3610](https://codecov.io/gh/apache/gobblin/pull/3610?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (979e819) into [master](https://codecov.io/gh/apache/gobblin/commit/c3d1ba89249a739547c464c1bb4b43126849ee71?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c3d1ba8) will **decrease** coverage by `3.14%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3610      +/-   ##
   ============================================
   - Coverage     46.89%   43.75%   -3.15%     
   + Complexity    10673     2062    -8611     
   ============================================
     Files          2120      408    -1712     
     Lines         83080    17618   -65462     
     Branches       9254     2151    -7103     
   ============================================
   - Hits          38962     7709   -31253     
   + Misses        40548     9051   -31497     
   + Partials       3570      858    -2712     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3610?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...a/org/apache/gobblin/cluster/GobblinHelixTask.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpbkhlbGl4VGFzay5qYXZh) | `64.51% <0.00%> (-19.36%)` | :arrow_down: |
   | [...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=) | `92.85% <0.00%> (-7.15%)` | :arrow_down: |
   | [...c/main/java/org/apache/gobblin/util/PathUtils.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvUGF0aFV0aWxzLmphdmE=) | `11.76% <0.00%> (-1.57%)` | :arrow_down: |
   | [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `62.22% <0.00%> (-1.41%)` | :arrow_down: |
   | [.../org/apache/gobblin/metastore/MysqlStateStore.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tZXRhc3RvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0YXN0b3JlL015c3FsU3RhdGVTdG9yZS5qYXZh) | `8.09% <0.00%> (-1.11%)` | :arrow_down: |
   | [...rg/apache/gobblin/util/jdbc/DataSourceBuilder.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvamRiYy9EYXRhU291cmNlQnVpbGRlci5qYXZh) | `0.00% <0.00%> (ø)` | |
   | [...g/apache/gobblin/util/jdbc/DataSourceProvider.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvamRiYy9EYXRhU291cmNlUHJvdmlkZXIuamF2YQ==) | `0.00% <0.00%> (ø)` | |
   | [...ache/gobblin/metastore/MysqlDataSourceFactory.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tZXRhc3RvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0YXN0b3JlL015c3FsRGF0YVNvdXJjZUZhY3RvcnkuamF2YQ==) | `83.33% <0.00%> (ø)` | |
   | [...obblin/metastore/JobHistoryDataSourceProvider.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tZXRhc3RvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0YXN0b3JlL0pvYkhpc3RvcnlEYXRhU291cmNlUHJvdmlkZXIuamF2YQ==) | `62.50% <0.00%> (ø)` | |
   | [.../profile/ManagedIcebergCleanableDatasetFinder.java](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3JldGVudGlvbi9wcm9maWxlL01hbmFnZWRJY2ViZXJnQ2xlYW5hYmxlRGF0YXNldEZpbmRlci5qYXZh) | | |
   | ... and [1711 more](https://codecov.io/gh/apache/gobblin/pull/3610/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] phet commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
phet commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1041842571


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,162 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventV0",
+  "namespace" : "org.apache.gobblin.metrics",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobSentToExecutor",
+      "type": "boolean",
+      "doc": "Whether or not this job was able to be sent to a job executor."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "edgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobType",
+      "type": {
+        "type": "enum",
+        "name": "JobType",
+        "symbols": [
+          "COPY",
+          "RETENTION",
+          "GOBBLIN"
+        ],
+        "symbolDocs": {
+          "COPY": "Gobblin distcp job",
+          "RETENTION": "Gobblin retention job",
+          "GOBBLIN": "Any Gobblin job"
+        }
+      },
+      "doc": "Gobblin job type running on GaaS, determined by the compiled job template.",
+      "compliance": "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",
+      "compliance": "NONE"
+    }, {
+      "name": "executorLink",
+      "type": "string",
+      "doc": "Link to where the job is running, currently limited to Azkaban",
+      "compliance": "NONE"
+    }, {
+      "name": "executorId",
+      "type": "string",

Review Comment:
   great Q on compilation failure! probably both ought to be null.
   
   compilation error events should be purely for *scheduled flows that previously compiled*--failure at submission time should be an interactive error only, no event. nonetheless, given we need them, perhaps we're asking `jobStatus.FAILURE` to do too much: it's already execution failure as well as failure to send to executor--now compilation failure as well? it becomes a status *category*, a catch-all. better might be another status, like `INVALID` or else named `COMPILATION_FAILURE` / `UNRESOLVED` (?).
   
   in fact, more I consider that, I'd seek also to discern `EXECUTION_FAILURE` from `SUBMISSION_FAILURE` (actually I dislike the overloaded terminology, where the user 'submits' a flow to gaas, and gaas 'submits' the job to the executor.) other possible terms might be `QUEUEING_FAILURE` or `LOST` or `UNAVAILABLE`, even `SKIPPED` if we consider possible quota overrun...  but overall, whatever the names, three distinct statuses to parallel the kinds of failure
   
   I realize that would depart from borrowing a status previously defined in the gaas job state .pdl, but faced w/ the alternative imprecision, I believe that justifiable
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [gobblin] Will-Lo commented on a diff in pull request #3610: [GOBBLIN-1750] Add schemas for observability events in GaaS

Posted by GitBox <gi...@apache.org>.
Will-Lo commented on code in PR #3610:
URL: https://github.com/apache/gobblin/pull/3610#discussion_r1042556772


##########
gobblin-metrics-libs/gobblin-metrics-base/src/main/avro/GaaSObservabilityEventV0.avsc:
##########
@@ -0,0 +1,135 @@
+{
+  "type": "record",
+  "name": "GaaSObservabilityEventExperimental",
+  "namespace" : "org.apache.gobblin.metrics",
+  "doc": "An experimental feature for GaaS to emit events during and after a job is executed.",
+  "fields": [
+    {
+      "name": "timestamp",
+      "type": "long",
+      "doc": "Time at which event was created in millis"
+    }, {
+      "name" : "flowGroup",
+      "type" : "string",
+      "doc" : "Flow group for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowName",
+      "type" : "string",
+      "doc" : "Flow name for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name" : "flowExecutionId",
+      "type" : "long",
+      "doc" : "Flow execution id for the GaaS flow",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobOrchestratedTime",
+      "type": "long",
+      "doc": "Timestamp when the job was successfully sent to the job executor, -1 if it was unable to be sent."
+    }, {
+      "name": "lastFlowModificationTime",
+      "type": "long",
+      "doc": "Timestamp in millis when the flow config was last modified"
+    }, {
+      "name" : "flowGraphEdgeId",
+      "type" : "string",
+      "doc" : "Flow edge id, in format <src_node>_<dest_node>_<edge_id>",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobName",
+      "type": "string",
+      "doc": "The name of the Gobblin job, found in the job template. One edge can contain multiple jobs",
+      "compliance" : "NONE"
+    }, {
+      "name": "jobStatus",
+      "type": {
+        "type": "enum",
+        "name": "JobStatus",
+        "symbols": [
+          "SUCCEEDED",
+          "FAILED",
+          "CANCELLED"
+        ],
+        "doc": "Final job status for this job in the GaaS flow",
+        "compliance": "NONE"
+      }
+    }, {
+      "name": "jobStartTime",
+      "type": "long",
+      "doc": "Start time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "jobEndTime",
+      "type": "long",
+      "doc": "Finish time of the job in millis",
+      "compliance": "NONE"
+    }, {
+      "name": "proxyUser",
+      "type": "string",
+      "doc": "Proxy user (if applicable) that ran the Gobblin job",

Review Comment:
   What about GaaS executors that take no proxy user? It's an Azkaban/HDFS concept.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org