You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@gobblin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/03/23 01:14:00 UTC

[jira] [Work logged] (GOBBLIN-1627) provide option to change data node names

     [ https://issues.apache.org/jira/browse/GOBBLIN-1627?focusedWorklogId=746289&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-746289 ]

ASF GitHub Bot logged work on GOBBLIN-1627:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Mar/22 01:13
            Start Date: 23/Mar/22 01:13
    Worklog Time Spent: 10m 
      Work Description: phet commented on a change in pull request #3484:
URL: https://github.com/apache/gobblin/pull/3484#discussion_r832745283



##########
File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/flowgraph/pathfinder/AbstractPathFinder.java
##########
@@ -155,6 +159,26 @@
     this.destDatasetDescriptor = DatasetDescriptorUtils.constructDatasetDescriptor(destDatasetDescriptorConfig);
   }
 
+  static List<String> getDataNodes(Config flowConfig, String identifierKey, Config config) {

Review comment:
       nit: is `config` truly the `sysConfig`?  good to contextualize `Config flowConfig`; other `Config` deserves the same

##########
File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/flowgraph/pathfinder/AbstractPathFinder.java
##########
@@ -155,6 +159,26 @@
     this.destDatasetDescriptor = DatasetDescriptorUtils.constructDatasetDescriptor(destDatasetDescriptorConfig);
   }
 
+  static List<String> getDataNodes(Config flowConfig, String identifierKey, Config config) {
+    List<String> dataNodes = ConfigUtils.getStringList(flowConfig, identifierKey);
+    return dataNodes.stream().map(dataNode -> convertDataNode(dataNode, config)).collect(Collectors.toList());
+  }
+
+  private static String getDataNode(Config flowConfig, String identifierKey, Config config) {
+    String dataNode = ConfigUtils.getString(flowConfig, identifierKey, "");
+    return convertDataNode(dataNode, config);
+  }
+
+  protected static String convertDataNode(String dataNode, Config config) {

Review comment:
       nit: semantically, this is not node 'conversion', but node ID aliasing.  suggest: more accurate terminology

##########
File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/flowgraph/pathfinder/AbstractPathFinder.java
##########
@@ -77,18 +79,20 @@
   protected Long flowExecutionId;
   protected FlowSpec flowSpec;
   protected Config flowConfig;
+  protected Config sysConfig;
 
-  AbstractPathFinder(FlowGraph flowGraph, FlowSpec flowSpec)
+  AbstractPathFinder(FlowGraph flowGraph, FlowSpec flowSpec, Config config)
       throws ReflectiveOperationException {
     this.flowGraph = flowGraph;
     this.flowSpec = flowSpec;
     this.flowExecutionId = FlowUtils.getOrCreateFlowExecutionId(flowSpec);
     this.flowConfig = flowSpec.getConfig().withValue(ConfigurationKeys.FLOW_EXECUTION_ID_KEY, ConfigValueFactory.fromAnyRef(flowExecutionId));
+    this.sysConfig = config;
 
     //Get src/dest DataNodes from the flow config
-    String srcNodeId = ConfigUtils.getString(flowConfig, ServiceConfigKeys.FLOW_SOURCE_IDENTIFIER_KEY, "");
+    String srcNodeId = getDataNode(flowConfig, ServiceConfigKeys.FLOW_SOURCE_IDENTIFIER_KEY, this.sysConfig);
+    List<String> destNodeIds = getDataNodes(flowConfig, ServiceConfigKeys.FLOW_DESTINATION_IDENTIFIER_KEY, this.sysConfig);

Review comment:
       could these calls now throw?  e.g. the splitter can't make a `Map`.  if so, that would fail the instance init.  would be good to explicitly catch and compose a clear message.
   
   anyway, this is system config.  why introduce that to a class that doesn't otherwise need to inspect it?  couldn't we construct the map from `Config sysConfig` in the caller, such as the `MultiHopFlowCompiler`, then pass something definite and strongly typed in here?  (hint: adopt Null Pattern of an empty `Map`, as needed for simplicity.)

##########
File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/flowgraph/pathfinder/AbstractPathFinder.java
##########
@@ -60,6 +61,7 @@
 public abstract class AbstractPathFinder implements PathFinder {
   private static final String SOURCE_PREFIX = "source";
   private static final String DESTINATION_PREFIX = "destination";
+  public static final String DATANODE_CONVERSTION_MAP = ServiceConfigKeys.GOBBLIN_SERVICE_PREFIX + "datanode.conversion.map";

Review comment:
       misspelling on the constant, although config key is fine.
   
   more importantly, please document the format of the mappings.  you could do this through the name--e.g. `DATA_NODE_ID_TO_ALIAS_MAP`-- or through a code comment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@gobblin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 746289)
    Remaining Estimate: 0h
            Time Spent: 10m

> provide option to change data node names
> ----------------------------------------
>
>                 Key: GOBBLIN-1627
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1627
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Arjun Singh Bora
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> data node names may need some conversion, e.g. we may only have one node Node1, but it may be referred by multiple names Node1-dev, Node1-prod, so I am providing a config to supply a map for all this conversation.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)