Posted to dev@gobblin.apache.org by GitBox <gi...@apache.org> on 2020/01/30 04:46:39 UTC

[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2877: [GOBBLIN-1035] make hive dataset descriptor accepts regexed db and tables

sv2000 commented on a change in pull request #2877: [GOBBLIN-1035] make hive dataset descriptor accepts regexed db and tables
URL: https://github.com/apache/incubator-gobblin/pull/2877#discussion_r372757547
 
 

 ##########
 File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/HiveDatasetDescriptor.java
 ##########
 @@ -69,12 +70,21 @@ public HiveDatasetDescriptor(Config config) throws IOException {
         .withValue(CONFLICT_POLICY, ConfigValueFactory.fromAnyRef(conflictPolicy))
         .withValue(PARTITION_COLUMN, ConfigValueFactory.fromAnyRef(partitionColumn))
         .withValue(PARTITION_FORMAT, ConfigValueFactory.fromAnyRef(partitionFormat))
-        .withValue(WHITELIST_TABLES, ConfigValueFactory.fromAnyRef(createWhitelistedTables())
+        .withValue(HiveDatasetFinder.HIVE_DATASET_PREFIX + "." + WhitelistBlacklist.WHITELIST,
+            ConfigValueFactory.fromAnyRef(createWhitelistedTables())
         ));
   }
 
-  private String createWhitelistedTables() {
-    return this.tableName.replace(',', '|');
+  // If the db name contains wildcards, whitelist is created <regex_db>.*
+  // Otherwise, whitelist is created as <db>.tables.
+  // This is the format which HiveDatasetFinder understands.
+  // e.g. db=ei_tracking, table=zephyr*,abook*, whitelist will ei_tracking.zephyr*|abook*
 
 Review comment:
   Your Javadoc is a little confusing to read. Do you mean that, for an input with db = ei_tracking and table = zephyr*,abook*, the method will return the whitelist pattern ei_tracking.zephyr*|abook*?
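
   To make the question concrete, here is a minimal sketch of the behaviour the comment seems to describe. It is not the code from this pull request: the class name `WhitelistSketch`, the static helper `createWhitelist`, and the rule that a "wildcard" db name is one containing `*` are all assumptions made for illustration. Only the `replace(',', '|')` step is taken from the original method shown in the diff.

```java
public class WhitelistSketch {

  /**
   * Hypothetical helper (not the PR's actual method) that builds a whitelist
   * string in the <db>.<tables> form described by the comment in the diff.
   *
   * Assumption: a db name counts as "regexed" when it contains '*'.
   */
  static String createWhitelist(String dbName, String tableNames) {
    if (dbName.contains("*")) {
      // Wildcard db: whitelist every table in the matching databases.
      return dbName + ".*";
    }
    // Plain db: join the comma-separated table patterns with '|',
    // as the original createWhitelistedTables() did.
    return dbName + "." + tableNames.replace(',', '|');
  }

  public static void main(String[] args) {
    // Example from the review comment.
    System.out.println(createWhitelist("ei_tracking", "zephyr*,abook*"));
    // prints "ei_tracking.zephyr*|abook*"
  }
}
```

   If that matches the intent, a Javadoc along the lines of "for db = ei_tracking and table = zephyr*,abook*, returns ei_tracking.zephyr*|abook*" would state the input and output explicitly.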
