You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/01/08 23:12:10 UTC

[GitHub] [druid] jihoonson commented on a change in pull request #10677: Add URI based allowPrefexList and denyPrefixList

jihoonson commented on a change in pull request #10677:
URL: https://github.com/apache/druid/pull/10677#discussion_r554239833



##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java
##########
@@ -153,6 +154,9 @@
   @JsonIgnore
   private final RetryPolicyFactory retryPolicyFactory;
 
+  @JsonIgnore
+  private final InputSourceSecurityConfig securityConfig;

Review comment:
       Why not `InputSourceSecurityConfig.ALLOW_ALL`? The system-wide securityConfig is ignored in DruidInputSource anyway.

##########
File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/IndexTask.java
##########
@@ -1044,17 +1048,21 @@ public IndexIOConfig(
       if (firehoseFactory != null && inputFormat != null) {
         throw new IAE("Cannot use firehose and inputFormat together. Try using inputSource instead of firehose.");
       }
+      if (inputSource != null) {
+        inputSource.validateAllowDenyPrefixList(securityConfig);

Review comment:
       I guess you wanted to validate inputSource in the constructor to fail early. However, this seems possible to cause some issues because it will throw exception on validation failure, which in turn making creating an ioConfig instance failed. For example, suppose you had an ingestion spec stored in the metadata store. Then, for some reason, you wanted to update the security config to forbid some URIs which are in the ingestion spec in the metadata store. The update will break reading ingestion specs from metadata store which happens in the Overlord. It will also unnecessarily perform the duplicate check multiple times during ingestion. I think this should be called somewhere in the code path of ingestion, such as `Task.isReady()`.

##########
File path: extensions-core/hdfs-storage/src/main/java/org/apache/druid/inputsource/hdfs/HdfsInputSource.java
##########
@@ -214,10 +215,23 @@ public boolean needsFormat()
   private void cachePathsIfNeeded() throws IOException
   {
     if (cachedPaths == null) {
-      cachedPaths = ImmutableList.copyOf(Preconditions.checkNotNull(getPaths(inputPaths, configuration), "paths"));
+      Collection<Path> paths = Preconditions.checkNotNull(getPaths(inputPaths, configuration), "paths");
+      cachedPaths = ImmutableList.copyOf(paths);
     }
   }
 
+  @Override
+  public void validateAllowDenyPrefixList(InputSourceSecurityConfig securityConfig)
+  {
+    try {
+      cachePathsIfNeeded();

Review comment:
       I think you won't probably need to cache paths since this will unnecessarily populate the cache in the Overlord.

##########
File path: core/src/main/java/org/apache/druid/data/input/AbstractInputSource.java
##########
@@ -31,13 +31,16 @@
  */
 public abstract class AbstractInputSource implements InputSource
 {
+  private transient boolean validated = false;

Review comment:
       Why `transient`? We don't use `Serializable`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org