You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/02/13 01:53:17 UTC

[GitHub] [druid] zachjsh opened a new pull request #9356: Add Azure config options for segment prefix and max listing length

zachjsh opened a new pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356
 
 
   Added configuration options to allow the user to specify the prefix
   within the segment container to store the segment files. Also
   added a configuration option to allow the user to specify the
   maximum number of input files to stream for each iteration.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist above are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   <hr>

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380844011
 
 

 ##########
 File path: extensions-core/google-extensions/src/test/java/org/apache/druid/data/input/google/GoogleCloudStorageInputSourceTest.java
 ##########
 @@ -169,14 +175,18 @@ public void testWithPrefixesSplit() throws IOException
   public void testReader() throws IOException
   {
     EasyMock.reset(STORAGE);
+    EasyMock.reset(CONFIG);
     addExpectedPrefixObjects(PREFIXES.get(0), ImmutableList.of(EXPECTED_URIS.get(0)));
     addExpectedGetObjectMock(EXPECTED_URIS.get(0));
     addExpectedPrefixObjects(PREFIXES.get(1), ImmutableList.of(EXPECTED_URIS.get(1)));
     addExpectedGetObjectMock(EXPECTED_URIS.get(1));
+    EasyMock.expect(CONFIG.getMaxListingLength()).andReturn(EXPECTED_MAX_LISTING_LENGTH);
     EasyMock.replay(STORAGE);
+    EasyMock.replay(CONFIG);
 
 Review comment:
   nit: looks like this is repeated in multiple tests, maybe move to a helper function?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jon-wei commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jon-wei commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r382732277
 
 

 ##########
 File path: docs/development/extensions-contrib/azure.md
 ##########
 @@ -29,68 +29,15 @@ To use this Apache Druid extension, make sure to [include](../../development/ext
 
 [Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional Druid configuration.
 
-|Property|Possible Values|Description|Default|
+|Property|Description|Possible Values|Default|
 |--------|---------------|-----------|-------|
 |`druid.storage.type`|azure||Must be set.|
 |`druid.azure.account`||Azure Storage account name.|Must be set.|
 |`druid.azure.key`||Azure Storage account key.|Must be set.|
 |`druid.azure.container`||Azure Storage container name.|Must be set.|
-|`druid.azure.protocol`|http or https||https|
-|`druid.azure.maxTries`||Number of tries before cancel an Azure operation.|3|
+|`druid.azure.prefix`|prefix to use, i.e. what directory.| |""|
+|`druid.azure.protocol`|the protocol to use|http or https|https|
+|`druid.azure.maxTries`|Number of tries before cancel an Azure operation.| |3|
 
 Review comment:
   cancel -> canceling

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584046
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentPusher.java
 ##########
 @@ -85,7 +89,10 @@ public String getPathForHadoop()
   @Override
   public String getStorageDir(DataSegment dataSegment, boolean useUniquePath)
   {
+    String prefix = segmentConfig.getPrefix();
+    boolean prefixIsNullOrEmpty = (prefix == null || prefix.isEmpty());
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380835990
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentPusher.java
 ##########
 @@ -70,11 +73,12 @@ public String getPathForHadoop(String dataSource)
   public String getPathForHadoop()
   {
     String hadoopPath = StringUtils.format(
-        "%s://%s@%s.%s/",
-        AzureDataSegmentPuller.AZURE_STORAGE_HADOOP_PROTOCOL,
-        config.getContainer(),
-        config.getAccount(),
-        AzureDataSegmentPuller.AZURE_STORAGE_HOST_ADDRESS
+        "%s://%s@%s.%s/%s",
+        AzureUtils.AZURE_STORAGE_HADOOP_PROTOCOL,
+        segmentConfig.getContainer(),
+        accountConfig.getAccount(),
+        AzureUtils.AZURE_STORAGE_HOST_ADDRESS,
+        segmentConfig.getPrefix().isEmpty() ? "" : segmentConfig.getPrefix() + '/'
 
 Review comment:
   What if prefix ends with a `/` Is there a util that will build the path with only one separator at the end? Is there any harm if the path ends with two `/`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380838094
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureUtils.java
 ##########
 @@ -39,16 +39,25 @@
   @VisibleForTesting
   static final String AZURE_STORAGE_HOST_ADDRESS = "blob.core.windows.net";
 
+  // The azure storage hadoop access pattern is:
+  // wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path>
+  // (from https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-blob-storage)
+  static final String AZURE_STORAGE_HADOOP_PROTOCOL = "wasbs";
+
   public static final Predicate<Throwable> AZURE_RETRY = e -> {
-    if (e instanceof URISyntaxException) {
+    Throwable t = e;
+    for (Throwable t2 = e.getCause(); t2 != null; t2 = t2.getCause()) {
+      t = t2;
+    }
 
 Review comment:
   test for unraveling a stacktrace. Should we check an unlimited depth?
   
   This also changes the current behavior where if the top level throwable was a "retryable" exception, we'd retry, but with this change if a StorageException is caused by a RuntimeException we won't retry. Is this intentional?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380834869
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
 
 Review comment:
   Why `Nonnull` when the previous one is annotated with 'NotNull'?
   
   Also is prefix a required config? Why is assigned to an empty string?
   
   Perhaps using a `@JsonCreator` constructor with Precondition checks will make it clearer what is required in each field
   
   ```
    @JsonCreator
     public AzureDataSegmentConfig(
         ...
         @JsonProperty("prefix") String prefix)
     {
       this.prefix = Preconditions.checkState(!StringUtils.isEmpty(prefix), "prefix must be non empty");
       ...
     }
   ```
   
   Then you don't need all the setters

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380844347
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
 
 Review comment:
   prefix is not required. Before adding this option segments were written to the root directory within the segment container specified, in a directory named after the datasource. Do we want to change the behavior here and specify a non empty default prefix? I'm not sure how this change would affect users already using the azure extension whose data is already written, will we not be able to find the segment data in this case?
   
   I will fix to @NotNull

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584106
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureUtils.java
 ##########
 @@ -39,16 +39,25 @@
   @VisibleForTesting
   static final String AZURE_STORAGE_HOST_ADDRESS = "blob.core.windows.net";
 
+  // The azure storage hadoop access pattern is:
+  // wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path>
+  // (from https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-blob-storage)
+  static final String AZURE_STORAGE_HADOOP_PROTOCOL = "wasbs";
+
   public static final Predicate<Throwable> AZURE_RETRY = e -> {
-    if (e instanceof URISyntaxException) {
+    Throwable t = e;
+    for (Throwable t2 = e.getCause(); t2 != null; t2 = t2.getCause()) {
+      t = t2;
+    }
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381009243
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
+  private String prefix = "";
+
+  @JsonProperty
+  @Min(1)
+  private int maxListingLength = 1024;
 
 Review comment:
   How about AzureInputDataConfig?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380960445
 
 

 ##########
 File path: extensions-core/google-extensions/src/main/java/org/apache/druid/data/input/google/GoogleCloudStorageInputSource.java
 ##########
 @@ -43,17 +44,20 @@
   private static final int MAX_LISTING_LENGTH = 1024;
 
 Review comment:
   This variable is not used anymore.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381583825
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r382757664
 
 

 ##########
 File path: docs/development/extensions-contrib/azure.md
 ##########
 @@ -29,68 +29,15 @@ To use this Apache Druid extension, make sure to [include](../../development/ext
 
 [Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional Druid configuration.
 
-|Property|Possible Values|Description|Default|
+|Property|Description|Possible Values|Default|
 |--------|---------------|-----------|-------|
 |`druid.storage.type`|azure||Must be set.|
 |`druid.azure.account`||Azure Storage account name.|Must be set.|
 |`druid.azure.key`||Azure Storage account key.|Must be set.|
 |`druid.azure.container`||Azure Storage container name.|Must be set.|
-|`druid.azure.protocol`|http or https||https|
-|`druid.azure.maxTries`||Number of tries before cancel an Azure operation.|3|
+|`druid.azure.prefix`|prefix to use, i.e. what directory.| |""|
+|`druid.azure.protocol`|the protocol to use|http or https|https|
+|`druid.azure.maxTries`|Number of tries before cancel an Azure operation.| |3|
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#issuecomment-589334579
 
 
   LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380978597
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
+  private String prefix = "";
+
+  @JsonProperty
+  @Min(1)
+  private int maxListingLength = 1024;
 
 Review comment:
   Also please add docs for the new configurations.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380969325
 
 

 ##########
 File path: extensions-core/s3-extensions/src/main/java/org/apache/druid/data/input/s3/S3InputSource.java
 ##########
 @@ -87,6 +91,6 @@ public String toString()
 
   private Iterable<S3ObjectSummary> getIterableObjectsFromPrefixes()
   {
-    return () -> S3Utils.objectSummaryIterator(s3Client, getPrefixes(), MAX_LISTING_LENGTH);
+    return () -> S3Utils.objectSummaryIterator(s3Client, getPrefixes(), segmentPusherConfig.getMaxListingLength());
 
 Review comment:
   `MAX_LISTING_LENGTH` is defined in the parent class (`CloudObjectInputSource`) and is not used anymore. Please remove it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584360
 
 

 ##########
 File path: extensions-core/google-extensions/src/main/java/org/apache/druid/data/input/google/GoogleCloudStorageInputSource.java
 ##########
 @@ -43,17 +44,20 @@
   private static final int MAX_LISTING_LENGTH = 1024;
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380952595
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureByteSource.java
 ##########
 @@ -63,12 +63,12 @@ public InputStream openStream(long offset) throws IOException
       return azureStorage.getBlobInputStream(offset, containerName, blobPath);
     }
     catch (StorageException | URISyntaxException e) {
-      log.warn("Exception when opening stream to azure resource, containerName: %s, blobPath: %s, Error: %s",
-               containerName, blobPath, e.getMessage()
-      );
       if (AzureUtils.AZURE_RETRY.apply(e)) {
         throw new IOException("Recoverable exception", e);
       }
+      log.warn("Exception when opening stream to azure resource, containerName: %s, blobPath: %s, Error: %s",
 
 Review comment:
   Should the log level be error instead of warn?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380959673
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureUtils.java
 ##########
 @@ -39,16 +39,25 @@
   @VisibleForTesting
   static final String AZURE_STORAGE_HOST_ADDRESS = "blob.core.windows.net";
 
+  // The azure storage hadoop access pattern is:
+  // wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path>
+  // (from https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-use-blob-storage)
+  static final String AZURE_STORAGE_HADOOP_PROTOCOL = "wasbs";
+
   public static final Predicate<Throwable> AZURE_RETRY = e -> {
-    if (e instanceof URISyntaxException) {
+    Throwable t = e;
+    for (Throwable t2 = e.getCause(); t2 != null; t2 = t2.getCause()) {
+      t = t2;
+    }
 
 Review comment:
   I think the below `if` clauses should be checked in the above `for` loop.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584430
 
 

 ##########
 File path: extensions-core/s3-extensions/src/main/java/org/apache/druid/data/input/s3/S3InputSource.java
 ##########
 @@ -87,6 +91,6 @@ public String toString()
 
   private Iterable<S3ObjectSummary> getIterableObjectsFromPrefixes()
   {
-    return () -> S3Utils.objectSummaryIterator(s3Client, getPrefixes(), MAX_LISTING_LENGTH);
+    return () -> S3Utils.objectSummaryIterator(s3Client, getPrefixes(), segmentPusherConfig.getMaxListingLength());
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jon-wei commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jon-wei commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r382731857
 
 

 ##########
 File path: docs/development/extensions-contrib/azure.md
 ##########
 @@ -29,68 +29,15 @@ To use this Apache Druid extension, make sure to [include](../../development/ext
 
 [Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional Druid configuration.
 
-|Property|Possible Values|Description|Default|
+|Property|Description|Possible Values|Default|
 |--------|---------------|-----------|-------|
 |`druid.storage.type`|azure||Must be set.|
 |`druid.azure.account`||Azure Storage account name.|Must be set.|
 |`druid.azure.key`||Azure Storage account key.|Must be set.|
 |`druid.azure.container`||Azure Storage container name.|Must be set.|
-|`druid.azure.protocol`|http or https||https|
-|`druid.azure.maxTries`||Number of tries before cancel an Azure operation.|3|
+|`druid.azure.prefix`|prefix to use, i.e. what directory.| |""|
 
 Review comment:
   Suggest:
   
   "A prefix string that will be prepended to the blob names for the segments published to Azure deep storage"

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381583921
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentPusher.java
 ##########
 @@ -70,11 +73,12 @@ public String getPathForHadoop(String dataSource)
   public String getPathForHadoop()
   {
     String hadoopPath = StringUtils.format(
-        "%s://%s@%s.%s/",
-        AzureDataSegmentPuller.AZURE_STORAGE_HADOOP_PROTOCOL,
-        config.getContainer(),
-        config.getAccount(),
-        AzureDataSegmentPuller.AZURE_STORAGE_HOST_ADDRESS
+        "%s://%s@%s.%s/%s",
+        AzureUtils.AZURE_STORAGE_HADOOP_PROTOCOL,
+        segmentConfig.getContainer(),
+        accountConfig.getAccount(),
+        AzureUtils.AZURE_STORAGE_HOST_ADDRESS,
+        segmentConfig.getPrefix().isEmpty() ? "" : segmentConfig.getPrefix() + '/'
 
 Review comment:
   good catch! fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380844447
 
 

 ##########
 File path: extensions-core/google-extensions/src/test/java/org/apache/druid/data/input/google/GoogleCloudStorageInputSourceTest.java
 ##########
 @@ -66,8 +67,9 @@
 
 public class GoogleCloudStorageInputSourceTest extends InitializedNullHandlingTest
 {
-  private static final long EXPECTED_MAX_LISTING_LENGTH = 1024L;
+  private static final int EXPECTED_MAX_LISTING_LENGTH = 10;
 
 Review comment:
   nit: MAX_LISTING_LENGTH since we're mocking the maxListingLength() to this value

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r382757746
 
 

 ##########
 File path: docs/development/extensions-core/google.md
 ##########
 @@ -34,6 +34,13 @@ To use this Apache Druid extension, make sure to [include](../../development/ext
 
 To configure connectivity to google cloud, run druid processes with `GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_keyfile` in the environment.
 
+|Property|Description|Possible Values|Default|
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jon-wei commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jon-wei commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r382755361
 
 

 ##########
 File path: docs/development/extensions-core/google.md
 ##########
 @@ -34,6 +34,13 @@ To use this Apache Druid extension, make sure to [include](../../development/ext
 
 To configure connectivity to google cloud, run druid processes with `GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_keyfile` in the environment.
 
+|Property|Description|Possible Values|Default|
 
 Review comment:
   Hm, this "Required Configuration" and the "Configuration" section that starts at line 56 should probably be merged, the new wording you have is better so let's use that

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584232
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureByteSource.java
 ##########
 @@ -63,12 +63,12 @@ public InputStream openStream(long offset) throws IOException
       return azureStorage.getBlobInputStream(offset, containerName, blobPath);
     }
     catch (StorageException | URISyntaxException e) {
-      log.warn("Exception when opening stream to azure resource, containerName: %s, blobPath: %s, Error: %s",
-               containerName, blobPath, e.getMessage()
-      );
       if (AzureUtils.AZURE_RETRY.apply(e)) {
         throw new IOException("Recoverable exception", e);
       }
+      log.warn("Exception when opening stream to azure resource, containerName: %s, blobPath: %s, Error: %s",
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r382757696
 
 

 ##########
 File path: docs/development/extensions-contrib/azure.md
 ##########
 @@ -29,68 +29,15 @@ To use this Apache Druid extension, make sure to [include](../../development/ext
 
 [Microsoft Azure Storage](http://azure.microsoft.com/en-us/services/storage/) is another option for deep storage. This requires some additional Druid configuration.
 
-|Property|Possible Values|Description|Default|
+|Property|Description|Possible Values|Default|
 |--------|---------------|-----------|-------|
 |`druid.storage.type`|azure||Must be set.|
 |`druid.azure.account`||Azure Storage account name.|Must be set.|
 |`druid.azure.key`||Azure Storage account key.|Must be set.|
 |`druid.azure.container`||Azure Storage container name.|Must be set.|
-|`druid.azure.protocol`|http or https||https|
-|`druid.azure.maxTries`||Number of tries before cancel an Azure operation.|3|
+|`druid.azure.prefix`|prefix to use, i.e. what directory.| |""|
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381009243
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
+  private String prefix = "";
+
+  @JsonProperty
+  @Min(1)
+  private int maxListingLength = 1024;
 
 Review comment:
   How about AzureInputDataConfig? And similar classes for AWS and Google

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#issuecomment-587928350
 
 
   Added "Design Review" since this PR adds a new user-facing configuration.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584292
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
+  private String prefix = "";
+
+  @JsonProperty
+  @Min(1)
+  private int maxListingLength = 1024;
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380961472
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
+  private String prefix = "";
+
+  @JsonProperty
+  @Min(1)
+  private int maxListingLength = 1024;
 
 Review comment:
   Same for other cloud storage types.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jon-wei merged pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jon-wei merged pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
zachjsh commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r381584158
 
 

 ##########
 File path: extensions-core/google-extensions/src/test/java/org/apache/druid/data/input/google/GoogleCloudStorageInputSourceTest.java
 ##########
 @@ -66,8 +67,9 @@
 
 public class GoogleCloudStorageInputSourceTest extends InitializedNullHandlingTest
 {
-  private static final long EXPECTED_MAX_LISTING_LENGTH = 1024L;
+  private static final int EXPECTED_MAX_LISTING_LENGTH = 10;
 
 Review comment:
   fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
suneet-s commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380836870
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentPusher.java
 ##########
 @@ -85,7 +89,10 @@ public String getPathForHadoop()
   @Override
   public String getStorageDir(DataSegment dataSegment, boolean useUniquePath)
   {
+    String prefix = segmentConfig.getPrefix();
+    boolean prefixIsNullOrEmpty = (prefix == null || prefix.isEmpty());
 
 Review comment:
   `org.apache.commons.lang.StringUtils.isEmpty(prefix)`

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length

Posted by GitBox <gi...@apache.org>.
jihoonson commented on a change in pull request #9356: Add Azure config options for segment prefix and max listing length
URL: https://github.com/apache/druid/pull/9356#discussion_r380955615
 
 

 ##########
 File path: extensions-contrib/azure-extensions/src/main/java/org/apache/druid/storage/azure/AzureDataSegmentConfig.java
 ##########
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.storage.azure;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import javax.annotation.Nonnull;
+import javax.validation.constraints.Min;
+import javax.validation.constraints.NotNull;
+
+/**
+ * Stores the configuration for segments written to Azure deep storage
+ */
+public class AzureDataSegmentConfig
+{
+  @JsonProperty
+  @NotNull
+  private String container;
+
+  @JsonProperty
+  @Nonnull
+  private String prefix = "";
+
+  @JsonProperty
+  @Min(1)
+  private int maxListingLength = 1024;
 
 Review comment:
   I think this should be in a separate class rather than being in the class for deep storage configuration. I would suggest to add a new class `AzureReadConfig` (I think there could be a better name) that has the new configuration only, so that we can add more read-related configurations in the future.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org