Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/10 19:32:35 UTC

[GitHub] [druid] maytasm3 opened a new pull request #9501: Adding s3, gcs, azure integration tests

maytasm3 opened a new pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501
 
 
   Adding s3, gcs, azure integration tests
   
   ### Description
   
   Adding s3, gcs, azure integration tests which can be run BYO (bring your own) cloud style.
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist above are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `MyFoo`
    * `OurBar`
    * `TheirBaz`
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r392549380
 
 

 ##########
 File path: integration-tests/docker/druid.sh
 ##########
 @@ -73,4 +75,23 @@ setupConfig() {
       var=$(echo "$evar" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g')
       setKey $DRUID_SERVICE "$var" "$val"
   done
-}
\ No newline at end of file
+}
+
+setupData()
+{
+  # The "query" and "security" test groups require data to be set up before running the tests.
+  # In particular, they require segments to be downloaded from a pre-existing s3 bucket.
+  # This is done by using the loadSpec put into the metadata store and the s3 credentials set below.
+  if [ "$DRUID_INTEGRATION_TEST_GROUP" = "query" ] || [ "$DRUID_INTEGRATION_TEST_GROUP" = "security" ]; then
 
 Review comment:
   Hmm, this doesn't necessarily need to be changed now, but longer term I think we are going to need a mapping of test groups to configurations, though I'm not sure it should be encoded in a file in the container. This part could maybe be repurposed to run initialization scripts that configurations bring with them to the container, with this logic moved into a script that initializes the legacy integration test environment.
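
The initialization-script idea could look roughly like the sketch below. It is only an illustration and not part of this PR: the `run_init_scripts` function and the directory layout are hypothetical. The assumption is that each configuration ships a directory of `*.sh` files and the container entrypoint runs them in name order.

```shell
# Hypothetical sketch: run every *.sh file that a configuration ships in its
# init directory, in lexical order. Neither the function name nor the
# directory layout comes from the PR.
run_init_scripts() {
  init_dir="$1"
  # Nothing to do when the configuration brings no init directory.
  [ -d "$init_dir" ] || return 0
  for script in "$init_dir"/*.sh; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$script" ] || continue
    echo "running init script: $script"
    sh "$script" || return 1
  done
}

# Usage with a throwaway directory standing in for a configuration.
demo_dir=$(mktemp -d)
printf 'echo "load query segments"\n' > "$demo_dir/01-data.sh"
printf 'echo "write credentials"\n' > "$demo_dir/02-creds.sh"
run_init_scripts "$demo_dir"
rm -rf "$demo_dir"
```

With something like this, the group check in `setupData` could move into a script that only the legacy environment brings along.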



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393220762
 
 

 ##########
 File path: integration-tests/src/test/java/org/apache/druid/tests/indexer/ITAzureParallelIndexTest.java
 ##########
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.indexer;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.indexer.partitions.DynamicPartitionsSpec;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.io.Closeable;
+import java.util.List;
+import java.util.UUID;
+import java.util.function.Function;
+
+/**
+ * IMPORTANT:
+ * To run this test, you must:
+ * 1) Set the variables {@link ITAzureParallelIndexTest#CONTAINER} and {@link ITAzureParallelIndexTest#PATH} for your data
+ * 2) Copy wikipedia_index_data1.json, wikipedia_index_data2.json, and wikipedia_index_data3.json
+ *    located in integration-tests/src/test/resources/data/batch_index to your Azure at the location set in step 1.
+ * 3) Provide -Doverride.config.path=<PATH_TO_FILE> with Azure credentials/configs set. See
+ *    integration-tests/docker/environment-configs/override-examples/azure for env vars to provide.
+ */
+@Test(groups = TestNGGroup.AZURE_DEEP_STORAGE)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITAzureParallelIndexTest extends AbstractITBatchIndexTest
+{
+  // START: Change this with the configs for your Azure data
+  private static final String CONTAINER = "my-container";
 
 Review comment:
   Change these to configs. 



[GitHub] [druid] clintropolis merged pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
clintropolis merged pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501
 
 
   



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393220590
 
 

 ##########
 File path: integration-tests/docker/druid.sh
 ##########
 @@ -73,4 +76,20 @@ setupConfig() {
       var=$(echo "$evar" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g')
       setKey $DRUID_SERVICE "$var" "$val"
   done
-}
\ No newline at end of file
+}
+
+setupData()
 
 Review comment:
   Added comments indicating the above.



[GitHub] [druid] clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r392548919
 
 

 ##########
 File path: integration-tests/docker/environment-configs/override-examples/azure
 ##########
 @@ -0,0 +1,28 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+#
+# Example of override config file to provide.
+# Please replace <OVERRIDE_THIS> with your cloud configs/credentials
+#
+druid_storage_type=azure
+druid_azure_account=<OVERRIDE_THIS>
+druid_azure_key=<OVERRIDE_THIS>
+druid_azure_container=<OVERRIDE_THIS>
+druid_extensions_loadList=["druid-azure-extensions"]
 
 Review comment:
   This approach of using overrides seems like it is going to be somewhat brittle and hard to maintain. For example, if I want to ingest data from s3 into hdfs deep storage, or use any other combination of extensions, I'm not sure how well this approach will hold up. I guess we'll need separate override files for each configuration? That said, maybe it is fine until we determine a better solution to integration test config management.
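
For context on why the override file uses underscore-separated keys: the `setupConfig` hunk of `druid.sh` quoted in this thread derives a property name from each environment entry with two `sed` expressions. The sketch below wraps those same expressions in a helper; the name `to_property_name` is made up for illustration, and the sample values are placeholders.

```shell
# Same sed expressions as the setupConfig hunk in druid.sh: keep the text
# before the first "=", then turn every "_" into ".".
# The wrapper function name is hypothetical.
to_property_name() {
  echo "$1" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g'
}

to_property_name 'druid_storage_type=azure'            # prints druid.storage.type
to_property_name 'druid_azure_container=my-container'  # prints druid.azure.container
```

One consequence of this scheme is that every underscore in a key becomes a dot, so a property whose name itself contains an underscore cannot be expressed this way.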



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: [WIP] Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: [WIP] Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r390618863
 
 

 ##########
 File path: integration-tests/docker/druid.sh
 ##########
 @@ -73,4 +76,20 @@ setupConfig() {
       var=$(echo "$evar" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g')
       setKey $DRUID_SERVICE "$var" "$val"
   done
-}
\ No newline at end of file
+}
+
+setupData()
 
 Review comment:
   This will work for how our Travis CI is set up, but if a user tries running manually locally with...
   1) -DexcludedGroups=not-query
   2) without -Dgroups and without -DexcludedGroups
   then the data won't be set up.
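
The gap can be illustrated with a small sketch. It assumes the check in `setupData` is an exact string match on the group value, as in the `if` line of the hunk quoted elsewhere in this thread, and that the value is empty when no `-Dgroups` is passed. The helper name `needs_data_setup` and the `batch-index` group are illustrative only.

```shell
# Mirrors the condition in setupData: data is prepared only when the group
# value is exactly "query" or "security". The function name is hypothetical.
needs_data_setup() {
  [ "$1" = "query" ] || [ "$1" = "security" ]
}

# The empty string stands in for a run without -Dgroups (an assumption).
for group in query security batch-index ""; do
  if needs_data_setup "$group"; then
    echo "group='$group': data is set up"
  else
    echo "group='$group': data is NOT set up"
  fi
done
```

A run that excludes groups instead of selecting one falls into the same branch as the empty value, which is the case described in 1) and 2).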



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393264893
 
 

 ##########
 File path: integration-tests/run_cluster.sh
 ##########
 @@ -47,12 +47,25 @@
   # Make directories if they dont exist
   mkdir -p $SHARED_DIR/logs
   mkdir -p $SHARED_DIR/tasklogs
+  mkdir -p $SHARED_DIR/docker/extensions
+  mkdir -p $SHARED_DIR/docker/credentials
 
   # install druid jars
   rm -rf $SHARED_DIR/docker
   cp -R docker $SHARED_DIR/docker
   mvn -B dependency:copy-dependencies -DoutputDirectory=$SHARED_DIR/docker/lib
 
+  # move extensions into a separate extension folder
+  # For druid-s3-extensions
+  mkdir -p $SHARED_DIR/docker/extensions/druid-s3-extensions
 
 Review comment:
   I agree. It's very hard to separate out the extensions cleanly using `mvn -B dependency:copy-dependencies`.
   Moving away from `mvn -B dependency:copy-dependencies` to how it is done in normal packaging using the distribution is the right way in my opinion, but it is a much bigger/unrelated change than the purpose of this PR. For the purpose of this PR, as long as we don't load the extension jar which checks for configs/credentials to be set, we are good.



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393267066
 
 

 ##########
 File path: integration-tests/docker/druid.sh
 ##########
 @@ -73,4 +75,23 @@ setupConfig() {
       var=$(echo "$evar" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g')
       setKey $DRUID_SERVICE "$var" "$val"
   done
-}
\ No newline at end of file
+}
+
+setupData()
+{
+  # The "query" and "security" test groups require data to be set up before running the tests.
+  # In particular, they require segments to be downloaded from a pre-existing s3 bucket.
+  # This is done by using the loadSpec put into the metadata store and the s3 credentials set below.
+  if [ "$DRUID_INTEGRATION_TEST_GROUP" = "query" ] || [ "$DRUID_INTEGRATION_TEST_GROUP" = "security" ]; then
 
 Review comment:
   I think that's a good idea. Should definitely revisit this



[GitHub] [druid] clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393227043
 
 

 ##########
 File path: integration-tests/docker/druid.sh
 ##########
 @@ -73,4 +76,20 @@ setupConfig() {
       var=$(echo "$evar" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g')
       setKey $DRUID_SERVICE "$var" "$val"
   done
-}
\ No newline at end of file
+}
+
+setupData()
 
 Review comment:
   Ah, one more thing I forgot about, please update the docs in https://github.com/apache/druid/blob/master/integration-tests/README.md to include instructions for how to supply credentials and run these tests



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393220762
 
 

 ##########
 File path: integration-tests/src/test/java/org/apache/druid/tests/indexer/ITAzureParallelIndexTest.java
 ##########
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.indexer;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.indexer.partitions.DynamicPartitionsSpec;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.io.Closeable;
+import java.util.List;
+import java.util.UUID;
+import java.util.function.Function;
+
+/**
+ * IMPORTANT:
+ * To run this test, you must:
+ * 1) Set the variables {@link ITAzureParallelIndexTest#CONTAINER} and {@link ITAzureParallelIndexTest#PATH} for your data
+ * 2) Copy wikipedia_index_data1.json, wikipedia_index_data2.json, and wikipedia_index_data3.json
+ *    located in integration-tests/src/test/resources/data/batch_index to your Azure at the location set in step 1.
+ * 3) Provide -Doverride.config.path=<PATH_TO_FILE> with Azure credentials/configs set. See
+ *    integration-tests/docker/environment-configs/override-examples/azure for env vars to provide.
+ */
+@Test(groups = TestNGGroup.AZURE_DEEP_STORAGE)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITAzureParallelIndexTest extends AbstractITBatchIndexTest
+{
+  // START: Change this with the configs for your Azure data
+  private static final String CONTAINER = "my-container";
 
 Review comment:
   Changed these to configs



[GitHub] [druid] clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r392548762
 
 

 ##########
 File path: integration-tests/run_cluster.sh
 ##########
 @@ -47,12 +47,25 @@
   # Make directories if they dont exist
   mkdir -p $SHARED_DIR/logs
   mkdir -p $SHARED_DIR/tasklogs
+  mkdir -p $SHARED_DIR/docker/extensions
+  mkdir -p $SHARED_DIR/docker/credentials
 
   # install druid jars
   rm -rf $SHARED_DIR/docker
   cp -R docker $SHARED_DIR/docker
   mvn -B dependency:copy-dependencies -DoutputDirectory=$SHARED_DIR/docker/lib
 
+  # move extensions into a separate extension folder
+  # For druid-s3-extensions
+  mkdir -p $SHARED_DIR/docker/extensions/druid-s3-extensions
 
 Review comment:
   Hmm, this doesn't seem like a great way to handle this, and it doesn't really reflect how this will likely be run in practice, because the extension dependency jars will still all be on the classpath; only the extension jar itself will be in another folder. I think we will probably want to get the druid install similar to how it is in normal packaging to provide a more realistic test environment. That probably precludes using `mvn -B dependency:copy-dependencies`, or at least the way we are currently using it, so that all extensions are split out into the separate extensions folder and not on the classpath.
   
   I guess this could be fine for now though, and we could figure out how to do this in a future PR.



[GitHub] [druid] clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r392548269
 
 

 ##########
 File path: integration-tests/src/test/java/org/apache/druid/tests/indexer/ITAzureParallelIndexTest.java
 ##########
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.indexer;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.indexer.partitions.DynamicPartitionsSpec;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.io.Closeable;
+import java.util.List;
+import java.util.UUID;
+import java.util.function.Function;
+
+/**
+ * IMPORTANT:
+ * To run this test, you must:
+ * 1) Set the variables {@link ITAzureParallelIndexTest#CONTAINER} and {@link ITAzureParallelIndexTest#PATH} for your data
+ * 2) Copy wikipedia_index_data1.json, wikipedia_index_data2.json, and wikipedia_index_data3.json
+ *    located in integration-tests/src/test/resources/data/batch_index to your Azure at the location set in step 1.
+ * 3) Provide -Doverride.config.path=<PATH_TO_FILE> with Azure credentials/configs set. See
+ *    integration-tests/docker/environment-configs/override-examples/azure for env vars to provide.
+ */
+@Test(groups = TestNGGroup.AZURE_DEEP_STORAGE)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITAzureParallelIndexTest extends AbstractITBatchIndexTest
+{
+  // START: Change this with the configs for your Azure data
+  private static final String CONTAINER = "my-container";
 
 Review comment:
   Does this mean you need to edit this code to run the tests? Can this be a config so it can be passed in?



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393220762
 
 

 ##########
 File path: integration-tests/src/test/java/org/apache/druid/tests/indexer/ITAzureParallelIndexTest.java
 ##########
 @@ -0,0 +1,139 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.indexer;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.indexer.partitions.DynamicPartitionsSpec;
+import org.apache.druid.java.util.common.Pair;
+import org.apache.druid.java.util.common.StringUtils;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.testng.annotations.DataProvider;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.io.Closeable;
+import java.util.List;
+import java.util.UUID;
+import java.util.function.Function;
+
+/**
+ * IMPORTANT:
+ * To run this test, you must:
+ * 1) Set the variables {@link ITAzureParallelIndexTest#CONTAINER} and {@link ITAzureParallelIndexTest#PATH} for your data
+ * 2) Copy wikipedia_index_data1.json, wikipedia_index_data2.json, and wikipedia_index_data3.json
+ *    located in integration-tests/src/test/resources/data/batch_index to your Azure at the location set in step 1.
+ * 3) Provide -Doverride.config.path=<PATH_TO_FILE> with Azure credentials/configs set. See
+ *    integration-tests/docker/environment-configs/override-examples/azure for env vars to provide.
+ */
+@Test(groups = TestNGGroup.AZURE_DEEP_STORAGE)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITAzureParallelIndexTest extends AbstractITBatchIndexTest
+{
+  // START: Change this with the configs for your Azure data
+  private static final String CONTAINER = "my-container";
 
 Review comment:
   Will change these to configs



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393326716
 
 

 ##########
 File path: integration-tests/docker/druid.sh
 ##########
 @@ -73,4 +76,20 @@ setupConfig() {
       var=$(echo "$evar" | sed -e 's?^\([^=]*\)=.*?\1?g' -e 's?_?.?g')
       setKey $DRUID_SERVICE "$var" "$val"
   done
-}
\ No newline at end of file
+}
+
+setupData()
 
 Review comment:
   Done



[GitHub] [druid] maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests

Posted by GitBox <gi...@apache.org>.
maytasm3 commented on a change in pull request #9501: Adding s3, gcs, azure integration tests
URL: https://github.com/apache/druid/pull/9501#discussion_r393266331
 
 

 ##########
 File path: integration-tests/docker/environment-configs/override-examples/azure
 ##########
 @@ -0,0 +1,28 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+#
+# Example of override config file to provide.
+# Please replace <OVERRIDE_THIS> with your cloud configs/credentials
+#
+druid_storage_type=azure
+druid_azure_account=<OVERRIDE_THIS>
+druid_azure_key=<OVERRIDE_THIS>
+druid_azure_container=<OVERRIDE_THIS>
+druid_extensions_loadList=["druid-azure-extensions"]
 
 Review comment:
   The override-examples directory is meant to be a general guideline of what to override. You may override more or less depending on what test you are running/writing. You may combine multiple files from the override-examples directory if you are running a combination of extensions (like ingesting data from s3 into hdfs deep storage).
   Maybe we can revisit this later to find a more user-friendly and easier way.
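
Combining example files could be sketched as follows. This is an assumption-laden illustration and not tooling from the PR: the helper `combine_overrides` is hypothetical, the sample file contents are placeholders, and the thread does not say how a key defined in two files is resolved, so the sketch only reports such keys instead of merging them.

```shell
# Hypothetical helper: concatenate override files into one file that could be
# passed via -Doverride.config.path, and report keys defined more than once.
combine_overrides() {
  out="$1"
  shift
  # Drop comments and blank lines, keep only key=value lines.
  cat "$@" | grep -v '^#' | grep -v '^[[:space:]]*$' > "$out"
  # Keys that appear in more than one input need to be merged by hand.
  cut -d= -f1 "$out" | sort | uniq -d | while read -r key; do
    echo "duplicate key, merge manually: $key" >&2
  done
}

# Usage with placeholder files standing in for two override examples.
work=$(mktemp -d)
printf 'druid_storage_type=hdfs\ndruid_extensions_loadList=["druid-hdfs-storage"]\n' > "$work/hdfs"
printf 'druid_s3_accessKey=<OVERRIDE_THIS>\ndruid_extensions_loadList=["druid-s3-extensions"]\n' > "$work/s3"
combine_overrides "$work/combined" "$work/hdfs" "$work/s3"
cat "$work/combined"
rm -rf "$work"
```

In this usage `druid_extensions_loadList` appears in both inputs, so the helper flags it; the combined file would need a single load list naming both extensions.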
