Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/06/27 02:00:01 UTC

[GitHub] [druid] maytasm opened a new pull request #10088: Add integration tests for Avro OCF InputFormat

maytasm opened a new pull request #10088:
URL: https://github.com/apache/druid/pull/10088


   Add integration tests for Avro OCF InputFormat
   
   ### Description
   
   Add integration tests for Avro OCF InputFormat introduced in https://github.com/apache/druid/pull/9671
   
   This PR has:
   - [x] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
   - [x] added integration tests.
   - [ ] been tested in a test Druid cluster.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] josephglanville commented on a change in pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
josephglanville commented on a change in pull request #10088:
URL: https://github.com/apache/druid/pull/10088#discussion_r446524611



##########
File path: integration-tests/src/test/java/org/apache/druid/tests/parallelized/ITLocalInputSourceAllInputFormatTest.java
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.parallelized;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.apache.druid.tests.indexer.AbstractLocalInputSourceParallelIndexTest;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.util.List;
+import java.util.Map;
+
+@Test(groups = TestNGGroup.BATCH_INDEX)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITLocalInputSourceAllInputFormatTest extends AbstractLocalInputSourceParallelIndexTest
+{
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithoutSchema() throws Exception
+  {
+    List fieldList = ImmutableList.of(
+        ImmutableMap.of("name", "timestamp", "type", "string"),
+        ImmutableMap.of("name", "page", "type", "string"),
+        ImmutableMap.of("name", "language", "type", "string"),
+        ImmutableMap.of("name", "user", "type", "string"),
+        ImmutableMap.of("name", "unpatrolled", "type", "string"),
+        ImmutableMap.of("name", "newPage", "type", "string"),
+        ImmutableMap.of("name", "robot", "type", "string"),
+        ImmutableMap.of("name", "anonymous", "type", "string"),
+        ImmutableMap.of("name", "namespace", "type", "string"),
+        ImmutableMap.of("name", "continent", "type", "string"),
+        ImmutableMap.of("name", "country", "type", "string"),
+        ImmutableMap.of("name", "region", "type", "string"),
+        ImmutableMap.of("name", "city", "type", "string"),
+        ImmutableMap.of("name", "added", "type", "int"),
+        ImmutableMap.of("name", "deleted", "type", "int"),
+        ImmutableMap.of("name", "delta", "type", "int")
+    );
+    Map schema = ImmutableMap.of("namespace", "org.apache.druid.data.input",
+                                 "type", "record",
+                                 "name", "wikipedia",
+                                 "fields", fieldList);
+    doIndexTest(InputFormatDetails.AVRO, ImmutableMap.of("schema", schema));
+  }
+
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithSchema() throws Exception

Review comment:
       Are these named the wrong way around? This one appears to run without a schema, whilst the one above supplies one.






[GitHub] [druid] maytasm commented on a change in pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #10088:
URL: https://github.com/apache/druid/pull/10088#discussion_r447209138



##########
File path: integration-tests/src/test/java/org/apache/druid/tests/parallelized/ITLocalInputSourceAllInputFormatTest.java
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.parallelized;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.apache.druid.tests.indexer.AbstractLocalInputSourceParallelIndexTest;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.util.List;
+import java.util.Map;
+
+@Test(groups = TestNGGroup.INPUT_FORMAT)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITLocalInputSourceAllInputFormatTest extends AbstractLocalInputSourceParallelIndexTest
+{
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithSchema() throws Exception
+  {
+    List fieldList = ImmutableList.of(
+        ImmutableMap.of("name", "timestamp", "type", "string"),
+        ImmutableMap.of("name", "page", "type", "string"),
+        ImmutableMap.of("name", "language", "type", "string"),
+        ImmutableMap.of("name", "user", "type", "string"),
+        ImmutableMap.of("name", "unpatrolled", "type", "string"),
+        ImmutableMap.of("name", "newPage", "type", "string"),
+        ImmutableMap.of("name", "robot", "type", "string"),
+        ImmutableMap.of("name", "anonymous", "type", "string"),
+        ImmutableMap.of("name", "namespace", "type", "string"),
+        ImmutableMap.of("name", "continent", "type", "string"),
+        ImmutableMap.of("name", "country", "type", "string"),
+        ImmutableMap.of("name", "region", "type", "string"),
+        ImmutableMap.of("name", "city", "type", "string"),
+        ImmutableMap.of("name", "added", "type", "int"),
+        ImmutableMap.of("name", "deleted", "type", "int"),
+        ImmutableMap.of("name", "delta", "type", "int")
+    );
+    Map schema = ImmutableMap.of("namespace", "org.apache.druid.data.input",
+                                 "type", "record",
+                                 "name", "wikipedia",
+                                 "fields", fieldList);
+    doIndexTest(InputFormatDetails.AVRO, ImmutableMap.of("schema", schema));
+  }
+
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithoutSchema() throws Exception
+  {
+    doIndexTest(InputFormatDetails.AVRO);

Review comment:
       This should already be running in parallel (2 at a time). Let me double check.
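
To make "2 at a time" concrete, here is a minimal plain-JDK sketch of bounded parallelism. It is not how the Druid integration tests are wired: in TestNG the thread count is normally a suite-level setting, and `BoundedParallelRuns`, `runOne`, `runAll`, and the format names are hypothetical stand-ins chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BoundedParallelRuns {
    // Hypothetical stand-in for one ingestion test; it only returns a label.
    static String runOne(String format) {
        return "indexed:" + format;
    }

    // Submits one task per format to a pool with `parallelism` threads,
    // so at most that many tasks execute at the same time.
    // Results are collected in the same order as the input list.
    static List<String> runAll(List<String> formats, int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String format : formats) {
                Callable<String> task = () -> runOne(format);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                // Fail instead of hanging forever if a task never finishes.
                results.add(future.get(1, TimeUnit.MINUTES));
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Illustrative format names only; pool size 2 mirrors "2 at a time".
        System.out.println(runAll(Arrays.asList("avro_ocf", "csv", "json"), 2));
    }
}
```

With a pool of two threads, the third task waits until one of the first two finishes, which is the behaviour the comment describes.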






[GitHub] [druid] maytasm commented on pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
maytasm commented on pull request #10088:
URL: https://github.com/apache/druid/pull/10088#issuecomment-650627700


   > Not sure if you want to bother adding a schema evolution test, but that is probably the only behaviour not covered by these; it's already covered by the Avro unit tests, though.
   
   I am only adding basic happy-path cases, intended as smoke tests, to make sure we have coverage for all input formats. Integration tests take more time (~3 mins each), so I don't think we have to cover every config of each input format.




[GitHub] [druid] maytasm commented on a change in pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #10088:
URL: https://github.com/apache/druid/pull/10088#discussion_r447207710



##########
File path: extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/quantiles/DoublesSketchComplexMetricSerde.java
##########
@@ -77,7 +78,7 @@ public Object extractValue(final InputRow inputRow, final String metricName)
           // This corresponds to "A" in base64, so it is not a digit
           if (objectString.isEmpty()) {
             return DoublesSketchOperations.EMPTY_SKETCH;
-          } else if (Character.isDigit(objectString.charAt(0))) {
+          } else if (NumberUtils.isParsable(objectString)) {

Review comment:
       Sounds good to me.






[GitHub] [druid] clintropolis commented on a change in pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #10088:
URL: https://github.com/apache/druid/pull/10088#discussion_r446727236



##########
File path: extensions-core/datasketches/src/main/java/org/apache/druid/query/aggregation/datasketches/quantiles/DoublesSketchComplexMetricSerde.java
##########
@@ -77,7 +78,7 @@ public Object extractValue(final InputRow inputRow, final String metricName)
           // This corresponds to "A" in base64, so it is not a digit
           if (objectString.isEmpty()) {
             return DoublesSketchOperations.EMPTY_SKETCH;
-          } else if (Character.isDigit(objectString.charAt(0))) {
+          } else if (NumberUtils.isParsable(objectString)) {

Review comment:
       Since `NumberUtils.isParsable` has to look at every character of the string to check whether it's a number, I wonder if it would be better to just try parsing it to a double and use the result if it's not null, maybe:
   ```java
   ...
             final Double doubleValue;
             if (objectString.isEmpty()) {
               return DoublesSketchOperations.EMPTY_SKETCH;
             } else if ((doubleValue = Doubles.tryParse(objectString)) != null) {
               UpdateDoublesSketch sketch = DoublesSketch.builder().setK(MIN_K).build();
               sketch.update(doubleValue);
               return sketch;
             }
   ...
   ```
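
The difference between checking only the first character and parsing the whole string once can be sketched with the JDK alone. This is an illustration under stated assumptions, not the code in `DoublesSketchComplexMetricSerde`: `firstCharIsDigit` and `tryParseDouble` are hypothetical helpers, `tryParseDouble` is a stand-in for Guava's `Doubles.tryParse` built on `Double.parseDouble` (whose accepted syntax may differ slightly from Guava's, for example around surrounding whitespace), and the sample strings are arbitrary.

```java
import java.util.Optional;

public class NumericStringCheck {
    // Hypothetical helper mirroring the old check: inspect only the first character.
    static boolean firstCharIsDigit(String s) {
        return !s.isEmpty() && Character.isDigit(s.charAt(0));
    }

    // Hypothetical stand-in for Guava's Doubles.tryParse:
    // parse once and return an empty Optional instead of throwing.
    static Optional<Double> tryParseDouble(String s) {
        try {
            return Optional.of(Double.parseDouble(s));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // "AQMIAA==" is just an arbitrary base64-looking string, not a real sketch.
        String[] samples = {"42", "-1.5", "3abc", "AQMIAA=="};
        for (String s : samples) {
            System.out.println(
                s + " firstCharIsDigit=" + firstCharIsDigit(s)
                  + " parsed=" + tryParseDouble(s)
            );
        }
    }
}
```

The first-character check rejects a negative number such as `-1.5` and accepts a non-number such as `3abc`, while the parse-once approach classifies both correctly and hands back the value without a second pass over the string.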






[GitHub] [druid] maytasm commented on a change in pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
maytasm commented on a change in pull request #10088:
URL: https://github.com/apache/druid/pull/10088#discussion_r446566989



##########
File path: integration-tests/src/test/java/org/apache/druid/tests/parallelized/ITLocalInputSourceAllInputFormatTest.java
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.parallelized;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.apache.druid.tests.indexer.AbstractLocalInputSourceParallelIndexTest;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.util.List;
+import java.util.Map;
+
+@Test(groups = TestNGGroup.BATCH_INDEX)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITLocalInputSourceAllInputFormatTest extends AbstractLocalInputSourceParallelIndexTest
+{
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithoutSchema() throws Exception
+  {
+    List fieldList = ImmutableList.of(
+        ImmutableMap.of("name", "timestamp", "type", "string"),
+        ImmutableMap.of("name", "page", "type", "string"),
+        ImmutableMap.of("name", "language", "type", "string"),
+        ImmutableMap.of("name", "user", "type", "string"),
+        ImmutableMap.of("name", "unpatrolled", "type", "string"),
+        ImmutableMap.of("name", "newPage", "type", "string"),
+        ImmutableMap.of("name", "robot", "type", "string"),
+        ImmutableMap.of("name", "anonymous", "type", "string"),
+        ImmutableMap.of("name", "namespace", "type", "string"),
+        ImmutableMap.of("name", "continent", "type", "string"),
+        ImmutableMap.of("name", "country", "type", "string"),
+        ImmutableMap.of("name", "region", "type", "string"),
+        ImmutableMap.of("name", "city", "type", "string"),
+        ImmutableMap.of("name", "added", "type", "int"),
+        ImmutableMap.of("name", "deleted", "type", "int"),
+        ImmutableMap.of("name", "delta", "type", "int")
+    );
+    Map schema = ImmutableMap.of("namespace", "org.apache.druid.data.input",
+                                 "type", "record",
+                                 "name", "wikipedia",
+                                 "fields", fieldList);
+    doIndexTest(InputFormatDetails.AVRO, ImmutableMap.of("schema", schema));
+  }
+
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithSchema() throws Exception

Review comment:
       Yea, you are right. Fixed.






[GitHub] [druid] clintropolis commented on a change in pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #10088:
URL: https://github.com/apache/druid/pull/10088#discussion_r446731370



##########
File path: integration-tests/src/test/java/org/apache/druid/tests/parallelized/ITLocalInputSourceAllInputFormatTest.java
##########
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.parallelized;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.apache.druid.tests.indexer.AbstractLocalInputSourceParallelIndexTest;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.util.List;
+import java.util.Map;
+
+@Test(groups = TestNGGroup.INPUT_FORMAT)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITLocalInputSourceAllInputFormatTest extends AbstractLocalInputSourceParallelIndexTest
+{
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithSchema() throws Exception
+  {
+    List fieldList = ImmutableList.of(
+        ImmutableMap.of("name", "timestamp", "type", "string"),
+        ImmutableMap.of("name", "page", "type", "string"),
+        ImmutableMap.of("name", "language", "type", "string"),
+        ImmutableMap.of("name", "user", "type", "string"),
+        ImmutableMap.of("name", "unpatrolled", "type", "string"),
+        ImmutableMap.of("name", "newPage", "type", "string"),
+        ImmutableMap.of("name", "robot", "type", "string"),
+        ImmutableMap.of("name", "anonymous", "type", "string"),
+        ImmutableMap.of("name", "namespace", "type", "string"),
+        ImmutableMap.of("name", "continent", "type", "string"),
+        ImmutableMap.of("name", "country", "type", "string"),
+        ImmutableMap.of("name", "region", "type", "string"),
+        ImmutableMap.of("name", "city", "type", "string"),
+        ImmutableMap.of("name", "added", "type", "int"),
+        ImmutableMap.of("name", "deleted", "type", "int"),
+        ImmutableMap.of("name", "delta", "type", "int")
+    );
+    Map schema = ImmutableMap.of("namespace", "org.apache.druid.data.input",
+                                 "type", "record",
+                                 "name", "wikipedia",
+                                 "fields", fieldList);
+    doIndexTest(InputFormatDetails.AVRO, ImmutableMap.of("schema", schema));
+  }
+
+  @Test
+  public void testAvroInputFormatIndexDataIngestionSpecWithoutSchema() throws Exception
+  {
+    doIndexTest(InputFormatDetails.AVRO);

Review comment:
       I wonder if these could all run in parallel in a single test?






[GitHub] [druid] maytasm merged pull request #10088: Add integration tests for all InputFormat

Posted by GitBox <gi...@apache.org>.
maytasm merged pull request #10088:
URL: https://github.com/apache/druid/pull/10088


   

