You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/05/24 13:49:00 UTC

[jira] [Work logged] (AVRO-3404) Extend the IDL syntax to serve as a .avsc equivalent as well

     [ https://issues.apache.org/jira/browse/AVRO-3404?focusedWorklogId=774013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774013 ]

ASF GitHub Bot logged work on AVRO-3404:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/May/22 13:48
            Start Date: 24/May/22 13:48
    Worklog Time Spent: 10m 
      Work Description: Christophe-LeSaec commented on code in PR #1589:
URL: https://github.com/apache/avro/pull/1589#discussion_r880498764


##########
lang/java/idl/src/main/java/org/apache/avro/idl/SchemaVisitorAction.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.avro.idl;
+
+public enum SchemaVisitorAction {

Review Comment:
   if inspired from java.nio.file.FileVisitResult for walkFileTree(Path start, FileVisitor<? super Path> visitor), it could be renamed in SchemaVisitorResult ?



##########
lang/java/idl/src/main/java/org/apache/avro/idl/Schemas.java:
##########
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.avro.idl;
+
+import org.apache.avro.Schema;
+import org.apache.avro.Schema.Field;
+
+import java.util.ArrayDeque;
+import java.util.Collections;
+import java.util.Deque;
+import java.util.IdentityHashMap;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+/**
+ * Avro Schema utilities, to traverse...
+ */
+public final class Schemas {
+
+  private Schemas() {
+  }
+
+  /**
+   * Depth first visit.
+   */
+  public static <T> T visit(final Schema start, final SchemaVisitor<T> visitor) {

Review Comment:
   Question of taste, i prefer short static method and separation in classes.
   To do so with this code, i can propose this (look very similar, but cut in smaller pieces, sorry for long comment) 
   ```java
   public final class Schemas {
   
     private Schemas() {}
   
     private static abstract class InternalSchemaActions {
       abstract boolean visitNonTerminal(final Schema schema, final Iterable<Schema> itSupp);
       abstract boolean visitTerminal(final Schema schema);
     }
   
     static abstract class VisitAction { // to have a typed queue
       abstract boolean onVisitor(InternalSchemaActions actions);
     }
   
     static class SchemaVisitAction extends VisitAction {
   
       private final Schema schema;
   
       private final IdentityHashMap<Schema, Schema> visited;
   
       public SchemaVisitAction(Schema schema, IdentityHashMap<Schema, Schema> visited) {
         this.schema = schema;
         this.visited = visited;
       }
   
       @Override
       boolean onVisitor(InternalSchemaActions actions) {
         final boolean terminate;
         if (visited.containsKey(schema)) {
           terminate = actions.visitTerminal(schema);
         } else {
           Schema.Type type = schema.getType();
           switch (type) {
           case ARRAY:
             terminate = actions.visitNonTerminal(schema, Collections.singleton(schema.getElementType()));
             visited.put(schema, schema);
             break;
           case RECORD:
             terminate = actions.visitNonTerminal(schema, () -> schema.getFields().stream().map(Field::schema)
                 .collect(Collectors.toCollection(ArrayDeque::new)).descendingIterator());
             visited.put(schema, schema);
             break;
           case UNION:
             terminate = actions.visitNonTerminal(schema, schema.getTypes());
             visited.put(schema, schema);
             break;
           case MAP:
             terminate = actions.visitNonTerminal(schema, Collections.singleton(schema.getValueType()));
             visited.put(schema, schema);
             break;
           default:
             terminate = actions.visitTerminal(schema);
             break;
           }
         }
         return terminate;
       }
     }
   
     static class MethodVisitAction extends VisitAction {
   
       private final Supplier<SchemaVisitorAction> actionGetter;
   
       private final Deque<VisitAction> dq;
   
       public MethodVisitAction(Supplier<SchemaVisitorAction> actionGetter, Deque<VisitAction> dq) {
         this.actionGetter = actionGetter;
         this.dq = dq;
       }
   
       @Override
       boolean onVisitor(InternalSchemaActions actions) {
         boolean ret = false;
         final SchemaVisitorAction action = actionGetter.get();
         switch (action) {
         case CONTINUE:
           break;
         case SKIP_SUBTREE:
           throw new UnsupportedOperationException();
         case SKIP_SIBLINGS:
           while (dq.peek() instanceof SchemaVisitAction) {
             dq.remove();
           }
           break;
         case TERMINATE:
           ret = true;
           break;
         default:
           throw new UnsupportedOperationException("Invalid action " + action);
         }
         return ret;
       }
     }
   
     static class Visit<T> {
       private final SchemaVisitor<T> visitor;
   
       private final IdentityHashMap<Schema, Schema> visited = new IdentityHashMap<>();
   
       private final Deque<VisitAction> dq = new ArrayDeque<>();
   
       public Visit(SchemaVisitor<T> visitor) {
         this.visitor = visitor;
       }
   
       private final InternalSchemaActions actions = new InternalSchemaActions() {
         @Override
         public boolean visitNonTerminal(Schema schema, Iterable<Schema> itSupp) {
           return Visit.this.visitNonTerminal(schema, itSupp);
         }
   
         @Override
         public boolean visitTerminal(Schema schema) {
           return Visit.this.visitTerminal(schema);
         }
       };
   
       public T visit(final Schema start) {
         // Set of Visited Schemas
   
         // Stack that contains the Schemas to process and afterVisitNonTerminal
         // functions.
         // Deque<Either<Schema, Supplier<SchemaVisitorAction>>>
         // Using Either<...> has a cost we want to avoid...
         VisitAction current = new SchemaVisitAction(start, this.visited);
         boolean terminate = false;
         while (current != null && !terminate) {
           terminate = current.onVisitor(this.actions);
           current = dq.poll();
         }
         return visitor.get();
       }
   
       private boolean visitTerminal(final Schema schema) {
         SchemaVisitorAction action = visitor.visitTerminal(schema);
         switch (action) {
         case CONTINUE:
           break;
         case SKIP_SIBLINGS:
           while (dq.peek() instanceof SchemaVisitAction) {
             dq.remove();
           }
           break;
         case TERMINATE:
           return true;
         case SKIP_SUBTREE:
         default:
           throw new UnsupportedOperationException("Invalid action " + action + " for " + schema);
         }
         return false;
       }
   
       private boolean visitNonTerminal(final Schema schema, final Iterable<Schema> itSupp) {
         SchemaVisitorAction action = visitor.visitNonTerminal(schema);
         switch (action) {
         case CONTINUE:
           dq.push(new MethodVisitAction(() -> visitor.afterVisitNonTerminal(schema), dq));
           itSupp.forEach((Schema supp) -> dq.push(new SchemaVisitAction(supp, this.visited)));
           break;
         case SKIP_SUBTREE:
           dq.push(new MethodVisitAction(() -> visitor.afterVisitNonTerminal(schema), dq));
           break;
         case SKIP_SIBLINGS:
           while (dq.peek() instanceof SchemaVisitAction) {
             dq.remove();
           }
           break;
         case TERMINATE:
           return true;
         default:
           throw new UnsupportedOperationException("Invalid action " + action + " for " + schema);
         }
         return false;
       }
     }
   
     /**
      * Depth first visit.
      */
     public static <T> T visit(final Schema start, final SchemaVisitor<T> visitor) {
       // Set of Visited Schemas
       Visit<T> visit = new Visit<>(visitor);
       return visit.visit(start);
     }
   }
   ```





Issue Time Tracking
-------------------

            Worklog Id:     (was: 774013)
    Remaining Estimate: 0h
            Time Spent: 10m

> Extend the IDL syntax to serve as a .avsc equivalent as well
> ------------------------------------------------------------
>
>                 Key: AVRO-3404
>                 URL: https://issues.apache.org/jira/browse/AVRO-3404
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: build, dependencies, java
>            Reporter: Oscar Westra van Holthe - Kind
>            Assignee: Oscar Westra van Holthe - Kind
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Avro IDL format is not widely adopted, as it yields a protocol (most people use schemas only). To remedy this, I propose to extend the IDL syntax.
> Future IDL files will have two options:
>  # Use the existing syntax to define a protocol
> Parsing such a file yields a protocol and a collection of named schemas (that can also be found in the protocol object)
>  # Use the syntax below to define a schema
> Parsing such a file yields a collection of named schemas and optionally a "main schema", which can be a named or an anonymous schema.
> Proposed syntax:
> {code:none}
> // Optional: the default namespace for the whole file (defaults to the null namespace if omitted)
> // (in the existing syntax, the namespace of the protocol has the same function)
> namespace my.name.space;
> // Optional: a main schema to parse the file as a .avsc equivalent
> schema array<Message>;
> // Below here are imports and named schemas in any order
> /* Import rules for the new syntax:
> - the namespace of the imported named types is determined by the imported file
>   (same as for the existing syntax)
> - when importing idl or protocol files, only the named schemas are imported
>   (as this syntax supports only these)
> */
> import idl "common.avdl";
> /**
>  * A message to communicate.
>  */
> record Message {
>   array<Header> headers;
>   string? title = null; // Idea: add null as default value, unless there's an explicit default value
>   string message;
>   timestamp_ms sendTime;
> }
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)