You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/05/24 13:49:00 UTC
[jira] [Work logged] (AVRO-3404) Extend the IDL syntax to serve as a .avsc equivalent as well
[ https://issues.apache.org/jira/browse/AVRO-3404?focusedWorklogId=774013&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774013 ]
ASF GitHub Bot logged work on AVRO-3404:
----------------------------------------
Author: ASF GitHub Bot
Created on: 24/May/22 13:48
Start Date: 24/May/22 13:48
Worklog Time Spent: 10m
Work Description: Christophe-LeSaec commented on code in PR #1589:
URL: https://github.com/apache/avro/pull/1589#discussion_r880498764
##########
lang/java/idl/src/main/java/org/apache/avro/idl/SchemaVisitorAction.java:
##########
@@ -0,0 +1,40 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.avro.idl;
+
+public enum SchemaVisitorAction {
Review Comment:
if inspired from java.nio.file.FileVisitResult for walkFileTree(Path start, FileVisitor<? super Path> visitor), it could be renamed in SchemaVisitorResult ?
##########
lang/java/idl/src/main/java/org/apache/avro/idl/Schemas.java:
##########
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * https://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.avro.idl;
+
+import org.apache.avro.Schema;
+import org.apache.avro.Schema.Field;
+
+import java.util.ArrayDeque;
+import java.util.Collections;
+import java.util.Deque;
+import java.util.IdentityHashMap;
+import java.util.function.Supplier;
+import java.util.stream.Collectors;
+
+/**
+ * Avro Schema utilities, to traverse...
+ */
+public final class Schemas {
+
+ private Schemas() {
+ }
+
+ /**
+ * Depth first visit.
+ */
+ public static <T> T visit(final Schema start, final SchemaVisitor<T> visitor) {
Review Comment:
Question of taste, i prefer short static method and separation in classes.
To do so with this code, i can propose this (look very similar, but cut in smaller pieces, sorry for long comment)
```java
public final class Schemas {
private Schemas() {}
private static abstract class InternalSchemaActions {
abstract boolean visitNonTerminal(final Schema schema, final Iterable<Schema> itSupp);
abstract boolean visitTerminal(final Schema schema);
}
static abstract class VisitAction { // to have a typed queue
abstract boolean onVisitor(InternalSchemaActions actions);
}
static class SchemaVisitAction extends VisitAction {
private final Schema schema;
private final IdentityHashMap<Schema, Schema> visited;
public SchemaVisitAction(Schema schema, IdentityHashMap<Schema, Schema> visited) {
this.schema = schema;
this.visited = visited;
}
@Override
boolean onVisitor(InternalSchemaActions actions) {
final boolean terminate;
if (visited.containsKey(schema)) {
terminate = actions.visitTerminal(schema);
} else {
Schema.Type type = schema.getType();
switch (type) {
case ARRAY:
terminate = actions.visitNonTerminal(schema, Collections.singleton(schema.getElementType()));
visited.put(schema, schema);
break;
case RECORD:
terminate = actions.visitNonTerminal(schema, () -> schema.getFields().stream().map(Field::schema)
.collect(Collectors.toCollection(ArrayDeque::new)).descendingIterator());
visited.put(schema, schema);
break;
case UNION:
terminate = actions.visitNonTerminal(schema, schema.getTypes());
visited.put(schema, schema);
break;
case MAP:
terminate = actions.visitNonTerminal(schema, Collections.singleton(schema.getValueType()));
visited.put(schema, schema);
break;
default:
terminate = actions.visitTerminal(schema);
break;
}
}
return terminate;
}
}
static class MethodVisitAction extends VisitAction {
private final Supplier<SchemaVisitorAction> actionGetter;
private final Deque<VisitAction> dq;
public MethodVisitAction(Supplier<SchemaVisitorAction> actionGetter, Deque<VisitAction> dq) {
this.actionGetter = actionGetter;
this.dq = dq;
}
@Override
boolean onVisitor(InternalSchemaActions actions) {
boolean ret = false;
final SchemaVisitorAction action = actionGetter.get();
switch (action) {
case CONTINUE:
break;
case SKIP_SUBTREE:
throw new UnsupportedOperationException();
case SKIP_SIBLINGS:
while (dq.peek() instanceof SchemaVisitAction) {
dq.remove();
}
break;
case TERMINATE:
ret = true;
break;
default:
throw new UnsupportedOperationException("Invalid action " + action);
}
return ret;
}
}
static class Visit<T> {
private final SchemaVisitor<T> visitor;
private final IdentityHashMap<Schema, Schema> visited = new IdentityHashMap<>();
private final Deque<VisitAction> dq = new ArrayDeque<>();
public Visit(SchemaVisitor<T> visitor) {
this.visitor = visitor;
}
private final InternalSchemaActions actions = new InternalSchemaActions() {
@Override
public boolean visitNonTerminal(Schema schema, Iterable<Schema> itSupp) {
return Visit.this.visitNonTerminal(schema, itSupp);
}
@Override
public boolean visitTerminal(Schema schema) {
return Visit.this.visitTerminal(schema);
}
};
public T visit(final Schema start) {
// Set of Visited Schemas
// Stack that contains the Schemas to process and afterVisitNonTerminal
// functions.
// Deque<Either<Schema, Supplier<SchemaVisitorAction>>>
// Using Either<...> has a cost we want to avoid...
VisitAction current = new SchemaVisitAction(start, this.visited);
boolean terminate = false;
while (current != null && !terminate) {
terminate = current.onVisitor(this.actions);
current = dq.poll();
}
return visitor.get();
}
private boolean visitTerminal(final Schema schema) {
SchemaVisitorAction action = visitor.visitTerminal(schema);
switch (action) {
case CONTINUE:
break;
case SKIP_SIBLINGS:
while (dq.peek() instanceof SchemaVisitAction) {
dq.remove();
}
break;
case TERMINATE:
return true;
case SKIP_SUBTREE:
default:
throw new UnsupportedOperationException("Invalid action " + action + " for " + schema);
}
return false;
}
private boolean visitNonTerminal(final Schema schema, final Iterable<Schema> itSupp) {
SchemaVisitorAction action = visitor.visitNonTerminal(schema);
switch (action) {
case CONTINUE:
dq.push(new MethodVisitAction(() -> visitor.afterVisitNonTerminal(schema), dq));
itSupp.forEach((Schema supp) -> dq.push(new SchemaVisitAction(supp, this.visited)));
break;
case SKIP_SUBTREE:
dq.push(new MethodVisitAction(() -> visitor.afterVisitNonTerminal(schema), dq));
break;
case SKIP_SIBLINGS:
while (dq.peek() instanceof SchemaVisitAction) {
dq.remove();
}
break;
case TERMINATE:
return true;
default:
throw new UnsupportedOperationException("Invalid action " + action + " for " + schema);
}
return false;
}
}
/**
* Depth first visit.
*/
public static <T> T visit(final Schema start, final SchemaVisitor<T> visitor) {
// Set of Visited Schemas
Visit<T> visit = new Visit<>(visitor);
return visit.visit(start);
}
}
```
Issue Time Tracking
-------------------
Worklog Id: (was: 774013)
Remaining Estimate: 0h
Time Spent: 10m
> Extend the IDL syntax to serve as a .avsc equivalent as well
> ------------------------------------------------------------
>
> Key: AVRO-3404
> URL: https://issues.apache.org/jira/browse/AVRO-3404
> Project: Apache Avro
> Issue Type: Improvement
> Components: build, dependencies, java
> Reporter: Oscar Westra van Holthe - Kind
> Assignee: Oscar Westra van Holthe - Kind
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> The Avro IDL format is not widely adopted, as it yields a protocol (most people use schemas only). To remedy this, I propose to extend the IDL syntax.
> Future IDL files will have two options:
> # Use the existing syntax to define a protocol
> Parsing such a file yields a protocol and a collection of named schemas (that can also be found in the protocol object)
> # Use the syntax below to define a schema
> Parsing such a file yields a collection of named schemas and optionally a "main schema", which can be a named or an anonymous schema.
> Proposed syntax:
> {code:none}
> // Optional: the default namespace for the whole file (defaults to the null namespace if omitted)
> // (in the existing syntax, the namespace of the protocol has the same function)
> namespace my.name.space;
> // Optional: a main schema to parse the file as a .avsc equivalent
> schema array<Message>;
> // Below here are imports and named schemas in any order
> /* Import rules for the new syntax:
> - the namespace of the imported named types is determined by the imported file
> (same as for the existing syntax)
> - when importing idl or protocol files, only the named schemas are imported
> (as this syntax supports only these)
> */
> import idl "common.avdl";
> /**
> * A message to communicate.
> */
> record Message {
> array<Header> headers;
> string? title = null; // Idea: add null as default value, unless there's an explicit default value
> string message;
> timestamp_ms sendTime;
> }
> {code}
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)