Posted to dev@avro.apache.org by "Rumeshkrishnan (JIRA)" <ji...@apache.org> on 2019/02/04 21:21:00 UTC
[jira] [Commented] (AVRO-2299) Get Plain Schema
[ https://issues.apache.org/jira/browse/AVRO-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16760229#comment-16760229 ]
Rumeshkrishnan commented on AVRO-2299:
--------------------------------------
I went through all the Avro types and found that each of them needs changes. I have made the code changes in a new file; it would be helpful if we can find a way to reuse the existing functionality and combine this with SchemaNormalization.java. I will try to come up with test cases for this SchemaCanonicalizer. The rules are as follows:
* The canonical normaliser should filter and order the reserved as well as the user-given properties.
* The user is able to normalise a schema with additional user-defined logical types.
* name and namespace are separate keys in the canonical normaliser; they should not be merged into a single `name` property for RECORD, FIXED and ENUM types.
* Reserved Avro properties are ordered as below, followed by the user-given properties.
{code:java}
"name", "namespace", "type", "fields", "symbols", "items", "values", "logicalType", "size", "order", "doc", "aliases", "default"{code}
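As a rough illustration of the filter-and-order rule (not the SchemaCanonicalizer code itself), the sketch below orders a set of property keys: reserved keys come first in the order listed above, then the user-given keys in the order the caller supplied them, and everything else is dropped. The class name `PropertyOrderSketch` and the method `orderKeys` are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PropertyOrderSketch {
    // Reserved Avro property names in the order listed above.
    static final List<String> RESERVED = Arrays.asList(
        "name", "namespace", "type", "fields", "symbols", "items", "values",
        "logicalType", "size", "order", "doc", "aliases", "default");

    // Returns the keys of `present` that should be emitted: reserved keys first
    // (in RESERVED order), then the user-given keys (in the order supplied).
    // Keys that are neither reserved nor user-given are dropped.
    static List<String> orderKeys(Set<String> present, List<String> userKeys) {
        Set<String> ordered = new LinkedHashSet<>();
        for (String key : RESERVED) {
            if (present.contains(key)) ordered.add(key);
        }
        for (String key : userKeys) {
            if (present.contains(key)) ordered.add(key);
        }
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        Set<String> present = new LinkedHashSet<>(Arrays.asList(
            "user_schema_prop", "doc", "type", "other_prop", "name"));
        // prints [name, type, doc, user_schema_prop]
        System.out.println(orderKeys(present, Arrays.asList("user_schema_prop")));
    }
}
```

Here `other_prop` is dropped because it is neither reserved nor requested by the caller, while `user_schema_prop` is kept and placed after the reserved keys.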
*Current SchemaCanonicalizer.java implementation:*
{code:java}
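import org.apache.avro.LogicalType;
import org.apache.avro.LogicalTypes;
import org.apache.avro.Schema;
import org.apache.avro.util.internal.JacksonUtils;
import java.io.IOException;
import java.util.*;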
/**
* Collection of static methods for generating the canonical form of
* schemas with reserved properties (see {@link #toCanonicalForm}).
*/
public class SchemaCanonicalizer {
private static final LinkedHashSet<String> RESERVED_PROPERTIES = new LinkedHashSet<>();
private static final LinkedHashSet<LogicalTypes> ADDITIONAL_LOGICAL_TYPES = new LinkedHashSet<>();
private SchemaCanonicalizer() {
}
private SchemaCanonicalizer(LogicalTypes... lts) {
ADDITIONAL_LOGICAL_TYPES.addAll(Arrays.asList(lts));
}
static {
Collections.addAll(RESERVED_PROPERTIES,
"name", "namespace", "type", "fields", "symbols", "items", "values",
"logicalType", "size", "order", "doc", "aliases", "default");
}
public static String toCanonicalForm(Schema s) {
try {
return build(s, new StringBuilder()).toString();
} catch (IOException e) {
// Shouldn't happen, b/c StringBuilder can't throw IOException
throw new RuntimeException(e);
}
}
public static String toCanonicalForm(Schema s, LinkedHashSet<String> properties) {
try {
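// NOTE: this adds to the shared static set, so the extra properties stay registered for later calls
RESERVED_PROPERTIES.addAll(properties);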
return build(s, new StringBuilder()).toString();
} catch (IOException e) {
// Shouldn't happen, b/c StringBuilder can't throw IOException
throw new RuntimeException(e);
}
}
private static Appendable build(Schema s, Appendable o) throws IOException {
Schema.Type st = s.getType();
LogicalType lt = null;
if (ADDITIONAL_LOGICAL_TYPES.isEmpty()) {
lt = s.getLogicalType();
} else {
lt = getLogicalType(s);
}
if (lt == null) {
switch (st) {
default: // boolean, bytes, double, float, int, long, null, string
return o.append('"').append(st.getName()).append('"');
// each case returns so that it does not fall through into the next writer
case UNION:
return writeUnionType(s, o);
case ARRAY:
return writeArrayType(s, o);
case MAP:
return writeMapType(s, o);
case ENUM:
return writeEnumType(s, o);
case FIXED:
return writeFixedType(s, o);
case RECORD:
return writeRecordType(s, o);
}
} else {
writeLogicalType(s, lt, o);
}
return o;
}
private static LogicalType getLogicalType(Schema s) {
// NOTE: lts is not used yet; each iteration performs the same LogicalTypes.fromSchema lookup
for (LogicalTypes lts : ADDITIONAL_LOGICAL_TYPES) {
LogicalType lt = LogicalTypes.fromSchema(s);
if (lt != null) return lt;
}
return null;
}
private static Appendable writeLogicalType(Schema s, LogicalType lt, Appendable o) throws IOException {
o.append("{\"type\":\"").append(s.getType().getName()).append("\"");
// leading comma separates this entry from the preceding "type" entry
o.append(",\"").append(LogicalType.LOGICAL_TYPE_PROP).append("\":\"").append(lt.getName()).append("\"");
// adding the reserved property
writeProps(o, s.getObjectProps());
return o.append("}");
}
private static Appendable writeUnionType(Schema s, Appendable o) throws IOException {
boolean firstTime = true;
o.append('[');
for (Schema b : s.getTypes()) {
if (!firstTime) o.append(',');
else firstTime = false;
build(b, o);
}
return o.append(']');
}
private static Appendable writeArrayType(Schema s, Appendable o) throws IOException {
o.append("{\"type\":\"").append(s.getType().getName()).append("\"");
build(s.getElementType(), o.append(",\"items\":"));
// adding the reserved property
writeProps(o, s.getObjectProps());
return o.append("}");
}
private static Appendable writeMapType(Schema s, Appendable o) throws IOException {
o.append("{\"type\":\"").append(s.getType().getName()).append("\"");
build(s.getValueType(), o.append(",\"values\":"));
// adding the reserved property
writeProps(o, s.getObjectProps());
return o.append("}");
}
private static Appendable writeFixedType(Schema s, Appendable o) throws IOException {
o.append("{\"name\":\"").append(s.getName()).append("\"");
writeNamespace(o, s.getNamespace());
o.append(",\"type\":\"").append(s.getType().getName()).append("\"");
o.append(",\"size\":").append(Integer.toString(s.getFixedSize()));
writeAliases(o, s.getAliases());
// adding the reserved property
writeProps(o, s.getObjectProps());
return o.append("}");
}
private static Appendable writeEnumType(Schema s, Appendable o) throws IOException {
o.append("{\"name\":\"").append(s.getName()).append("\"");
writeNamespace(o, s.getNamespace());
o.append(",\"type\":\"").append(s.getType().getName()).append("\"");
writeDoc(o, s.getDoc());
writeAliases(o, s.getAliases());
boolean firstTime = true;
o.append(",\"symbols\":[");
for (String enumSymbol : s.getEnumSymbols()) {
if (!firstTime) o.append(',');
else firstTime = false;
o.append('"').append(enumSymbol).append('"');
}
o.append("]");
// adding the reserved property
writeProps(o, s.getObjectProps());
return o.append("}");
}
private static Appendable writeRecordType(Schema s, Appendable o) throws IOException {
o.append("{\"name\":\"").append(s.getName()).append("\"");
writeNamespace(o, s.getNamespace());
o.append(",\"type\":\"").append(s.getType().getName()).append("\"");
writeDoc(o, s.getDoc());
writeAliases(o, s.getAliases());
boolean firstTime = true;
o.append(",\"fields\":[");
for (Schema.Field f : s.getFields()) {
if (!firstTime) o.append(',');
else firstTime = false;
o.append("{\"name\":\"").append(f.name()).append("\"");
build(f.schema(), o.append(",\"type\":"));
// order
writeOrder(o, f.order());
// doc
writeDoc(o, f.doc());
// aliases
writeAliases(o, f.aliases());
// default
writeDefault(o, f.defaultVal());
o.append("}");
}
o.append("]");
// adding the reserved property
writeProps(o, s.getObjectProps());
return o.append("}");
}
private static Appendable writeProps(Appendable o, Map<String, Object> schemaProps) throws IOException {
for (String propKey : RESERVED_PROPERTIES) {
if (schemaProps.containsKey(propKey)) {
String propValue = JacksonUtils.toJsonNode(schemaProps.get(propKey)).toString();
o.append(",\"").append(propKey).append("\":").append(propValue);
}
}
return o;
}
private static Appendable writeNamespace(Appendable o, String namespace) throws IOException {
if (namespace != null) {
o.append(",\"namespace\":\"").append(namespace).append("\"");
}
return o;
}
private static Appendable writeOrder(Appendable o, Schema.Field.Order order) throws IOException {
if (order != null) {
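// Avro schemas spell the sort order in lower case ("ascending", "descending", "ignore")
o.append(",\"order\":\"").append(order.name().toLowerCase(Locale.ROOT)).append("\"");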
}
return o;
}
private static Appendable writeDoc(Appendable o, String doc) throws IOException {
if (doc != null) {
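// serialise through Jackson so quotes and backslashes in the doc text are escaped
o.append(",\"doc\":").append(JacksonUtils.toJsonNode(doc).toString());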
}
return o;
}
private static Appendable writeAliases(Appendable o, Set<String> aliases) throws IOException {
if (!aliases.isEmpty()) {
String propValue = JacksonUtils.toJsonNode(aliases).toString();
o.append(",\"aliases\":").append(propValue);
}
return o;
}
private static Appendable writeDefault(Appendable o, Object object) throws IOException {
if (object != null) {
String propValue = JacksonUtils.toJsonNode(object).toString();
o.append(",\"default\":").append(propValue);
}
return o;
}
}
{code}
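To make the intended output concrete, here is a minimal sketch that reproduces the "Expected Plain Schema" from the issue description quoted below, in compact form. It assumes the schema has already been parsed into plain `Map`, `List` and `String` values and does not use any Avro classes; `PlainSchemaSketch` and `toPlain` are hypothetical names for this example. It is deliberately simplified: strings are not escaped, and a real implementation would have to copy `default` values verbatim instead of filtering them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PlainSchemaSketch {
    // Reserved Avro property names, in the order proposed in this comment.
    static final List<String> RESERVED = Arrays.asList(
        "name", "namespace", "type", "fields", "symbols", "items", "values",
        "logicalType", "size", "order", "doc", "aliases", "default");

    // Renders a parsed-JSON-like tree (Map, List, String) as compact JSON,
    // keeping only reserved keys and emitting them in RESERVED order.
    static String toPlain(Object node) {
        if (node instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) node;
            List<String> parts = new ArrayList<>();
            for (String key : RESERVED) {
                if (map.containsKey(key)) {
                    parts.add("\"" + key + "\":" + toPlain(map.get(key)));
                }
            }
            return "{" + String.join(",", parts) + "}";
        }
        if (node instanceof List) {
            List<String> parts = new ArrayList<>();
            for (Object item : (List<?>) node) {
                parts.add(toPlain(item));
            }
            return "[" + String.join(",", parts) + "]";
        }
        // Simplification: every other value is written as an unescaped string.
        return "\"" + node + "\"";
    }

    public static void main(String[] args) {
        Map<String, Object> field = new LinkedHashMap<>();
        field.put("name", "email");
        field.put("type", "string");
        field.put("doc", "email id");
        field.put("user_field_prop", "xxxxx");

        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("user_schema_prop", "xxxxxx");
        schema.put("type", "record");
        schema.put("namespace", "com.avro");
        schema.put("name", "testSchema");
        schema.put("fields", Arrays.asList(field));

        System.out.println(toPlain(schema));
    }
}
```

For the input above this prints `{"name":"testSchema","namespace":"com.avro","type":"record","fields":[{"name":"email","type":"string","doc":"email id"}]}`: both user properties are gone and the keys follow the reserved order regardless of the order in which they were inserted.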
[~cutting] kindly review the code; meanwhile I will create the test cases. If anything should be added to or changed in the new canonical normaliser rules, kindly let me know.
> Get Plain Schema
> ----------------
>
> Key: AVRO-2299
> URL: https://issues.apache.org/jira/browse/AVRO-2299
> Project: Apache Avro
> Issue Type: New Feature
> Components: java
> Affects Versions: 1.8.2
> Reporter: Rumeshkrishnan
> Priority: Minor
> Labels: features
> Fix For: 1.9.0, 1.8.2, 1.8.3, 1.8.4
>
>
> {panel:title=Avro Schema Reserved Keys:}
> "doc", "fields", "items", "name", "namespace",
> "size", "symbols", "values", "type", "aliases", "default"
> {panel}
> AVRO also supports user defined properties for both Schema and Field.
> Is there a way to get the schema with reserved properties (key, value) only?
> Input Schema:
> {code:java}
> {
> "name": "testSchema",
> "namespace": "com.avro",
> "type": "record",
> "fields": [
> {
> "name": "email",
> "type": "string",
> "doc": "email id",
> "user_field_prop": "xxxxx"
> }
> ],
> "user_schema_prop": "xxxxxx"
> }{code}
> Expected Plain Schema:
> {code:java}
> {
> "name": "testSchema",
> "namespace": "com.avro",
> "type": "record",
> "fields": [
> {
> "name": "email",
> "type": "string",
> "doc": "email id"
> }
> ]
> }
> {code}