Posted to dev@avro.apache.org by "Scott Carey (JIRA)" <ji...@apache.org> on 2013/05/02 04:23:13 UTC

[jira] [Commented] (AVRO-1316) IDL code-generation generates too-long literals for very large schemas

    [ https://issues.apache.org/jira/browse/AVRO-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13647205#comment-13647205 ] 

Scott Carey commented on AVRO-1316:
-----------------------------------

I have not, but my schemas are only ~12K.

I assume the problem is in the creation of the SCHEMA$ static field?

We could break the string up into 4k chunks.
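
A minimal sketch of that chunking idea, using only the JDK. The class and method names (SchemaLiteralChunker, split, toLiteral, toExpression) are hypothetical and are not part of Avro's code generator. One assumption worth stating: literals joined with a plain + form a compile-time constant, which javac folds back into a single string, so the limit may still be hit. The sketch therefore emits StringBuilder appends, which are evaluated at run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Avro: turns one long schema string into
// a Java expression built from several short string literals.
final class SchemaLiteralChunker {

  /** Splits text into consecutive pieces of at most chunkSize chars each. */
  public static List<String> split(String text, int chunkSize) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    List<String> chunks = new ArrayList<>();
    for (int i = 0; i < text.length(); i += chunkSize) {
      chunks.add(text.substring(i, Math.min(text.length(), i + chunkSize)));
    }
    return chunks;
  }

  /** Escapes one chunk as a Java string literal, including the quotes. */
  public static String toLiteral(String chunk) {
    StringBuilder sb = new StringBuilder("\"");
    for (char c : chunk.toCharArray()) {
      switch (c) {
        case '\\': sb.append("\\\\"); break;
        case '"':  sb.append("\\\""); break;
        case '\n': sb.append("\\n"); break;
        case '\r': sb.append("\\r"); break;
        case '\t': sb.append("\\t"); break;
        default:
          if (c < 0x20 || c > 0x7e) {
            // Remaining control and non-ASCII chars become unicode escapes.
            sb.append(String.format("\\u%04x", (int) c));
          } else {
            sb.append(c);
          }
      }
    }
    return sb.append('"').toString();
  }

  /** Emits an expression that rebuilds text at run time, chunk by chunk. */
  public static String toExpression(String text, int chunkSize) {
    StringBuilder sb = new StringBuilder("new StringBuilder()");
    for (String chunk : split(text, chunkSize)) {
      sb.append(".append(").append(toLiteral(chunk)).append(")");
    }
    return sb.append(".toString()").toString();
  }
}
```

For example, toExpression(json, 4096) would yield an expression whose literals each hold at most 4096 chars. Splitting happens before escaping, so an escape sequence is never cut in half at a chunk boundary.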

However, using the Schema API programmatically would be more efficient and would produce a significantly smaller class file.

This isn't too hard.

We go from the below (edited from one line to many for readability):
{code}
  public static final org.apache.avro.Schema SCHEMA$ = new org.apache.avro.Schema.Parser().parse(
  "{\"type\":\"record\",\"name\":\"HandshakeRequest\",\"namespace\":\"org.apache.avro.ipc\",\"fields\":[
    {\"name\":\"clientHash\",\"type\":{\"type\":\"fixed\",\"name\":\"MD5\",\"size\":16}},
    {\"name\":\"clientProtocol\",\"type\":[\"null\",{\"type\":\"string\",\"avro.java.string\":\"String\"}]},
    {\"name\":\"serverHash\",\"type\":\"MD5\"},
    {\"name\":\"meta\",\"type\":[\"null\",{\"type\":\"map\",\"values\":\"bytes\",\"avro.java.string\":\"String\"}]}
  ]}");
{code}

to use the new SchemaBuilder:
{code}
  public static final org.apache.avro.Schema SCHEMA$;
  static {
    SCHEMA$ = SchemaBuilder
      .recordType("HandshakeRequest")
      .namespace("org.apache.avro.ipc")
      .requiredFixed("clientHash", MD5.SCHEMA$)
      .unionType("clientProtocol", SchemaBuilder.unionType(
          SchemaBuilder.NULL,
          SchemaBuilder.STRING)
          .build())
          .addProp("avro.java.string", "String")
      .requiredFixed("serverHash", MD5.SCHEMA$)
      .unionType("meta", SchemaBuilder.unionType(
          SchemaBuilder.NULL,
          SchemaBuilder.mapType(SchemaBuilder.BYTES)
            .addProp("avro.java.string", "String")
            .build())
          .build())
      .build();
  }
{code}

                
> IDL code-generation generates too-long literals for very large schemas
> ----------------------------------------------------------------------
>
>                 Key: AVRO-1316
>                 URL: https://issues.apache.org/jira/browse/AVRO-1316
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>            Reporter: Jeremy Kahn
>            Priority: Minor
>
> When I work from a very large IDL schema, the Java code generated includes a schema JSON literal that exceeds the length of the maximum allowed literal string ([65535 characters|http://stackoverflow.com/questions/8323082/size-of-initialisation-string-in-java]).  
> This creates weird Maven errors like: {{[ERROR] ...FooProtocol.java:[13,89] constant string too long}}.
> It might seem a little crazy, but a 64-kilobyte JSON protocol isn't outrageous at all for some of the more involved data structures, especially if we're including documentation strings etc.
> I believe the fix should be a bit more sensitivity to the length of the JSON literal (and a willingness to split it into more than one literal, joined by {{+}}), but I haven't figured out where that change needs to go. Has anyone else encountered this problem?
