Posted to dev@avro.apache.org by "David Rosenstrauch (JIRA)" <ji...@apache.org> on 2010/08/13 02:01:20 UTC

[jira] Created: (AVRO-612) Schema.toString() strips out field docs

Schema.toString() strips out field docs
---------------------------------------

                 Key: AVRO-612
                 URL: https://issues.apache.org/jira/browse/AVRO-612
             Project: Avro
          Issue Type: Bug
    Affects Versions: 1.3.3
            Reporter: David Rosenstrauch
            Priority: Minor


Although Avro can successfully parse schema text that contains a "doc" on a Schema.Field, when a Schema containing a field doc is serialized (via Schema.toString()), the doc is not written.

The following JUnit test case demonstrates this problem:
{code:title=TestAvroFieldDocSerialization.java|borderStyle=solid}
import junit.framework.TestCase;

import org.apache.avro.Schema;

public class TestAvroFieldDocSerialization extends TestCase {

	public void testAvroFieldDocSerialization() {
		String schemaStr =
			"{"+
			"	\"name\": \"Rec\","+
			"	\"type\": \"record\","+
			"	\"fields\" : ["+
			"		{\"name\": \"f\", \"type\": \"int\", \"doc\": \"test\"}"+
			"	]"+
			"}";
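		// Parse the original schema text; the field doc is present at this point.
		Schema schema = Schema.parse(schemaStr);
		verifyFieldDoc(schema);

		// Round-trip through toString(); the field doc should survive but is dropped.
		schemaStr = schema.toString();
		schema = Schema.parse(schemaStr);
		verifyFieldDoc(schema);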
	}

	private void verifyFieldDoc(Schema schema) {
		Schema.Field field = schema.getField("f");
		assertEquals("test", field.doc());
	}
}
{code}

Note that the first call to verifyFieldDoc() succeeds, while the second one fails.  They should both succeed (in my opinion).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (AVRO-612) Schema.toString() strips out field docs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899540#action_12899540 ] 

Doug Cutting commented on AVRO-612:
-----------------------------------

I will commit this soon if no one objects.


[jira] Commented: (AVRO-612) Schema.toString() strips out field docs

Posted by "Scott Carey (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898017#action_12898017 ] 

Scott Carey commented on AVRO-612:
----------------------------------

We need a way to serialize only the 'important' bits of a schema for things like the file format. Currently, toString() produces this minimal version of a schema.

Is it sufficient to only add 'doc' fields when pretty-printing?

What should we do about custom fields?
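
A minimal sketch of the pretty-print-only option, assuming a hypothetical FieldSketch class that stands in for Schema.Field. It is not Avro's API, the pretty flag here only controls whether "doc" is emitted, and no JSON escaping is done:

```java
// Hypothetical stand-in for a schema field; not Avro's Schema.Field.
final class FieldSketch {
    final String name;
    final String type;
    final String doc; // may be null when the field has no doc

    FieldSketch(String name, String type, String doc) {
        this.name = name;
        this.type = type;
        this.doc = doc;
    }

    // Emits "doc" only when pretty is true; the compact form omits it.
    String toJson(boolean pretty) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"name\":\"").append(name)
          .append("\",\"type\":\"").append(type).append('"');
        if (pretty && doc != null) {
            sb.append(",\"doc\":\"").append(doc).append('"');
        }
        return sb.append('}').toString();
    }
}
```

Under this option, any caller that re-parses the compact form would still lose the doc, which is the round trip exercised by the test case in the issue.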


[jira] Updated: (AVRO-612) Schema.toString() strips out field docs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-612:
------------------------------

           Status: Patch Available  (was: Open)
         Assignee: Doug Cutting
    Fix Version/s: 1.4.0


[jira] Updated: (AVRO-612) Schema.toString() strips out field docs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-612:
------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed


[jira] Updated: (AVRO-612) Schema.toString() strips out field docs

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-612:
------------------------------

    Attachment: AVRO-612.patch

I think this is a bug: toString() should preserve as much detail as possible. We might add a separate method that strips a schema down to the minimal form needed to describe the data, and use that for schemas written to files as an optimization.

Here's a patch that fixes this.
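
To illustrate the split described above, here is a rough sketch, not the contents of the attached patch. RecordModel, addField() and toMinimalString() are hypothetical names invented for this example and are not part of Avro; the sketch does no JSON escaping:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a record schema; illustrative only, not Avro's API.
final class RecordModel {
    private final String name;
    // Each entry is {fieldName, type, docOrNull}.
    private final List<String[]> fields = new ArrayList<>();

    RecordModel(String name) {
        this.name = name;
    }

    RecordModel addField(String fieldName, String type, String doc) {
        fields.add(new String[] {fieldName, type, doc});
        return this;
    }

    // Full form: preserves every detail, including field docs.
    @Override
    public String toString() {
        return render(true);
    }

    // Minimal form: only what is needed to describe the data.
    String toMinimalString() {
        return render(false);
    }

    private String render(boolean includeDocs) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"type\":\"record\",\"name\":\"").append(name)
          .append("\",\"fields\":[");
        for (int i = 0; i < fields.size(); i++) {
            String[] f = fields.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"name\":\"").append(f[0])
              .append("\",\"type\":\"").append(f[1]).append('"');
            if (includeDocs && f[2] != null) {
                sb.append(",\"doc\":\"").append(f[2]).append('"');
            }
            sb.append('}');
        }
        return sb.append("]}").toString();
    }
}
```

With this shape, new RecordModel("Rec").addField("f", "int", "test").toString() keeps the "doc" entry, while toMinimalString() on the same object omits it, so file writers could opt into the smaller form without changing what toString() returns.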
