Posted to commits@avro.apache.org by cu...@apache.org on 2010/01/10 18:30:48 UTC

svn commit: r897665 - in /hadoop/avro/trunk: CHANGES.txt src/doc/content/xdocs/spec.xml

Author: cutting
Date: Sun Jan 10 17:30:48 2010
New Revision: 897665

URL: http://svn.apache.org/viewvc?rev=897665&view=rev
Log:
AVRO-294.  Clarify that bytes and fixed are unsigned and how their JSON default values are interpreted.  Contributed by Jeff Hammerbacher and cutting.
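
As a rough illustration of the two rules this change clarifies, the following is a minimal Python sketch, not code from this commit or from any Avro library. The helper names default_to_bytes and compare_unsigned are hypothetical. It assumes the JSON default has already been parsed into a Python str, and it relies on the latin-1 codec, which maps code points 0-255 one-to-one onto byte values 0-255.

```python
def default_to_bytes(json_default: str) -> bytes:
    """Turn a JSON default string into the bytes/fixed value it denotes.

    Hypothetical helper: each Unicode code point 0-255 becomes the
    unsigned 8-bit byte with the same value.
    """
    # latin-1 is exactly the identity mapping for code points 0-255 and
    # raises UnicodeEncodeError for any code point above 255.
    return json_default.encode("latin-1")


def compare_unsigned(a: bytes, b: bytes) -> int:
    """Lexicographic comparison by unsigned 8-bit values.

    Hypothetical helper: returns -1, 0, or 1.
    """
    # Python bytes objects hold integers 0-255, so the built-in ordering
    # is already lexicographic over unsigned values.
    return (a > b) - (a < b)


if __name__ == "__main__":
    print(default_to_bytes("\u00ff"))           # the single byte 0xFF
    print(compare_unsigned(b"\x7f", b"\x80"))   # 0x7F sorts before 0x80
```

The second call is the case where signedness matters: read as a signed 8-bit value, 0x80 is -128 and would sort before 0x7F, whereas under the unsigned rule it sorts after.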

Modified:
    hadoop/avro/trunk/CHANGES.txt
    hadoop/avro/trunk/src/doc/content/xdocs/spec.xml

Modified: hadoop/avro/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/hadoop/avro/trunk/CHANGES.txt?rev=897665&r1=897664&r2=897665&view=diff
==============================================================================
--- hadoop/avro/trunk/CHANGES.txt (original)
+++ hadoop/avro/trunk/CHANGES.txt Sun Jan 10 17:30:48 2010
@@ -185,6 +185,9 @@
     AVRO-259. Add null schema check in GenericData.Record and
     GenericData.Array construtors. (Kevin Oliver via cutting)
 
+    AVRO-294. Clarify that bytes and fixed are unsigned, and how their
+    JSON default values are interpreted.  (Jeff Hammerbacher & cutting)
+
   OPTIMIZATIONS
 
     AVRO-172. More efficient schema processing (massie)

Modified: hadoop/avro/trunk/src/doc/content/xdocs/spec.xml
URL: http://svn.apache.org/viewvc/hadoop/avro/trunk/src/doc/content/xdocs/spec.xml?rev=897665&r1=897664&r2=897665&view=diff
==============================================================================
--- hadoop/avro/trunk/src/doc/content/xdocs/spec.xml (original)
+++ hadoop/avro/trunk/src/doc/content/xdocs/spec.xml Sun Jan 10 17:30:48 2010
@@ -59,7 +59,7 @@
         <p>The set of primitive type names is:</p>
         <ul>
           <li><code>string</code>: unicode character sequence</li>
-          <li><code>bytes</code>: sequence of 8-bit bytes</li>
+          <li><code>bytes</code>: sequence of 8-bit unsigned bytes</li>
           <li><code>int</code>: 32-bit signed integer</li>
           <li><code>long</code>: 64-bit signed integer</li>
           <li><code>float</code>: single precision (32-bit) IEEE 754 floating-point number</li>
@@ -108,7 +108,10 @@
 		  field (optional).  Permitted values depend on the
 		  field's schema type, according to the table below.
 		  Default values for union fields correspond to the
-		  first schema in the union.
+		  first schema in the union. Default values for bytes
+		  and fixed fields are JSON strings, where Unicode
+		  code points 0-255 are mapped to unsigned 8-bit byte
+		  values 0-255.
 		  <table class="right">
 		    <caption>field default values</caption>
 		    <tr><th>avro type</th><th>json type</th><th>example</th></tr>
@@ -550,11 +553,12 @@
 	  value.</li>
 	<li><code>boolean</code> data is ordered with false before true.</li>
 	<li><code>null</code> data is always equal.</li>
-	<li><code>string</code> data is compared lexicographically.
-	  Note that since UTF-8 is used as the binary encoding of
-	  strings, sorting by bytes and characters is equivalent.</li>
+	<li><code>string</code> data is compared lexicographically by
+	  Unicode code point.  Note that since UTF-8 is used as the
+	  binary encoding for strings, sorting of bytes and string
+	  binary data is identical.</li>
 	<li><code>bytes</code> and <code>fixed</code> data are
-	  compared lexicographically by byte.</li>
+	  compared lexicographically by unsigned 8-bit values.</li>
 	<li><code>array</code> data is compared lexicographically by
 	  element.</li>
 	<li><code>enum</code> data is ordered by the symbol's position