Posted to commits@avro.apache.org by su...@apache.org on 2017/05/31 15:18:42 UTC
svn commit: r1797058 [1/5] - in /avro/site/publish/docs/1.8.2: ./ examples/
examples/java-example/ examples/java-example/src/
examples/java-example/src/main/ examples/java-example/src/main/java/
examples/java-example/src/main/java/example/ examples/mr-...
Author: suraj
Date: Wed May 31 15:18:41 2017
New Revision: 1797058
URL: http://svn.apache.org/viewvc?rev=1797058&view=rev
Log:
Fixing and adding documentation for release 1.8.2
Added:
avro/site/publish/docs/1.8.2/broken-links.xml
avro/site/publish/docs/1.8.2/examples/
avro/site/publish/docs/1.8.2/examples/example.py
avro/site/publish/docs/1.8.2/examples/java-example/
avro/site/publish/docs/1.8.2/examples/java-example/pom.xml
avro/site/publish/docs/1.8.2/examples/java-example/src/
avro/site/publish/docs/1.8.2/examples/java-example/src/main/
avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/
avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/
avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/GenericMain.java
avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/SpecificMain.java
avro/site/publish/docs/1.8.2/examples/mr-example/
avro/site/publish/docs/1.8.2/examples/mr-example/pom.xml
avro/site/publish/docs/1.8.2/examples/mr-example/src/
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/AvroWordCount.java
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/GenerateData.java
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceAvroWordCount.java
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceColorCount.java
avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapredColorCount.java
avro/site/publish/docs/1.8.2/examples/user.avsc
avro/site/publish/docs/1.8.2/gettingstartedjava.html
avro/site/publish/docs/1.8.2/gettingstartedjava.pdf (with props)
avro/site/publish/docs/1.8.2/gettingstartedpython.html
avro/site/publish/docs/1.8.2/gettingstartedpython.pdf (with props)
avro/site/publish/docs/1.8.2/htmldocs/
avro/site/publish/docs/1.8.2/htmldocs/canonical-completeness.html
avro/site/publish/docs/1.8.2/idl.html
avro/site/publish/docs/1.8.2/idl.pdf (with props)
avro/site/publish/docs/1.8.2/images/
avro/site/publish/docs/1.8.2/images/apache_feather.gif (with props)
avro/site/publish/docs/1.8.2/images/avro-logo.png (with props)
avro/site/publish/docs/1.8.2/images/built-with-forrest-button.png (with props)
avro/site/publish/docs/1.8.2/images/favicon.ico (with props)
avro/site/publish/docs/1.8.2/images/instruction_arrow.png (with props)
avro/site/publish/docs/1.8.2/index.html
avro/site/publish/docs/1.8.2/index.pdf (with props)
avro/site/publish/docs/1.8.2/linkmap.html
avro/site/publish/docs/1.8.2/linkmap.pdf (with props)
avro/site/publish/docs/1.8.2/mr.html
avro/site/publish/docs/1.8.2/mr.pdf (with props)
avro/site/publish/docs/1.8.2/sasl.html
avro/site/publish/docs/1.8.2/sasl.pdf (with props)
avro/site/publish/docs/1.8.2/skin/
avro/site/publish/docs/1.8.2/skin/CommonMessages_de.xml
avro/site/publish/docs/1.8.2/skin/CommonMessages_en_US.xml
avro/site/publish/docs/1.8.2/skin/CommonMessages_es.xml
avro/site/publish/docs/1.8.2/skin/CommonMessages_fr.xml
avro/site/publish/docs/1.8.2/skin/basic.css
avro/site/publish/docs/1.8.2/skin/breadcrumbs-optimized.js
avro/site/publish/docs/1.8.2/skin/breadcrumbs.js
avro/site/publish/docs/1.8.2/skin/css/
avro/site/publish/docs/1.8.2/skin/fontsize.js
avro/site/publish/docs/1.8.2/skin/getBlank.js
avro/site/publish/docs/1.8.2/skin/getMenu.js
avro/site/publish/docs/1.8.2/skin/images/
avro/site/publish/docs/1.8.2/skin/images/README.txt
avro/site/publish/docs/1.8.2/skin/images/add.jpg (with props)
avro/site/publish/docs/1.8.2/skin/images/apache-thanks.png (with props)
avro/site/publish/docs/1.8.2/skin/images/built-with-cocoon.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/built-with-forrest-button.png (with props)
avro/site/publish/docs/1.8.2/skin/images/chapter.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/chapter_open.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/current.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/error.png (with props)
avro/site/publish/docs/1.8.2/skin/images/external-link.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/fix.jpg (with props)
avro/site/publish/docs/1.8.2/skin/images/forrest-credit-logo.png (with props)
avro/site/publish/docs/1.8.2/skin/images/hack.jpg (with props)
avro/site/publish/docs/1.8.2/skin/images/header_white_line.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/info.png (with props)
avro/site/publish/docs/1.8.2/skin/images/instruction_arrow.png (with props)
avro/site/publish/docs/1.8.2/skin/images/label.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/page.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/pdfdoc.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/poddoc.png (with props)
avro/site/publish/docs/1.8.2/skin/images/printer.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-b-l-15-1body-2menu-3menu.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-b-r-15-1body-2menu-3menu.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-b-r-5-1header-2tab-selected-3tab-selected.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-l-5-1header-2searchbox-3searchbox.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-l-5-1header-2tab-selected-3tab-selected.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-l-5-1header-2tab-unselected-3tab-unselected.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-r-15-1body-2menu-3menu.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-r-5-1header-2searchbox-3searchbox.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-r-5-1header-2tab-selected-3tab-selected.png (with props)
avro/site/publish/docs/1.8.2/skin/images/rc-t-r-5-1header-2tab-unselected-3tab-unselected.png (with props)
avro/site/publish/docs/1.8.2/skin/images/remove.jpg (with props)
avro/site/publish/docs/1.8.2/skin/images/rss.png (with props)
avro/site/publish/docs/1.8.2/skin/images/spacer.gif (with props)
avro/site/publish/docs/1.8.2/skin/images/success.png (with props)
avro/site/publish/docs/1.8.2/skin/images/txtdoc.png (with props)
avro/site/publish/docs/1.8.2/skin/images/update.jpg (with props)
avro/site/publish/docs/1.8.2/skin/images/valid-html401.png (with props)
avro/site/publish/docs/1.8.2/skin/images/vcss.png (with props)
avro/site/publish/docs/1.8.2/skin/images/warning.png (with props)
avro/site/publish/docs/1.8.2/skin/images/xmldoc.gif (with props)
avro/site/publish/docs/1.8.2/skin/menu.js
avro/site/publish/docs/1.8.2/skin/note.txt
avro/site/publish/docs/1.8.2/skin/print.css
avro/site/publish/docs/1.8.2/skin/profile.css
avro/site/publish/docs/1.8.2/skin/prototype.js
avro/site/publish/docs/1.8.2/skin/screen.css
avro/site/publish/docs/1.8.2/skin/scripts/
avro/site/publish/docs/1.8.2/skin/translations/
avro/site/publish/docs/1.8.2/spec.html
avro/site/publish/docs/1.8.2/spec.pdf (with props)
Added: avro/site/publish/docs/1.8.2/broken-links.xml
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/broken-links.xml?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/broken-links.xml (added)
+++ avro/site/publish/docs/1.8.2/broken-links.xml Wed May 31 15:18:41 2017
@@ -0,0 +1,2 @@
+<broken-links>
+</broken-links>
Added: avro/site/publish/docs/1.8.2/examples/example.py
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/example.py?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/example.py (added)
+++ avro/site/publish/docs/1.8.2/examples/example.py Wed May 31 15:18:41 2017
@@ -0,0 +1,33 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+import avro.schema
+from avro.datafile import DataFileReader, DataFileWriter
+from avro.io import DatumReader, DatumWriter
+
+schema = avro.schema.parse(open("user.avsc").read())
+
+writer = DataFileWriter(open("/tmp/users.avro", "w"), DatumWriter(), schema)
+writer.append({"name": "Alyssa", "favorite_number": 256})
+writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
+writer.close()
+
+reader = DataFileReader(open("/tmp/users.avro", "r"), DatumReader())
+for user in reader:
+ print user
+reader.close()
Added: avro/site/publish/docs/1.8.2/examples/java-example/pom.xml
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/java-example/pom.xml?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/java-example/pom.xml (added)
+++ avro/site/publish/docs/1.8.2/examples/java-example/pom.xml Wed May 31 15:18:41 2017
@@ -0,0 +1,70 @@
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one
+ - or more contributor license agreements. See the NOTICE file
+ - distributed with this work for additional information
+ - regarding copyright ownership. The ASF licenses this file
+ - to you under the Apache License, Version 2.0 (the
+ - "License"); you may not use this file except in compliance
+ - with the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing,
+ - software distributed under the License is distributed on an
+ - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ - KIND, either express or implied. See the License for the
+ - specific language governing permissions and limitations
+ - under the License.
+ -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>example</groupId>
+ <artifactId>java-example</artifactId>
+ <packaging>jar</packaging>
+ <version>1.0-SNAPSHOT</version>
+ <name>java-example</name>
+ <url>http://maven.apache.org</url>
+ <dependencies>
+ <dependency>
+ <groupId>junit</groupId>
+ <artifactId>junit</artifactId>
+ <version>3.8.1</version>
+ <scope>test</scope>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro</artifactId>
+ <version>1.7.5</version>
+ </dependency>
+ </dependencies>
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro-maven-plugin</artifactId>
+ <version>1.7.5</version>
+ <executions>
+ <execution>
+ <phase>generate-sources</phase>
+ <goals>
+ <goal>schema</goal>
+ </goals>
+ <configuration>
+ <sourceDirectory>${project.basedir}/../</sourceDirectory>
+ <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <configuration>
+ <source>1.6</source>
+ <target>1.6</target>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+</project>
Added: avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/GenericMain.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/GenericMain.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/GenericMain.java (added)
+++ avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/GenericMain.java Wed May 31 15:18:41 2017
@@ -0,0 +1,71 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.File;
+import java.io.IOException;
+
+import org.apache.avro.Schema;
+import org.apache.avro.Schema.Parser;
+import org.apache.avro.file.DataFileReader;
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.generic.GenericData;
+import org.apache.avro.generic.GenericDatumReader;
+import org.apache.avro.generic.GenericDatumWriter;
+import org.apache.avro.generic.GenericRecord;
+import org.apache.avro.io.DatumReader;
+import org.apache.avro.io.DatumWriter;
+
+public class GenericMain {
+ public static void main(String[] args) throws IOException {
+ Schema schema = new Parser().parse(new File("/home/skye/code/cloudera/avro/doc/examples/user.avsc"));
+
+ GenericRecord user1 = new GenericData.Record(schema);
+ user1.put("name", "Alyssa");
+ user1.put("favorite_number", 256);
+ // Leave favorite color null
+
+ GenericRecord user2 = new GenericData.Record(schema);
+ user2.put("name", "Ben");
+ user2.put("favorite_number", 7);
+ user2.put("favorite_color", "red");
+
+ // Serialize user1 and user2 to disk
+ File file = new File("users.avro");
+ DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<GenericRecord>(schema);
+ DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<GenericRecord>(datumWriter);
+ dataFileWriter.create(schema, file);
+ dataFileWriter.append(user1);
+ dataFileWriter.append(user2);
+ dataFileWriter.close();
+
+ // Deserialize users from disk
+ DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>(schema);
+ DataFileReader<GenericRecord> dataFileReader = new DataFileReader<GenericRecord>(file, datumReader);
+ GenericRecord user = null;
+ while (dataFileReader.hasNext()) {
+ // Reuse user object by passing it to next(). This saves us from
+ // allocating and garbage collecting many objects for files with
+ // many items.
+ user = dataFileReader.next(user);
+ System.out.println(user);
+ }
+
+ }
+}
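
The comment in GenericMain.java about passing the previous record back into next() describes an object-reuse pattern. The following is a minimal standard-library sketch of that idea only, not Avro's implementation: ReuseSketch, Row, and RowReader are hypothetical names invented for illustration, and the comma-separated input stands in for encoded records.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; none of these types exist in Avro.
final class ReuseSketch {

  // Mutable holder standing in for a decoded record.
  static final class Row {
    String name;
    int favoriteNumber;

    @Override
    public String toString() {
      return name + ":" + favoriteNumber;
    }
  }

  // Reader that overwrites the caller's Row when one is supplied,
  // and allocates a new Row only when given null.
  static final class RowReader {
    private final Iterator<String> lines;

    RowReader(List<String> source) {
      this.lines = source.iterator();
    }

    boolean hasNext() {
      return lines.hasNext();
    }

    Row next(Row reuse) {
      Row row = (reuse != null) ? reuse : new Row();
      String[] parts = lines.next().split(",");
      row.name = parts[0];
      row.favoriteNumber = Integer.parseInt(parts[1]);
      return row;
    }
  }

  public static void main(String[] args) {
    RowReader reader = new RowReader(Arrays.asList("Alyssa,256", "Ben,7"));
    Row row = null;
    while (reader.hasNext()) {
      // After the first call the same Row instance is refilled each time.
      row = reader.next(row);
      System.out.println(row);
    }
  }
}
```

One consequence of this pattern is that a caller who needs to keep earlier records must copy them, because each call overwrites the shared instance.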
Added: avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/SpecificMain.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/SpecificMain.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/SpecificMain.java (added)
+++ avro/site/publish/docs/1.8.2/examples/java-example/src/main/java/example/SpecificMain.java Wed May 31 15:18:41 2017
@@ -0,0 +1,73 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.File;
+import java.io.IOException;
+
+import org.apache.avro.file.DataFileReader;
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.io.DatumReader;
+import org.apache.avro.io.DatumWriter;
+import org.apache.avro.specific.SpecificDatumReader;
+import org.apache.avro.specific.SpecificDatumWriter;
+
+import example.avro.User;
+
+public class SpecificMain {
+ public static void main(String[] args) throws IOException {
+ User user1 = new User();
+ user1.setName("Alyssa");
+ user1.setFavoriteNumber(256);
+ // Leave favorite color null
+
+ // Alternate constructor
+ User user2 = new User("Ben", 7, "red");
+
+ // Construct via builder
+ User user3 = User.newBuilder()
+ .setName("Charlie")
+ .setFavoriteColor("blue")
+ .setFavoriteNumber(null)
+ .build();
+
+ // Serialize user1 and user2 to disk
+ File file = new File("users.avro");
+ DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
+ DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
+ dataFileWriter.create(user1.getSchema(), file);
+ dataFileWriter.append(user1);
+ dataFileWriter.append(user2);
+ dataFileWriter.append(user3);
+ dataFileWriter.close();
+
+ // Deserialize Users from disk
+ DatumReader<User> userDatumReader = new SpecificDatumReader<User>(User.class);
+ DataFileReader<User> dataFileReader = new DataFileReader<User>(file, userDatumReader);
+ User user = null;
+ while (dataFileReader.hasNext()) {
+ // Reuse user object by passing it to next(). This saves us from
+ // allocating and garbage collecting many objects for files with
+ // many items.
+ user = dataFileReader.next(user);
+ System.out.println(user);
+ }
+
+ }
+}
Added: avro/site/publish/docs/1.8.2/examples/mr-example/pom.xml
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/mr-example/pom.xml?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/mr-example/pom.xml (added)
+++ avro/site/publish/docs/1.8.2/examples/mr-example/pom.xml Wed May 31 15:18:41 2017
@@ -0,0 +1,77 @@
+<!--
+ - Licensed to the Apache Software Foundation (ASF) under one
+ - or more contributor license agreements. See the NOTICE file
+ - distributed with this work for additional information
+ - regarding copyright ownership. The ASF licenses this file
+ - to you under the Apache License, Version 2.0 (the
+ - "License"); you may not use this file except in compliance
+ - with the License. You may obtain a copy of the License at
+ -
+ - http://www.apache.org/licenses/LICENSE-2.0
+ -
+ - Unless required by applicable law or agreed to in writing,
+ - software distributed under the License is distributed on an
+ - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ - KIND, either express or implied. See the License for the
+ - specific language governing permissions and limitations
+ - under the License.
+ -->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>example</groupId>
+ <artifactId>mr-example</artifactId>
+ <version>1.0</version>
+ <packaging>jar</packaging>
+
+ <name>mr-example</name>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <configuration>
+ <source>1.6</source>
+ <target>1.6</target>
+ </configuration>
+ </plugin>
+ <plugin>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro-maven-plugin</artifactId>
+ <version>1.7.5</version>
+ <executions>
+ <execution>
+ <phase>generate-sources</phase>
+ <goals>
+ <goal>schema</goal>
+ </goals>
+ <configuration>
+ <sourceDirectory>${project.basedir}/../</sourceDirectory>
+ <outputDirectory>${project.build.directory}/generated-sources/java</outputDirectory>
+ </configuration>
+ </execution>
+ </executions>
+ </plugin>
+ </plugins>
+ </build>
+
+ <dependencies>
+ <dependency>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro</artifactId>
+ <version>1.7.5</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro-mapred</artifactId>
+ <version>1.7.5</version>
+ </dependency>
+ <dependency>
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-core</artifactId>
+ <version>1.1.0</version>
+ </dependency>
+ </dependencies>
+</project>
Added: avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/AvroWordCount.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/AvroWordCount.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/AvroWordCount.java (added)
+++ avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/AvroWordCount.java Wed May 31 15:18:41 2017
@@ -0,0 +1,105 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.avro.*;
+import org.apache.avro.Schema.Type;
+import org.apache.avro.mapred.*;
+import org.apache.hadoop.conf.*;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.*;
+import org.apache.hadoop.mapred.*;
+import org.apache.hadoop.util.*;
+
+/**
+ * The classic WordCount example modified to output Avro Pair<CharSequence,
+ * Integer> records instead of text.
+ */
+public class AvroWordCount extends Configured implements Tool {
+
+ public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
+ private final static IntWritable one = new IntWritable(1);
+ private Text word = new Text();
+
+ public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter)
+ throws IOException {
+ String line = value.toString();
+ StringTokenizer tokenizer = new StringTokenizer(line);
+ while (tokenizer.hasMoreTokens()) {
+ word.set(tokenizer.nextToken());
+ output.collect(word, one);
+ }
+ }
+ }
+
+ public static class Reduce extends MapReduceBase
+ implements Reducer<Text, IntWritable,
+ AvroWrapper<Pair<CharSequence, Integer>>, NullWritable> {
+
+ public void reduce(Text key, Iterator<IntWritable> values,
+ OutputCollector<AvroWrapper<Pair<CharSequence, Integer>>, NullWritable> output,
+ Reporter reporter) throws IOException {
+ int sum = 0;
+ while (values.hasNext()) {
+ sum += values.next().get();
+ }
+ output.collect(new AvroWrapper<Pair<CharSequence, Integer>>(
+ new Pair<CharSequence, Integer>(key.toString(), sum)),
+ NullWritable.get());
+ }
+ }
+
+ public int run(String[] args) throws Exception {
+ if (args.length != 2) {
+ System.err.println("Usage: AvroWordCount <input path> <output path>");
+ return -1;
+ }
+
+ JobConf conf = new JobConf(AvroWordCount.class);
+ conf.setJobName("wordcount");
+
+ // We call setOutputSchema first so we can override the configuration
+ // parameters it sets
+ AvroJob.setOutputSchema(conf, Pair.getPairSchema(Schema.create(Type.STRING),
+ Schema.create(Type.INT)));
+
+ conf.setMapperClass(Map.class);
+ conf.setReducerClass(Reduce.class);
+
+ conf.setInputFormat(TextInputFormat.class);
+
+ conf.setMapOutputKeyClass(Text.class);
+ conf.setMapOutputValueClass(IntWritable.class);
+ conf.setOutputKeyComparatorClass(Text.Comparator.class);
+
+ FileInputFormat.setInputPaths(conf, new Path(args[0]));
+ FileOutputFormat.setOutputPath(conf, new Path(args[1]));
+
+ JobClient.runJob(conf);
+ return 0;
+ }
+
+ public static void main(String[] args) throws Exception {
+ int res = ToolRunner.run(new Configuration(), new AvroWordCount(), args);
+ System.exit(res);
+ }
+}
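
For readers skimming this diff, what the Map and Reduce classes in AvroWordCount.java compute can be approximated in a single process without Hadoop or Avro. This is a rough sketch under that simplification: WordCountSketch and countWords are hypothetical names, and a sorted in-memory map stands in for the shuffle and for the Avro Pair output.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical illustration; not part of the committed examples.
final class WordCountSketch {

  // Splits each line on whitespace (the "map" step) and sums the
  // occurrences of every token (the "reduce" step).
  static Map<String, Integer> countWords(List<String> lines) {
    Map<String, Integer> counts = new TreeMap<String, Integer>();
    for (String line : lines) {
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        String word = tokenizer.nextToken();
        Integer previous = counts.get(word);
        counts.put(word, previous == null ? 1 : previous + 1);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts =
        countWords(Arrays.asList("to be or", "not to be"));
    // Print one word and its count per line, tab-separated.
    for (Map.Entry<String, Integer> entry : counts.entrySet()) {
      System.out.println(entry.getKey() + "\t" + entry.getValue());
    }
  }
}
```

In the real job each (word, 1) pair is emitted by the mapper and grouped by key before the reducer sums it; the sketch collapses both steps into one loop.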
Added: avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/GenerateData.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/GenerateData.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/GenerateData.java (added)
+++ avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/GenerateData.java Wed May 31 15:18:41 2017
@@ -0,0 +1,57 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.File;
+import java.io.IOException;
+import java.util.Random;
+
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.io.DatumWriter;
+import org.apache.avro.specific.SpecificDatumWriter;
+
+import example.avro.User;
+
+public class GenerateData {
+ public static final String[] COLORS = {"red", "orange", "yellow", "green", "blue", "purple", null};
+ public static final int USERS = 20;
+ public static final String PATH = "./input/users.avro";
+
+ public static void main(String[] args) throws IOException {
+ // Open data file
+ File file = new File(PATH);
+ if (file.getParentFile() != null) {
+ file.getParentFile().mkdirs();
+ }
+ DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
+ DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
+ dataFileWriter.create(User.SCHEMA$, file);
+
+ // Create random users
+ User user;
+ Random random = new Random();
+ for (int i = 0; i < USERS; i++) {
+ user = new User("user", null, COLORS[random.nextInt(COLORS.length)]);
+ dataFileWriter.append(user);
+ System.out.println(user);
+ }
+
+ dataFileWriter.close();
+ }
+}
Added: avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceAvroWordCount.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceAvroWordCount.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceAvroWordCount.java (added)
+++ avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceAvroWordCount.java Wed May 31 15:18:41 2017
@@ -0,0 +1,124 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.IOException;
+import java.util.*;
+
+import org.apache.avro.Schema;
+import org.apache.avro.Schema.Type;
+import org.apache.avro.mapred.AvroWrapper;
+import org.apache.avro.mapred.Pair;
+import org.apache.avro.mapreduce.AvroJob;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.LongWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.Reducer;
+import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
+import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+/**
+ * The classic WordCount example modified to output Avro Pair<CharSequence,
+ * Integer> records instead of text.
+ */
+public class MapReduceAvroWordCount extends Configured implements Tool {
+
+ public static class Map
+ extends Mapper<LongWritable, Text, Text, IntWritable> {
+
+ private final static IntWritable one = new IntWritable(1);
+ private Text word = new Text();
+
+ public void map(LongWritable key, Text value, Context context)
+ throws IOException, InterruptedException {
+ String line = value.toString();
+ StringTokenizer tokenizer = new StringTokenizer(line);
+ while (tokenizer.hasMoreTokens()) {
+ word.set(tokenizer.nextToken());
+ context.write(word, one);
+ }
+ }
+ }
+
+ public static class Reduce
+ extends Reducer<Text, IntWritable,
+ AvroWrapper<Pair<CharSequence, Integer>>, NullWritable> {
+
+ public void reduce(Text key, Iterable<IntWritable> values,
+ Context context)
+ throws IOException, InterruptedException {
+ int sum = 0;
+ for (IntWritable value : values) {
+ sum += value.get();
+ }
+ context.write(new AvroWrapper<Pair<CharSequence, Integer>>
+ (new Pair<CharSequence, Integer>(key.toString(), sum)),
+ NullWritable.get());
+ }
+ }
+
+ public int run(String[] args) throws Exception {
+ if (args.length != 2) {
+ System.err.println("Usage: MapReduceAvroWordCount <input path> <output path>");
+ return -1;
+ }
+
+ Job job = new Job(getConf());
+ job.setJarByClass(MapReduceAvroWordCount.class);
+ job.setJobName("wordcount");
+
+ // We call AvroJob.setOutputKeySchema first so we can override the
+ // configuration parameters it sets
+ AvroJob.setOutputKeySchema(job,
+ Pair.getPairSchema(Schema.create(Type.STRING),
+ Schema.create(Type.INT)));
+ job.setOutputValueClass(NullWritable.class);
+
+ job.setMapperClass(Map.class);
+ job.setReducerClass(Reduce.class);
+
+ job.setInputFormatClass(TextInputFormat.class);
+
+ job.setMapOutputKeyClass(Text.class);
+ job.setMapOutputValueClass(IntWritable.class);
+ job.setSortComparatorClass(Text.Comparator.class);
+
+ FileInputFormat.setInputPaths(job, new Path(args[0]));
+ FileOutputFormat.setOutputPath(job, new Path(args[1]));
+
+ boolean succeeded = job.waitForCompletion(true);
+
+ return succeeded ? 0 : 1;
+ }
+
+ public static void main(String[] args) throws Exception {
+ int res =
+ ToolRunner.run(new Configuration(), new MapReduceAvroWordCount(), args);
+ System.exit(res);
+ }
+}
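
Reviewer note on MapReduceAvroWordCount.java: the Map and Reduce classes above split each line on whitespace, emit (word, 1), and sum the ones per word. The sketch below reproduces only that counting logic in plain Java, assuming no Hadoop or Avro on the classpath. WordCountSketch and countWords are hypothetical names used for illustration; they are not part of this commit or of the Avro API.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class WordCountSketch {

  // Stand-in for the Map and Reduce steps: tokenize each line on whitespace
  // (as StringTokenizer does in the mapper) and add 1 per token to its total.
  public static Map<String, Integer> countWords(Iterable<String> lines) {
    // TreeMap keeps the words sorted; an assumption made here only to give
    // the sketch a deterministic iteration order.
    Map<String, Integer> counts = new TreeMap<>();
    for (String line : lines) {
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        counts.merge(tokenizer.nextToken(), 1, Integer::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    Map<String, Integer> counts =
        countWords(Arrays.asList("to be or not", "to be"));
    System.out.println(counts);
  }
}
```

In the real job each (word, sum) pair is wrapped in a Pair<CharSequence, Integer> and written through AvroWrapper; the sketch stops at the in-memory map.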
Added: avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceColorCount.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceColorCount.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceColorCount.java (added)
+++ avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapReduceColorCount.java Wed May 31 15:18:41 2017
@@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.IOException;
+
+import org.apache.avro.Schema;
+import org.apache.avro.mapred.AvroKey;
+import org.apache.avro.mapred.AvroValue;
+import org.apache.avro.mapreduce.AvroJob;
+import org.apache.avro.mapreduce.AvroKeyInputFormat;
+import org.apache.avro.mapreduce.AvroKeyValueOutputFormat;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.io.IntWritable;
+import org.apache.hadoop.io.NullWritable;
+import org.apache.hadoop.io.Text;
+import org.apache.hadoop.mapreduce.Job;
+import org.apache.hadoop.mapreduce.Mapper;
+import org.apache.hadoop.mapreduce.Reducer;
+import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
+import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import example.avro.User;
+
+public class MapReduceColorCount extends Configured implements Tool {
+
+ public static class ColorCountMapper extends
+ Mapper<AvroKey<User>, NullWritable, Text, IntWritable> {
+
+ @Override
+ public void map(AvroKey<User> key, NullWritable value, Context context)
+ throws IOException, InterruptedException {
+
+ CharSequence color = key.datum().getFavoriteColor();
+ if (color == null) {
+ color = "none";
+ }
+ context.write(new Text(color.toString()), new IntWritable(1));
+ }
+ }
+
+ public static class ColorCountReducer extends
+ Reducer<Text, IntWritable, AvroKey<CharSequence>, AvroValue<Integer>> {
+
+ @Override
+ public void reduce(Text key, Iterable<IntWritable> values,
+ Context context) throws IOException, InterruptedException {
+
+ int sum = 0;
+ for (IntWritable value : values) {
+ sum += value.get();
+ }
+ context.write(new AvroKey<CharSequence>(key.toString()), new AvroValue<Integer>(sum));
+ }
+ }
+
+ public int run(String[] args) throws Exception {
+ if (args.length != 2) {
+ System.err.println("Usage: MapReduceColorCount <input path> <output path>");
+ return -1;
+ }
+
+ Job job = new Job(getConf());
+ job.setJarByClass(MapReduceColorCount.class);
+ job.setJobName("Color Count");
+
+ FileInputFormat.setInputPaths(job, new Path(args[0]));
+ FileOutputFormat.setOutputPath(job, new Path(args[1]));
+
+ job.setInputFormatClass(AvroKeyInputFormat.class);
+ job.setMapperClass(ColorCountMapper.class);
+ AvroJob.setInputKeySchema(job, User.getClassSchema());
+ job.setMapOutputKeyClass(Text.class);
+ job.setMapOutputValueClass(IntWritable.class);
+
+ job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
+ job.setReducerClass(ColorCountReducer.class);
+ AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
+ AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));
+
+ return (job.waitForCompletion(true) ? 0 : 1);
+ }
+
+ public static void main(String[] args) throws Exception {
+ int res = ToolRunner.run(new MapReduceColorCount(), args);
+ System.exit(res);
+ }
+}
Added: avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapredColorCount.java
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapredColorCount.java?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapredColorCount.java (added)
+++ avro/site/publish/docs/1.8.2/examples/mr-example/src/main/java/example/MapredColorCount.java Wed May 31 15:18:41 2017
@@ -0,0 +1,93 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package example;
+
+import java.io.IOException;
+
+import org.apache.avro.*;
+import org.apache.avro.Schema.Type;
+import org.apache.avro.mapred.*;
+import org.apache.hadoop.conf.*;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.mapred.*;
+import org.apache.hadoop.util.*;
+
+import example.avro.User;
+
+public class MapredColorCount extends Configured implements Tool {
+
+ public static class ColorCountMapper extends AvroMapper<User, Pair<CharSequence, Integer>> {
+ @Override
+ public void map(User user, AvroCollector<Pair<CharSequence, Integer>> collector, Reporter reporter)
+ throws IOException {
+ CharSequence color = user.getFavoriteColor();
+ // We need this check because the User.favorite_color field has type ["string", "null"]
+ if (color == null) {
+ color = "none";
+ }
+ collector.collect(new Pair<CharSequence, Integer>(color, 1));
+ }
+ }
+
+ public static class ColorCountReducer extends AvroReducer<CharSequence, Integer,
+ Pair<CharSequence, Integer>> {
+ @Override
+ public void reduce(CharSequence key, Iterable<Integer> values,
+ AvroCollector<Pair<CharSequence, Integer>> collector,
+ Reporter reporter)
+ throws IOException {
+ int sum = 0;
+ for (Integer value : values) {
+ sum += value;
+ }
+ collector.collect(new Pair<CharSequence, Integer>(key, sum));
+ }
+ }
+
+ public int run(String[] args) throws Exception {
+ if (args.length != 2) {
+ System.err.println("Usage: MapredColorCount <input path> <output path>");
+ return -1;
+ }
+
+ JobConf conf = new JobConf(getConf(), MapredColorCount.class);
+ conf.setJobName("colorcount");
+
+ FileInputFormat.setInputPaths(conf, new Path(args[0]));
+ FileOutputFormat.setOutputPath(conf, new Path(args[1]));
+
+ AvroJob.setMapperClass(conf, ColorCountMapper.class);
+ AvroJob.setReducerClass(conf, ColorCountReducer.class);
+
+ // Note that AvroJob.setInputSchema and AvroJob.setOutputSchema set
+ // relevant config options such as input/output format, map output
+ // classes, and output key class.
+ AvroJob.setInputSchema(conf, User.getClassSchema());
+ AvroJob.setOutputSchema(conf, Pair.getPairSchema(Schema.create(Type.STRING),
+ Schema.create(Type.INT)));
+
+ JobClient.runJob(conf);
+ return 0;
+ }
+
+ public static void main(String[] args) throws Exception {
+ int res = ToolRunner.run(new Configuration(), new MapredColorCount(), args);
+ System.exit(res);
+ }
+}
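
Reviewer note on the color-count examples: both mappers substitute "none" when getFavoriteColor() returns null, because favorite_color is declared as ["string", "null"]. A minimal sketch of that null handling and the per-color sum, assuming plain Java collections instead of Avro records. ColorCountSketch and countColors are hypothetical names, not part of this commit.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ColorCountSketch {

  // Each list element stands in for one User's favorite_color value, which
  // may be null because the schema declares the field as ["string", "null"].
  public static Map<String, Integer> countColors(
      List<? extends CharSequence> favoriteColors) {
    Map<String, Integer> counts = new TreeMap<>();
    for (CharSequence color : favoriteColors) {
      // Same substitution the mappers perform before emitting (color, 1).
      String key = (color == null) ? "none" : color.toString();
      counts.merge(key, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> colors = Arrays.asList(null, "red", "blue", "red");
    System.out.println(countColors(colors));
  }
}
```

Without the null check, the mapper would fail when it calls color.toString() (MapReduceColorCount) or would collect a null key (MapredColorCount).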
Added: avro/site/publish/docs/1.8.2/examples/user.avsc
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/examples/user.avsc?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/examples/user.avsc (added)
+++ avro/site/publish/docs/1.8.2/examples/user.avsc Wed May 31 15:18:41 2017
@@ -0,0 +1,9 @@
+{"namespace": "example.avro",
+ "type": "record",
+ "name": "User",
+ "fields": [
+ {"name": "name", "type": "string"},
+ {"name": "favorite_number", "type": ["int", "null"]},
+ {"name": "favorite_color", "type": ["string", "null"]}
+ ]
+}
Added: avro/site/publish/docs/1.8.2/gettingstartedjava.html
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/gettingstartedjava.html?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/gettingstartedjava.html (added)
+++ avro/site/publish/docs/1.8.2/gettingstartedjava.html Wed May 31 15:18:41 2017
@@ -0,0 +1,694 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
+<meta content="Apache Forrest" name="Generator">
+<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-skin-name" content="pelt">
+<title>Apache Avro™ 1.8.2
+ Getting Started (Java)</title>
+<link type="text/css" href="skin/basic.css" rel="stylesheet">
+<link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet">
+<link media="print" type="text/css" href="skin/print.css" rel="stylesheet">
+<link type="text/css" href="skin/profile.css" rel="stylesheet">
+<script src="skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="skin/fontsize.js" language="javascript" type="text/javascript"></script>
+<link rel="shortcut icon" href="images/favicon.ico">
+</head>
+<body onload="init()">
+<script type="text/javascript">ndeSetTextSize();</script>
+<div id="top">
+<!--+
+ |breadtrail
+ +-->
+<div class="breadtrail">
+<a href="http://www.apache.org/">Apache</a> > <a href="http://avro.apache.org/">Avro</a> > <a href="http://avro.apache.org/">Avro</a><script src="skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script>
+</div>
+<!--+
+ |header
+ +-->
+<div class="header">
+<!--+
+ |start group logo
+ +-->
+<div class="grouplogo">
+<a href="http://www.apache.org/"><img class="logoImage" alt="Apache" src="images/apache_feather.gif" title="The Apache Software Foundation"></a>
+</div>
+<!--+
+ |end group logo
+ +-->
+<!--+
+ |start Project Logo
+ +-->
+<div class="projectlogo">
+<a href="http://avro.apache.org/"><img class="logoImage" alt="Avro" src="images/avro-logo.png" title="Serialization System"></a>
+</div>
+<!--+
+ |end Project Logo
+ +-->
+<!--+
+ |start Search
+ +-->
+<div class="searchbox">
+<form action="http://www.google.com/search" method="get" class="roundtopsmall">
+<input value="avro.apache.org" name="sitesearch" type="hidden"><input onFocus="getBlank (this, 'Search the site with google');" size="25" name="q" id="query" type="text" value="Search the site with google">
+ <input name="Search" value="Search" type="submit">
+</form>
+</div>
+<!--+
+ |end search
+ +-->
+<!--+
+ |start Tabs
+ +-->
+<ul id="tabs">
+<li>
+<a class="unselected" href="http://avro.apache.org/">Project</a>
+</li>
+<li>
+<a class="unselected" href="http://wiki.apache.org/hadoop/Avro/">Wiki</a>
+</li>
+<li class="current">
+<a class="selected" href="index.html">Avro 1.8.2 Documentation</a>
+</li>
+</ul>
+<!--+
+ |end Tabs
+ +-->
+</div>
+</div>
+<div id="main">
+<div id="publishedStrip">
+<!--+
+ |start Subtabs
+ +-->
+<div id="level2tabs"></div>
+<!--+
+ |end Endtabs
+ +-->
+<script type="text/javascript"><!--
+document.write("Last Published: " + document.lastModified);
+// --></script>
+</div>
+<!--+
+ |breadtrail
+ +-->
+<div class="breadtrail">
+
+
+ </div>
+<!--+
+ |start Menu, mainarea
+ +-->
+<!--+
+ |start Menu
+ +-->
+<div id="menu">
+<div onclick="SwitchMenu('menu_selected_1.1', 'skin/')" id="menu_selected_1.1Title" class="menutitle" style="background-image: url('skin/images/chapter_open.gif');">Documentation</div>
+<div id="menu_selected_1.1" class="selectedmenuitemgroup" style="display: block;">
+<div class="menuitem">
+<a href="index.html">Overview</a>
+</div>
+<div class="menupage">
+<div class="menupagetitle">Getting started (Java)</div>
+</div>
+<div class="menuitem">
+<a href="gettingstartedpython.html">Getting started (Python)</a>
+</div>
+<div class="menuitem">
+<a href="spec.html">Specification</a>
+</div>
+<div class="menuitem">
+<a href="trevni/spec.html">Trevni</a>
+</div>
+<div class="menuitem">
+<a href="api/java/index.html">Java API</a>
+</div>
+<div class="menuitem">
+<a href="api/c/index.html">C API</a>
+</div>
+<div class="menuitem">
+<a href="api/cpp/html/index.html">C++ API</a>
+</div>
+<div class="menuitem">
+<a href="api/csharp/index.html">C# API</a>
+</div>
+<div class="menuitem">
+<a href="mr.html">MapReduce guide</a>
+</div>
+<div class="menuitem">
+<a href="idl.html">IDL language</a>
+</div>
+<div class="menuitem">
+<a href="sasl.html">SASL profile</a>
+</div>
+<div class="menuitem">
+<a href="http://wiki.apache.org/hadoop/Avro/">Wiki</a>
+</div>
+<div class="menuitem">
+<a href="http://wiki.apache.org/hadoop/Avro/FAQ">FAQ</a>
+</div>
+</div>
+<div id="credit"></div>
+<div id="roundbottom">
+<img style="display: none" class="corner" height="15" width="15" alt="" src="skin/images/rc-b-l-15-1body-2menu-3menu.png"></div>
+<!--+
+ |alternative credits
+ +-->
+<div id="credit2"></div>
+</div>
+<!--+
+ |end Menu
+ +-->
+<!--+
+ |start content
+ +-->
+<div id="content">
+<div title="Portable Document Format" class="pdflink">
+<a class="dida" href="gettingstartedjava.pdf"><img alt="PDF -icon" src="skin/images/pdfdoc.gif" class="skin"><br>
+ PDF</a>
+</div>
+<h1>Apache Avro™ 1.8.2
+ Getting Started (Java)</h1>
+<div id="front-matter">
+<div id="minitoc-area">
+<ul class="minitoc">
+<li>
+<a href="#download_install">Download</a>
+</li>
+<li>
+<a href="#Defining+a+schema">Defining a schema</a>
+</li>
+<li>
+<a href="#Serializing+and+deserializing+with+code+generation">Serializing and deserializing with code generation</a>
+<ul class="minitoc">
+<li>
+<a href="#Compiling+the+schema">Compiling the schema</a>
+</li>
+<li>
+<a href="#Creating+Users">Creating Users</a>
+</li>
+<li>
+<a href="#Serializing">Serializing</a>
+</li>
+<li>
+<a href="#Deserializing">Deserializing</a>
+</li>
+<li>
+<a href="#Compiling+and+running+the+example+code">Compiling and running the example code</a>
+</li>
+</ul>
+</li>
+<li>
+<a href="#Serializing+and+deserializing+without+code+generation">Serializing and deserializing without code generation</a>
+<ul class="minitoc">
+<li>
+<a href="#Creating+users">Creating users</a>
+</li>
+<li>
+<a href="#Serializing-N101DE">Serializing</a>
+</li>
+<li>
+<a href="#Deserializing-N10207">Deserializing</a>
+</li>
+<li>
+<a href="#Compiling+and+running+the+example+code-N10249">Compiling and running the example code</a>
+</li>
+</ul>
+</li>
+</ul>
+</div>
+</div>
+
+<p>
+ This is a short guide for getting started with Apache Avro™ using
+ Java. This guide only covers using Avro for data serialization; see
+ Patrick Hunt's <a href="https://github.com/phunt/avro-rpc-quickstart">Avro
+ RPC Quick Start</a> for a good introduction to using Avro for RPC.
+ </p>
+
+<a name="download_install"></a>
+<h2 class="h3">Download</h2>
+<div class="section">
+<p>
+ Avro implementations for C, C++, C#, Java, PHP, Python, and Ruby can be
+ downloaded from the <a href="http://avro.apache.org/releases.html">Apache Avro™
+ Releases</a> page. This guide uses Avro 1.8.2, the latest
+ version at the time of writing. For the examples in this guide,
+ download <em>avro-1.8.2.jar</em> and
+ <em>avro-tools-1.8.2.jar</em>. The Avro Java implementation
+ also depends on the <a href="http://jackson.codehaus.org/">Jackson</a>
+ JSON library. From the Jackson <a href="http://wiki.fasterxml.com/JacksonDownload">download page</a>,
+ download the core-asl and mapper-asl jars. Add
+ <em>avro-1.8.2.jar</em> and the Jackson jars to your project's
+ classpath (avro-tools will be used for code generation).
+ </p>
+<p>
+ Alternatively, if you are using Maven, add the following dependency to
+ your POM:
+ </p>
+<pre class="code">
+<dependency>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro</artifactId>
+ <version>1.8.2</version>
+</dependency>
+ </pre>
+<p>
+ As well as the Avro Maven plugin (for performing code generation):
+ </p>
+<pre class="code">
+<plugin>
+ <groupId>org.apache.avro</groupId>
+ <artifactId>avro-maven-plugin</artifactId>
+ <version>1.8.2</version>
+ <executions>
+ <execution>
+ <phase>generate-sources</phase>
+ <goals>
+ <goal>schema</goal>
+ </goals>
+ <configuration>
+ <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
+ <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
+ </configuration>
+ </execution>
+ </executions>
+</plugin>
+<plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-compiler-plugin</artifactId>
+ <configuration>
+ <source>1.6</source>
+ <target>1.6</target>
+ </configuration>
+</plugin>
+ </pre>
+<p>
+ You may also build the required Avro jars from source. Building Avro is
+ beyond the scope of this guide; see the <a href="https://cwiki.apache.org/AVRO/Build+Documentation">Build
+ Documentation</a> page in the wiki for more information.
+ </p>
+</div>
+
+
+<a name="Defining+a+schema"></a>
+<h2 class="h3">Defining a schema</h2>
+<div class="section">
+<p>
+ Avro schemas are defined using JSON. Schemas are composed of <a href="spec.html#schema_primitive">primitive types</a>
+ (<span class="codefrag">null</span>, <span class="codefrag">boolean</span>, <span class="codefrag">int</span>,
+ <span class="codefrag">long</span>, <span class="codefrag">float</span>, <span class="codefrag">double</span>,
+ <span class="codefrag">bytes</span>, and <span class="codefrag">string</span>) and <a href="spec.html#schema_complex">complex types</a> (<span class="codefrag">record</span>,
+ <span class="codefrag">enum</span>, <span class="codefrag">array</span>, <span class="codefrag">map</span>,
+ <span class="codefrag">union</span>, and <span class="codefrag">fixed</span>). You can learn more about
+ Avro schemas and types from the specification, but for now let's start
+ with a simple schema example, <em>user.avsc</em>:
+ </p>
+<pre class="code">
+{"namespace": "example.avro",
+ "type": "record",
+ "name": "User",
+ "fields": [
+ {"name": "name", "type": "string"},
+ {"name": "favorite_number", "type": ["int", "null"]},
+ {"name": "favorite_color", "type": ["string", "null"]}
+ ]
+}
+ </pre>
+<p>
+ This schema defines a record representing a hypothetical user. (Note
+ that a schema file can only contain a single schema definition.) At
+ minimum, a record definition must include its type (<span class="codefrag">"type":
+ "record"</span>), a name (<span class="codefrag">"name": "User"</span>), and fields, in
+ this case <span class="codefrag">name</span>, <span class="codefrag">favorite_number</span>, and
+ <span class="codefrag">favorite_color</span>. We also define a namespace
+ (<span class="codefrag">"namespace": "example.avro"</span>), which together with the name
+ attribute defines the "full name" of the schema
+ (<span class="codefrag">example.avro.User</span> in this case).
+
+ </p>
+<p>
+ Fields are defined via an array of objects, each of which defines a name
+ and type (other attributes are optional; see the <a href="spec.html#schema_record">record specification</a> for more
+ details). The type attribute of a field is another schema object, which
+ can be either a primitive or complex type. For example, the
+ <span class="codefrag">name</span> field of our User schema is the primitive type
+ <span class="codefrag">string</span>, whereas the <span class="codefrag">favorite_number</span> and
+ <span class="codefrag">favorite_color</span> fields are both <span class="codefrag">union</span>s,
+ represented by JSON arrays. <span class="codefrag">union</span>s are a complex type that
+ can be any of the types listed in the array; e.g.,
+ <span class="codefrag">favorite_number</span> can either be an <span class="codefrag">int</span> or
+ <span class="codefrag">null</span>, essentially making it an optional field.
+ </p>
+</div>
+
+
+<a name="Serializing+and+deserializing+with+code+generation"></a>
+<h2 class="h3">Serializing and deserializing with code generation</h2>
+<div class="section">
+<a name="Compiling+the+schema"></a>
+<h3 class="h4">Compiling the schema</h3>
+<p>
+ Code generation allows us to automatically create classes based on our
+ previously-defined schema. Once we have defined the relevant classes,
+ there is no need to use the schema directly in our programs. We use the
+ avro-tools jar to generate code as follows:
+ </p>
+<pre class="code">
+java -jar /path/to/avro-tools-1.8.2.jar compile schema <schema file> <destination>
+ </pre>
+<p>
+ This will generate the appropriate source files in a package based on
+ the schema's namespace in the provided destination folder. For
+ instance, to generate a <span class="codefrag">User</span> class in package
+ <span class="codefrag">example.avro</span> from the schema defined above, run
+ </p>
+<pre class="code">
+java -jar /path/to/avro-tools-1.8.2.jar compile schema user.avsc .
+ </pre>
+<p>
+ Note that if you are using the Avro Maven plugin, there is no need to
+ manually invoke the schema compiler; the plugin automatically
+ performs code generation on any .avsc files present in the configured
+ source directory.
+ </p>
+<a name="Creating+Users"></a>
+<h3 class="h4">Creating Users</h3>
+<p>
+ Now that we've completed the code generation, let's create some
+ <span class="codefrag">User</span>s, serialize them to a data file on disk, and then
+ read back the file and deserialize the <span class="codefrag">User</span> objects.
+ </p>
+<p>
+ First let's create some <span class="codefrag">User</span>s and set their fields.
+ </p>
+<pre class="code">
+User user1 = new User();
+user1.setName("Alyssa");
+user1.setFavoriteNumber(256);
+// Leave favorite color null
+
+// Alternate constructor
+User user2 = new User("Ben", 7, "red");
+
+// Construct via builder
+User user3 = User.newBuilder()
+ .setName("Charlie")
+ .setFavoriteColor("blue")
+ .setFavoriteNumber(null)
+ .build();
+ </pre>
+<p>
+ As shown in this example, Avro objects can be created either by
+ invoking a constructor directly or by using a builder. Unlike
+ constructors, builders will automatically set any default values
+ specified in the schema. Additionally, builders validate the data as
+ it is set, whereas objects constructed directly will not cause an error
+ until the object is serialized. However, using constructors directly
+ generally offers better performance, as builders create a copy of the
+ data structure before it is written.
+ </p>
+<p>
+ Note that we do not set <span class="codefrag">user1</span>'s favorite color. Since
+ that field is of type <span class="codefrag">["string", "null"]</span>, we can either
+ set it to a <span class="codefrag">string</span> or leave it <span class="codefrag">null</span>; it is
+ essentially optional. Similarly, we set <span class="codefrag">user3</span>'s favorite
+ number to null (using a builder requires setting all fields, even if
+ they are null).
+ </p>
+<a name="Serializing"></a>
+<h3 class="h4">Serializing</h3>
+<p>
+ Now let's serialize our <span class="codefrag">User</span>s to disk.
+ </p>
+<pre class="code">
+// Serialize user1, user2 and user3 to disk
+DatumWriter<User> userDatumWriter = new SpecificDatumWriter<User>(User.class);
+DataFileWriter<User> dataFileWriter = new DataFileWriter<User>(userDatumWriter);
+dataFileWriter.create(user1.getSchema(), new File("users.avro"));
+dataFileWriter.append(user1);
+dataFileWriter.append(user2);
+dataFileWriter.append(user3);
+dataFileWriter.close();
+ </pre>
+<p>
+ We create a <span class="codefrag">DatumWriter</span>, which converts Java objects into
+ an in-memory serialized format. The <span class="codefrag">SpecificDatumWriter</span>
+ class is used with generated classes and extracts the schema from the
+ specified generated type.
+ </p>
+<p>
+ Next we create a <span class="codefrag">DataFileWriter</span>, which writes the
+ serialized records, as well as the schema, to the file specified in the
+ <span class="codefrag">dataFileWriter.create</span> call. We write our users to the file
+ via calls to the <span class="codefrag">dataFileWriter.append</span> method. When we are
+ done writing, we close the data file.
+ </p>
+<a name="Deserializing"></a>
+<h3 class="h4">Deserializing</h3>
+<p>
+ Finally, let's deserialize the data file we just created.
+ </p>
+<pre class="code">
+// Deserialize Users from disk
+DatumReader<User> userDatumReader = new SpecificDatumReader<User>(User.class);
+DataFileReader<User> dataFileReader = new DataFileReader<User>(new File("users.avro"), userDatumReader);
+User user = null;
+while (dataFileReader.hasNext()) {
+  // Reuse user object by passing it to next(). This saves us from
+  // allocating and garbage collecting many objects for files with
+  // many items.
+  user = dataFileReader.next(user);
+  System.out.println(user);
+}
+ </pre>
+<p>
+ This snippet will output:
+ </p>
+<pre class="code">
+{"name": "Alyssa", "favorite_number": 256, "favorite_color": null}
+{"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
+{"name": "Charlie", "favorite_number": null, "favorite_color": "blue"}
+ </pre>
+<p>
+ Deserializing is very similar to serializing. We create a
+ <span class="codefrag">SpecificDatumReader</span>, analogous to the
+ <span class="codefrag">SpecificDatumWriter</span> we used in serialization, which
+ converts in-memory serialized items into instances of our generated
+ class, in this case <span class="codefrag">User</span>. We pass the
+ <span class="codefrag">DatumReader</span> and the previously created <span class="codefrag">File</span>
+ to a <span class="codefrag">DataFileReader</span>, analogous to the
+ <span class="codefrag">DataFileWriter</span>, which reads the data file on disk.
+ </p>
+<p>
+ Next we use the <span class="codefrag">DataFileReader</span> to iterate through the
+ serialized <span class="codefrag">User</span>s and print the deserialized object to
+ stdout. Note how we perform the iteration: we create a single
+ <span class="codefrag">User</span> object in which we store the current
+ deserialized user, and pass this record object to every call of
+ <span class="codefrag">dataFileReader.next</span>. This is a performance optimization
+ that allows the <span class="codefrag">DataFileReader</span> to reuse the same
+ <span class="codefrag">User</span> object rather than allocating a new
+ <span class="codefrag">User</span> for every iteration, which can be very expensive in
+ terms of object allocation and garbage collection if we deserialize a
+ large data file. While this technique is the standard way to iterate
+ through a data file, it's also possible to use <span class="codefrag">for (User user :
+ dataFileReader)</span> if performance is not a concern.
+ </p>
+<a name="Compiling+and+running+the+example+code"></a>
+<h3 class="h4">Compiling and running the example code</h3>
+<p>
+ This example code is included as a Maven project in the
+ <em>examples/java-example</em> directory in the Avro docs. From this
+ directory, execute the following commands to build and run the
+ example:
+ </p>
+<pre class="code">
+$ mvn compile # includes code generation via Avro Maven plugin
+$ mvn -q exec:java -Dexec.mainClass=example.SpecificMain
+ </pre>
+</div>
+
+
+<a name="Serializing+and+deserializing+without+code+generation"></a>
+<h2 class="h3">Serializing and deserializing without code generation</h2>
+<div class="section">
+<p>
+ Data in Avro is always stored with its corresponding schema, meaning we
+ can always read a serialized item regardless of whether we know the
+ schema ahead of time. This allows us to perform serialization and
+ deserialization without code generation.
+ </p>
+<p>
+ Let's go over the same example as in the previous section, but without
+ using code generation: we'll create some users, serialize them to a data
+ file on disk, and then read back the file and deserialize the user
+ objects.
+ </p>
+<a name="Creating+users"></a>
+<h3 class="h4">Creating users</h3>
+<p>
+ First, we use a <span class="codefrag">Parser</span> to read our schema definition and
+ create a <span class="codefrag">Schema</span> object.
+ </p>
+<pre class="code">
+Schema schema = new Schema.Parser().parse(new File("user.avsc"));
+ </pre>
+<p>
+ Using this schema, let's create some users.
+ </p>
+<pre class="code">
+GenericRecord user1 = new GenericData.Record(schema);
+user1.put("name", "Alyssa");
+user1.put("favorite_number", 256);
+// Leave favorite color null
+
+GenericRecord user2 = new GenericData.Record(schema);
+user2.put("name", "Ben");
+user2.put("favorite_number", 7);
+user2.put("favorite_color", "red");
+ </pre>
+<p>
+ Since we're not using code generation, we use
+ <span class="codefrag">GenericRecord</span>s to represent users.
+ <span class="codefrag">GenericRecord</span> uses the schema to verify that we only
+ specify valid fields. If we try to set a non-existent field (e.g.,
+ <span class="codefrag">user1.put("favorite_animal", "cat")</span>), we'll get an
+ <span class="codefrag">AvroRuntimeException</span> when we run the program.
+ </p>
+<p>
+ Note that we do not set <span class="codefrag">user1</span>'s favorite color. Since
+ that field is of type <span class="codefrag">["string", "null"]</span>, we can either
+ set it to a <span class="codefrag">string</span> or leave it <span class="codefrag">null</span>; it is
+ essentially optional.
+ </p>
+<a name="Serializing-N101DE"></a>
+<h3 class="h4">Serializing</h3>
+<p>
+ Now that we've created our user objects, serializing and deserializing
+ them is almost identical to the example above which uses code
+ generation. The main difference is that we use generic instead of
+ specific readers and writers.
+ </p>
+<p>
+ First we'll serialize our users to a data file on disk.
+ </p>
+<pre class="code">
+// Serialize user1 and user2 to disk
+File file = new File("users.avro");
+DatumWriter<GenericRecord> datumWriter = new GenericDatumWriter<GenericRecord>(schema);
+DataFileWriter<GenericRecord> dataFileWriter = new DataFileWriter<GenericRecord>(datumWriter);
+dataFileWriter.create(schema, file);
+dataFileWriter.append(user1);
+dataFileWriter.append(user2);
+dataFileWriter.close();
+ </pre>
+<p>
+ We create a <span class="codefrag">DatumWriter</span>, which converts Java objects into
+ an in-memory serialized format. Since we are not using code
+ generation, we create a <span class="codefrag">GenericDatumWriter</span>. It requires
+ the schema both to determine how to write the
+ <span class="codefrag">GenericRecord</span>s and to verify that all non-nullable fields
+ are present.
+ </p>
+<p>
+ As in the code generation example, we also create a
+ <span class="codefrag">DataFileWriter</span>, which writes the serialized records, as
+ well as the schema, to the file specified in the
+ <span class="codefrag">dataFileWriter.create</span> call. We write our users to the
+ file via calls to the <span class="codefrag">dataFileWriter.append</span> method. When
+ we are done writing, we close the data file.
+ </p>
+<a name="Deserializing-N10207"></a>
+<h3 class="h4">Deserializing</h3>
+<p>
+ Finally, we'll deserialize the data file we just created.
+ </p>
+<pre class="code">
+// Deserialize users from disk
+DatumReader<GenericRecord> datumReader = new GenericDatumReader<GenericRecord>(schema);
+DataFileReader<GenericRecord> dataFileReader = new DataFileReader<GenericRecord>(file, datumReader);
+GenericRecord user = null;
+while (dataFileReader.hasNext()) {
+    // Reuse user object by passing it to next(). This saves us from
+    // allocating and garbage collecting many objects for files with
+    // many items.
+    user = dataFileReader.next(user);
+    System.out.println(user);
+}
+ </pre>
+<p>This outputs:</p>
+<pre class="code">
+{"name": "Alyssa", "favorite_number": 256, "favorite_color": null}
+{"name": "Ben", "favorite_number": 7, "favorite_color": "red"}
+ </pre>
+<p>
+ Deserializing is very similar to serializing. We create a
+ <span class="codefrag">GenericDatumReader</span>, analogous to the
+ <span class="codefrag">GenericDatumWriter</span> we used in serialization, which
+ converts in-memory serialized items into <span class="codefrag">GenericRecords</span>.
+ We pass the <span class="codefrag">DatumReader</span> and the previously created
+ <span class="codefrag">File</span> to a <span class="codefrag">DataFileReader</span>, analogous to the
+ <span class="codefrag">DataFileWriter</span>, which reads the data file on disk.
+ </p>
+<p>
+ Next, we use the <span class="codefrag">DataFileReader</span> to iterate through the
+ serialized users and print the deserialized object to stdout. Note
+ how we perform the iteration: we create a single
+ <span class="codefrag">GenericRecord</span> object in which we store the current
+ deserialized user, and pass this record object to every call of
+ <span class="codefrag">dataFileReader.next</span>. This is a performance optimization
+ that allows the <span class="codefrag">DataFileReader</span> to reuse the same record
+ object rather than allocating a new <span class="codefrag">GenericRecord</span> for
+ every iteration, which can be very expensive in terms of object
+ allocation and garbage collection if we deserialize a large data file.
+ While this technique is the standard way to iterate through a data
+ file, it's also possible to use <span class="codefrag">for (GenericRecord user :
+ dataFileReader)</span> if performance is not a concern.
+ </p>
+<a name="Compiling+and+running+the+example+code-N10249"></a>
+<h3 class="h4">Compiling and running the example code</h3>
+<p>
+ This example code is included as a Maven project in the
+ <em>examples/java-example</em> directory in the Avro docs. From this
+ directory, execute the following commands to build and run the
+ example:
+ </p>
+<pre class="code">
+$ mvn compile
+$ mvn -q exec:java -Dexec.mainClass=example.GenericMain
+ </pre>
+</div>
+
+</div>
+<!--+
+ |end content
+ +-->
+<div class="clearboth"> </div>
+</div>
+<div id="footer">
+<!--+
+ |start bottomstrip
+ +-->
+<div class="lastmodified">
+<script type="text/javascript"><!--
+document.write("Last Published: " + document.lastModified);
+// --></script>
+</div>
+<div class="copyright">
+ Copyright ©
+ 2012 <a href="http://www.apache.org/licenses/">The Apache Software Foundation.</a>
+</div>
+<!--+
+ |end bottomstrip
+ +-->
+</div>
+</body>
+</html>
Added: avro/site/publish/docs/1.8.2/gettingstartedjava.pdf
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/gettingstartedjava.pdf?rev=1797058&view=auto
==============================================================================
Binary file - no diff available.
Propchange: avro/site/publish/docs/1.8.2/gettingstartedjava.pdf
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added: avro/site/publish/docs/1.8.2/gettingstartedpython.html
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/gettingstartedpython.html?rev=1797058&view=auto
==============================================================================
--- avro/site/publish/docs/1.8.2/gettingstartedpython.html (added)
+++ avro/site/publish/docs/1.8.2/gettingstartedpython.html Wed May 31 15:18:41 2017
@@ -0,0 +1,423 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
+<meta content="Apache Forrest" name="Generator">
+<meta name="Forrest-version" content="0.9">
+<meta name="Forrest-skin-name" content="pelt">
+<title>Apache Avro™ 1.8.2
+ Getting Started (Python)</title>
+<link type="text/css" href="skin/basic.css" rel="stylesheet">
+<link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet">
+<link media="print" type="text/css" href="skin/print.css" rel="stylesheet">
+<link type="text/css" href="skin/profile.css" rel="stylesheet">
+<script src="skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="skin/fontsize.js" language="javascript" type="text/javascript"></script>
+<link rel="shortcut icon" href="images/favicon.ico">
+</head>
+<body onload="init()">
+<script type="text/javascript">ndeSetTextSize();</script>
+<div id="top">
+<!--+
+ |breadtrail
+ +-->
+<div class="breadtrail">
+<a href="http://www.apache.org/">Apache</a> > <a href="http://avro.apache.org/">Avro</a> > <a href="http://avro.apache.org/">Avro</a><script src="skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script>
+</div>
+<!--+
+ |header
+ +-->
+<div class="header">
+<!--+
+ |start group logo
+ +-->
+<div class="grouplogo">
+<a href="http://www.apache.org/"><img class="logoImage" alt="Apache" src="images/apache_feather.gif" title="The Apache Software Foundation"></a>
+</div>
+<!--+
+ |end group logo
+ +-->
+<!--+
+ |start Project Logo
+ +-->
+<div class="projectlogo">
+<a href="http://avro.apache.org/"><img class="logoImage" alt="Avro" src="images/avro-logo.png" title="Serialization System"></a>
+</div>
+<!--+
+ |end Project Logo
+ +-->
+<!--+
+ |start Search
+ +-->
+<div class="searchbox">
+<form action="http://www.google.com/search" method="get" class="roundtopsmall">
+<input value="avro.apache.org" name="sitesearch" type="hidden"><input onFocus="getBlank (this, 'Search the site with google');" size="25" name="q" id="query" type="text" value="Search the site with google">
+ <input name="Search" value="Search" type="submit">
+</form>
+</div>
+<!--+
+ |end search
+ +-->
+<!--+
+ |start Tabs
+ +-->
+<ul id="tabs">
+<li>
+<a class="unselected" href="http://avro.apache.org/">Project</a>
+</li>
+<li>
+<a class="unselected" href="http://wiki.apache.org/hadoop/Avro/">Wiki</a>
+</li>
+<li class="current">
+<a class="selected" href="index.html">Avro 1.8.2 Documentation</a>
+</li>
+</ul>
+<!--+
+ |end Tabs
+ +-->
+</div>
+</div>
+<div id="main">
+<div id="publishedStrip">
+<!--+
+ |start Subtabs
+ +-->
+<div id="level2tabs"></div>
+<!--+
+ |end Endtabs
+ +-->
+<script type="text/javascript"><!--
+document.write("Last Published: " + document.lastModified);
+// --></script>
+</div>
+<!--+
+ |breadtrail
+ +-->
+<div class="breadtrail">
+
+
+ </div>
+<!--+
+ |start Menu, mainarea
+ +-->
+<!--+
+ |start Menu
+ +-->
+<div id="menu">
+<div onclick="SwitchMenu('menu_selected_1.1', 'skin/')" id="menu_selected_1.1Title" class="menutitle" style="background-image: url('skin/images/chapter_open.gif');">Documentation</div>
+<div id="menu_selected_1.1" class="selectedmenuitemgroup" style="display: block;">
+<div class="menuitem">
+<a href="index.html">Overview</a>
+</div>
+<div class="menuitem">
+<a href="gettingstartedjava.html">Getting started (Java)</a>
+</div>
+<div class="menupage">
+<div class="menupagetitle">Getting started (Python)</div>
+</div>
+<div class="menuitem">
+<a href="spec.html">Specification</a>
+</div>
+<div class="menuitem">
+<a href="trevni/spec.html">Trevni</a>
+</div>
+<div class="menuitem">
+<a href="api/java/index.html">Java API</a>
+</div>
+<div class="menuitem">
+<a href="api/c/index.html">C API</a>
+</div>
+<div class="menuitem">
+<a href="api/cpp/html/index.html">C++ API</a>
+</div>
+<div class="menuitem">
+<a href="api/csharp/index.html">C# API</a>
+</div>
+<div class="menuitem">
+<a href="mr.html">MapReduce guide</a>
+</div>
+<div class="menuitem">
+<a href="idl.html">IDL language</a>
+</div>
+<div class="menuitem">
+<a href="sasl.html">SASL profile</a>
+</div>
+<div class="menuitem">
+<a href="http://wiki.apache.org/hadoop/Avro/">Wiki</a>
+</div>
+<div class="menuitem">
+<a href="http://wiki.apache.org/hadoop/Avro/FAQ">FAQ</a>
+</div>
+</div>
+<div id="credit"></div>
+<div id="roundbottom">
+<img style="display: none" class="corner" height="15" width="15" alt="" src="skin/images/rc-b-l-15-1body-2menu-3menu.png"></div>
+<!--+
+ |alternative credits
+ +-->
+<div id="credit2"></div>
+</div>
+<!--+
+ |end Menu
+ +-->
+<!--+
+ |start content
+ +-->
+<div id="content">
+<div title="Portable Document Format" class="pdflink">
+<a class="dida" href="gettingstartedpython.pdf"><img alt="PDF -icon" src="skin/images/pdfdoc.gif" class="skin"><br>
+ PDF</a>
+</div>
+<h1>Apache Avro™ 1.8.2
+ Getting Started (Python)</h1>
+<div id="front-matter">
+<div id="minitoc-area">
+<ul class="minitoc">
+<li>
+<a href="#download_install">Download</a>
+</li>
+<li>
+<a href="#Defining+a+schema">Defining a schema</a>
+</li>
+<li>
+<a href="#Serializing+and+deserializing+without+code+generation">Serializing and deserializing without code generation</a>
+</li>
+</ul>
+</div>
+</div>
+
+<p>
+ This is a short guide for getting started with Apache Avro™ using
+ Python. This guide only covers using Avro for data serialization; see
+ Patrick Hunt's <a href="https://github.com/phunt/avro-rpc-quickstart">Avro
+ RPC Quick Start</a> for a good introduction to using Avro for RPC.
+ </p>
+
+
+<a name="download_install"></a>
+<h2 class="h3">Download</h2>
+<div class="section">
+<p>
+ Avro implementations for C, C++, C#, Java, PHP, Python, and Ruby can be
+ downloaded from the <a href="http://avro.apache.org/releases.html">Apache Avro™
+ Releases</a> page. This guide uses Avro 1.8.2, the latest
+ version at the time of writing. Download and unpack
+ <em>avro-1.8.2.tar.gz</em>, and install via <span class="codefrag">python
+ setup.py install</span> (this will probably require root privileges). Ensure
+ that you can <span class="codefrag">import avro</span> from a Python prompt.
+ </p>
+<pre class="code">
+$ tar xvf avro-1.8.2.tar.gz
+$ cd avro-1.8.2
+$ sudo python setup.py install
+$ python
+>>> import avro # should not raise ImportError
+ </pre>
+<p>
+ Alternatively, you may build the Avro Python library from source. From
+ the root Avro directory, run the following commands:
+ </p>
+<pre class="code">
+$ cd lang/py/
+$ ant
+$ sudo python setup.py install
+$ python
+>>> import avro # should not raise ImportError
+ </pre>
+</div>
+
+
+<a name="Defining+a+schema"></a>
+<h2 class="h3">Defining a schema</h2>
+<div class="section">
+<p>
+ Avro schemas are defined using JSON. Schemas are composed of <a href="spec.html#schema_primitive">primitive types</a>
+ (<span class="codefrag">null</span>, <span class="codefrag">boolean</span>, <span class="codefrag">int</span>,
+ <span class="codefrag">long</span>, <span class="codefrag">float</span>, <span class="codefrag">double</span>,
+ <span class="codefrag">bytes</span>, and <span class="codefrag">string</span>) and <a href="spec.html#schema_complex">complex types</a> (<span class="codefrag">record</span>,
+ <span class="codefrag">enum</span>, <span class="codefrag">array</span>, <span class="codefrag">map</span>,
+ <span class="codefrag">union</span>, and <span class="codefrag">fixed</span>). You can learn more about
+ Avro schemas and types from the specification, but for now let's start
+ with a simple schema example, <em>user.avsc</em>:
+ </p>
+<pre class="code">
+{"namespace": "example.avro",
+ "type": "record",
+ "name": "User",
+ "fields": [
+ {"name": "name", "type": "string"},
+ {"name": "favorite_number", "type": ["int", "null"]},
+ {"name": "favorite_color", "type": ["string", "null"]}
+ ]
+}
+ </pre>
+<p>
+ This schema defines a record representing a hypothetical user. (Note
+ that a schema file can only contain a single schema definition.) At
+ minimum, a record definition must include its type (<span class="codefrag">"type":
+ "record"</span>), a name (<span class="codefrag">"name": "User"</span>), and fields, in
+ this case <span class="codefrag">name</span>, <span class="codefrag">favorite_number</span>, and
+ <span class="codefrag">favorite_color</span>. We also define a namespace
+ (<span class="codefrag">"namespace": "example.avro"</span>), which together with the name
+ attribute defines the "full name" of the schema
+ (<span class="codefrag">example.avro.User</span> in this case).
+
+ </p>
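+<p>
+ As a small illustration of how the namespace and name combine into a full
+ name, the sketch below reads the schema with Python's standard
+ <span class="codefrag">json</span> module rather than with Avro. The helper
+ <span class="codefrag">full_name</span> and the constant
+ <span class="codefrag">USER_SCHEMA_JSON</span> are names invented for this example
+ and are not part of the Avro library.
+ </p>

```python
import json

# The text of user.avsc, inlined so that the sketch is self-contained.
USER_SCHEMA_JSON = """
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
"""


def full_name(schema):
    """Join namespace and name with a dot (hypothetical helper, not an Avro API)."""
    namespace = schema.get("namespace")
    name = schema["name"]
    # A name that already contains a dot is itself a full name, and a
    # schema without a namespace is identified by its name alone.
    if "." in name or not namespace:
        return name
    return namespace + "." + name


schema = json.loads(USER_SCHEMA_JSON)
field_names = [field["name"] for field in schema["fields"]]

print(full_name(schema))  # prints example.avro.User
print(field_names)
```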
+<p>
+ Fields are defined via an array of objects, each of which defines a name
+ and type (other attributes are optional, see the <a href="spec.html#schema_record">record specification</a> for more
+ details). The type attribute of a field is another schema object, which
+ can be either a primitive or complex type. For example, the
+ <span class="codefrag">name</span> field of our User schema is the primitive type
+ <span class="codefrag">string</span>, whereas the <span class="codefrag">favorite_number</span> and
+ <span class="codefrag">favorite_color</span> fields are both <span class="codefrag">union</span>s,
+ represented by JSON arrays. <span class="codefrag">union</span>s are a complex type that
+ can be any of the types listed in the array; e.g.,
+ <span class="codefrag">favorite_number</span> can either be an <span class="codefrag">int</span> or
+ <span class="codefrag">null</span>, essentially making it an optional field.
+ </p>
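+<p>
+ To make the "optional field" reading of unions concrete, here is a minimal
+ sketch that uses only the standard <span class="codefrag">json</span> module. It
+ treats a field as optional when its type is a JSON array that lists
+ <span class="codefrag">"null"</span>. The names <span class="codefrag">FIELDS_JSON</span>
+ and <span class="codefrag">is_optional</span> are assumptions made for this
+ example, not Avro APIs, and the check covers only the simple case used in
+ <em>user.avsc</em>.
+ </p>

```python
import json

# The "fields" array from user.avsc, inlined for a self-contained sketch.
FIELDS_JSON = """[
    {"name": "name", "type": "string"},
    {"name": "favorite_number", "type": ["int", "null"]},
    {"name": "favorite_color", "type": ["string", "null"]}
]"""


def is_optional(field_type):
    """Hypothetical helper: True when the type is a union that lists "null"."""
    # In the JSON encoding of a schema, a union is written as an array.
    return isinstance(field_type, list) and "null" in field_type


fields = json.loads(FIELDS_JSON)
optional = [f["name"] for f in fields if is_optional(f["type"])]
required = [f["name"] for f in fields if not is_optional(f["type"])]

print(optional)
print(required)
```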
+</div>
+
+
+<a name="Serializing+and+deserializing+without+code+generation"></a>
+<h2 class="h3">Serializing and deserializing without code generation</h2>
+<div class="section">
+<p>
+ Data in Avro is always stored with its corresponding schema, meaning we
+ can always read a serialized item, regardless of whether we know the
+ schema ahead of time. This allows us to perform serialization and
+ deserialization without code generation. Note that the Avro Python
+ library does not support code generation.
+ </p>
+<p>
+ Try running the following code snippet, which serializes two users to a
+ data file on disk, and then reads back and deserializes the data file:
+ </p>
+<pre class="code">
+import avro.schema
+from avro.datafile import DataFileReader, DataFileWriter
+from avro.io import DatumReader, DatumWriter
+
+schema = avro.schema.parse(open("user.avsc", "rb").read())
+
+writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
+writer.append({"name": "Alyssa", "favorite_number": 256})
+writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
+writer.close()
+
+reader = DataFileReader(open("users.avro", "rb"), DatumReader())
+for user in reader:
+ print user
+reader.close()
+ </pre>
+<p>This outputs:</p>
+<pre class="code">
+{u'favorite_color': None, u'favorite_number': 256, u'name': u'Alyssa'}
+{u'favorite_color': u'red', u'favorite_number': 7, u'name': u'Ben'}
+ </pre>
+<p>
+ Do make sure that you open your files in binary mode (i.e. using the modes
+ <span class="codefrag">wb</span> or <span class="codefrag">rb</span> respectively). Otherwise you might
+ generate corrupt files due to
+ <a href="http://docs.python.org/library/functions.html#open">
+ automatic replacement</a> of newline characters with the
+ platform-specific representations.
+ </p>
+<p>
+ Let's take a closer look at what's going on here.
+ </p>
+<pre class="code">
+schema = avro.schema.parse(open("user.avsc", "rb").read())
+ </pre>
+<p>
+
+<span class="codefrag">avro.schema.parse</span> takes a string containing a JSON schema
+ definition as input and outputs an <span class="codefrag">avro.schema.Schema</span> object
+ (specifically a subclass of <span class="codefrag">Schema</span>, in this case
+ <span class="codefrag">RecordSchema</span>). We're passing in the contents of our
+ user.avsc schema file here.
+ </p>
+<pre class="code">
+writer = DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema)
+ </pre>
+<p>
+ We create a <span class="codefrag">DataFileWriter</span>, which we'll use to write
+ serialized items to a data file on disk. The
+ <span class="codefrag">DataFileWriter</span> constructor takes three arguments:
+ </p>
+<ul>
+
+<li>The file we'll serialize to</li>
+
+<li>A <span class="codefrag">DatumWriter</span>, which is responsible for actually
+ serializing the items to Avro's binary format
+ (<span class="codefrag">DatumWriter</span>s can also be used separately from
+ <span class="codefrag">DataFileWriter</span>s, e.g., to encode individual items
+ for Avro RPC).</li>
+
+<li>The schema we're using. The <span class="codefrag">DataFileWriter</span> needs the
+ schema to store it in the data file, to verify that each item we
+ write is valid, and to determine which fields to
+ write.</li>
+
+</ul>
+<pre class="code">
+writer.append({"name": "Alyssa", "favorite_number": 256})
+writer.append({"name": "Ben", "favorite_number": 7, "favorite_color": "red"})
+ </pre>
+<p>
+ We use <span class="codefrag">DataFileWriter.append</span> to add items to our data
+ file. Avro records are represented as Python <span class="codefrag">dict</span>s.
+ Since the field <span class="codefrag">favorite_color</span> has type <span class="codefrag">["string",
+ "null"]</span>, we are not required to specify this field, as shown in
+ the first append. Were we to omit the required <span class="codefrag">name</span>
+ field, an exception would be raised. Any extra entries in the
+ <span class="codefrag">dict</span> that do not correspond to a field are
+ ignored.
+ </p>
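+<p>
+ The rules in the previous paragraph can be approximated in plain Python. The
+ sketch below is a simplified stand-in and is not the check that the Avro
+ library performs: <span class="codefrag">ACCEPTED</span>,
+ <span class="codefrag">USER_FIELDS</span> and
+ <span class="codefrag">is_valid_user</span> are hypothetical names, and details such
+ as integer range checks are left out. It only shows why a missing optional
+ field passes, a missing <span class="codefrag">name</span> fails, and extra keys go
+ unnoticed.
+ </p>

```python
# String type differs between Python 2 and Python 3.
try:
    string_types = (basestring,)  # Python 2
except NameError:
    string_types = (str,)  # Python 3

# Python types accepted for each Avro type used in user.avsc (simplified).
ACCEPTED = {
    "string": string_types,
    "int": (int,),
    "null": (type(None),),
}

# Field names and their union branches, copied by hand from user.avsc.
USER_FIELDS = [
    ("name", ["string"]),
    ("favorite_number", ["int", "null"]),
    ("favorite_color", ["string", "null"]),
]


def is_valid_user(datum):
    """Hypothetical check: every schema field must hold an acceptable value."""
    for field_name, branches in USER_FIELDS:
        value = datum.get(field_name)  # a missing key reads as None
        if not any(isinstance(value, ACCEPTED[b]) for b in branches):
            return False
    # Keys that are not listed in USER_FIELDS are never inspected.
    return True


print(is_valid_user({"name": "Alyssa", "favorite_number": 256}))
print(is_valid_user({"favorite_number": 7}))
```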
+<pre class="code">
+reader = DataFileReader(open("users.avro", "rb"), DatumReader())
+ </pre>
+<p>
+ We open the file again, this time for reading back from disk. We use
+ a <span class="codefrag">DataFileReader</span> and <span class="codefrag">DatumReader</span> analogous
+ to the <span class="codefrag">DataFileWriter</span> and <span class="codefrag">DatumWriter</span> above.
+ </p>
+<pre class="code">
+for user in reader:
+ print user
+ </pre>
+<p>
+ The <span class="codefrag">DataFileReader</span> is an iterator that returns
+ <span class="codefrag">dict</span>s corresponding to the serialized items.
+ </p>
+</div>
+
+</div>
+<!--+
+ |end content
+ +-->
+<div class="clearboth"> </div>
+</div>
+<div id="footer">
+<!--+
+ |start bottomstrip
+ +-->
+<div class="lastmodified">
+<script type="text/javascript"><!--
+document.write("Last Published: " + document.lastModified);
+// --></script>
+</div>
+<div class="copyright">
+ Copyright ©
+ 2012 <a href="http://www.apache.org/licenses/">The Apache Software Foundation.</a>
+</div>
+<!--+
+ |end bottomstrip
+ +-->
+</div>
+</body>
+</html>
Added: avro/site/publish/docs/1.8.2/gettingstartedpython.pdf
URL: http://svn.apache.org/viewvc/avro/site/publish/docs/1.8.2/gettingstartedpython.pdf?rev=1797058&view=auto
==============================================================================
Binary file - no diff available.