Posted to dev@avro.apache.org by "Skye Wanderman-Milne (JIRA)" <ji...@apache.org> on 2012/11/09 19:24:12 UTC

[jira] [Created] (AVRO-1202) Add "Getting started" guide to documentation

Skye Wanderman-Milne created AVRO-1202:
------------------------------------------

             Summary: Add "Getting started" guide to documentation
                 Key: AVRO-1202
                 URL: https://issues.apache.org/jira/browse/AVRO-1202
             Project: Avro
          Issue Type: Improvement
          Components: doc
            Reporter: Skye Wanderman-Milne
             Fix For: 1.7.3


The Avro documentation is currently not very beginner-friendly -- it's hard to figure out basics like code generation, de/serialization, etc. We should write a "Getting started" guide and similar documentation for those new to Avro, as well as include some example code with the docs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494189#comment-13494189 ] 

Doug Cutting commented on AVRO-1202:
------------------------------------

A user guide would be great to have!

Topics might include:
- Concepts
 -- Schemas & Protocols
 -- Generic, Specific & Reflect APIs
 -- IDL
- Command line tools
- Maven plugins
- Reading and writing data files
- Writing MapReduce jobs
- Integration with other projects
 -- Pig
 -- Hive
 -- Flume
- RPC


                

[jira] [Updated] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1202:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I committed this.  Thanks, Skye!
                
> Add "Getting started" guide to documentation
> --------------------------------------------
>
>                 Key: AVRO-1202
>                 URL: https://issues.apache.org/jira/browse/AVRO-1202
>             Project: Avro
>          Issue Type: Improvement
>          Components: doc
>            Reporter: Skye Wanderman-Milne
>            Assignee: Skye Wanderman-Milne
>              Labels: documentation
>             Fix For: 1.7.3
>
>         Attachments: AVRO-1202.0.patch, AVRO-1202.0.tar.gz, AVRO-1202.1.patch, AVRO-1202.1.tar.gz, AVRO-1202.2.patch, AVRO-1202.2.tar.gz.tar.gz, AVRO-1202.patch
>
>
> The Avro documentation is currently not very beginner-friendly -- it's hard to figure out basics like code generation, de/serialization, etc. We should write a "Getting started" guide and similar documentation for those new to Avro, as well as include some example code with the docs.


[jira] [Commented] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Hari Shreedharan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494205#comment-13494205 ] 

Hari Shreedharan commented on AVRO-1202:
----------------------------------------

+1 on this.
                

[jira] [Commented] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Jeff Kolesky (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494407#comment-13494407 ] 

Jeff Kolesky commented on AVRO-1202:
------------------------------------

Also, as it turns out, I was helping a colleague understand reader and writer schemas and realized a code example would be best, so I put some code up on GitHub: https://github.com/opower/avro-by-example

Please feel free to incorporate it in this documentation as you want, or I can make a patch with the example as well.
                

[jira] [Updated] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Skye Wanderman-Milne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated AVRO-1202:
---------------------------------------

    Attachment: AVRO-1202.2.tar.gz.tar.gz
                AVRO-1202.2.patch

I added a builder example to the Java example, and went over the performance vs. convenience tradeoff in the guide. (I also removed some extraneous imports.)

+1 to committing
                

[jira] [Commented] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Jeff Kolesky (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494396#comment-13494396 ] 

Jeff Kolesky commented on AVRO-1202:
------------------------------------

Just a quick question about creating the {{User}} object.  Is it not recommended to use the builder like so:

{code}
User user = User.newBuilder()
    .setName("Alyssa")
    .setFavoriteNumber(256)
    .build();
{code}

I was under the impression that was the correct way to build {{SpecificRecord}} objects that had been generated from schemas so that defaults would be appropriately set and fields appropriately validated.
                

[jira] [Updated] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Skye Wanderman-Milne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated AVRO-1202:
---------------------------------------

    Attachment: AVRO-1202.0.tar.gz
                AVRO-1202.0.patch

Here's what I have so far. I've written rough drafts of "Getting started" pages for Java and Python, and created an "examples" directory containing the code from the guides. So far the guides go over a very basic schema, the Maven plugin, code generation using avro-tools, and basic data file reading/writing.  Personally, I don't think we should add much more material to these pages -- I like the idea of a short, self-contained example that just gets you up and running (a "5-minute intro"). I do think we should add (and link to) other pages going over the topics Doug suggests.

I'm just learning Avro myself as I write this guide, so any feedback is appreciated! In particular, I'm not sure how best to include the example code -- I created a Maven project for the Java example in order to give an example POM and to make it easy to build and run, but I could also give code/instructions for downloading the required jars and compiling the code manually.

My immediate next steps are writing an MR page and a more complete schema example.

I've included a patch as well as a .tar.gz containing the new HTML files and the example code directory for convenience.

                

[jira] [Updated] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1202:
-------------------------------

        Assignee: Skye Wanderman-Milne
    Hadoop Flags: Reviewed
          Status: Patch Available  (was: Open)

Unless someone objects, I'll commit this soon.
                

[jira] [Commented] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497384#comment-13497384 ] 

Doug Cutting commented on AVRO-1202:
------------------------------------

My instinct is to commit this as-is, then we can improve it with subsequent patches.  Does anyone object?

As for builder-or-not, the tradeoff is primarily that with the builder one gets error messages sooner, while without it one gets better performance.  Without the builder, the data is validated as the record is serialized.  With the builder, each value is checked as it is set, but a copy of the data structure must be made before it is written, which can impact performance.  Also, a builder will fill in default values automatically from the schema, while the constructor and setters will not.
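
To make the tradeoff concrete, here is a minimal, self-contained sketch. It does not use Avro at all: the User class below is a hand-written, hypothetical stand-in for a generated class, and the favoriteColor field, its "unknown" default, and the validateForWrite() method are assumptions made up for illustration. It only mimics the behavior described above: the builder checks values as they are set and fills in defaults, while the plain constructor defers all checks until write time.

```java
import java.util.Objects;

// Hypothetical stand-in for an Avro-generated record class. This is NOT the
// code Avro generates; it only mimics the behavior described in the comment.
class BuilderTradeoffSketch {

    static final class User {
        static final String DEFAULT_COLOR = "unknown"; // assumed schema default

        String name;             // required, no default
        Integer favoriteNumber;  // optional, may stay null
        String favoriteColor;    // hypothetical field with an assumed default

        // Plain constructor: nothing is checked and no defaults are applied.
        User() {}

        // Stand-in for the check that happens when a record is serialized.
        void validateForWrite() {
            if (name == null) {
                throw new IllegalStateException("name is required");
            }
        }

        static Builder newBuilder() {
            return new Builder();
        }

        static final class Builder {
            private String name;
            private Integer favoriteNumber;
            private String favoriteColor;

            Builder setName(String value) {
                // Each value is checked as it is set, so errors surface here.
                this.name = Objects.requireNonNull(value, "name must not be null");
                return this;
            }

            Builder setFavoriteNumber(int value) {
                this.favoriteNumber = value;
                return this;
            }

            Builder setFavoriteColor(String value) {
                this.favoriteColor = value;
                return this;
            }

            User build() {
                if (name == null) {
                    throw new IllegalStateException("name is required");
                }
                // Copy the builder's state into a new record; this extra
                // copy is the performance cost mentioned in the comment.
                User user = new User();
                user.name = name;
                user.favoriteNumber = favoriteNumber;
                // Unset fields fall back to the (assumed) schema default.
                user.favoriteColor = favoriteColor != null ? favoriteColor : User.DEFAULT_COLOR;
                return user;
            }
        }
    }

    public static void main(String[] args) {
        // Builder path: checked on each set, default filled in by build().
        User built = User.newBuilder().setName("Alyssa").setFavoriteNumber(256).build();
        System.out.println("builder color: " + built.favoriteColor);

        // Constructor path: no default, and the missing name goes unnoticed
        // until the write-time check runs.
        User plain = new User();
        plain.favoriteNumber = 256;
        System.out.println("constructor color: " + plain.favoriteColor);
        try {
            plain.validateForWrite();
        } catch (IllegalStateException e) {
            System.out.println("caught at write time: " + e.getMessage());
        }
    }
}
```

In real Avro code the generated class and its builder come from the schema, so the field names and defaults would follow the schema rather than this sketch.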
                

[jira] [Updated] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Skye Wanderman-Milne (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Skye Wanderman-Milne updated AVRO-1202:
---------------------------------------

    Attachment: AVRO-1202.1.tar.gz
                AVRO-1202.1.patch

Thanks for the feedback, everyone! I added some minor fixes (removing whitespace, adding more formatting tags, etc.). I've also included all the generated docs files, including CSS, in the .tar.gz so you can actually see the formatting.

If someone can verify that using a builder is the preferred way to construct Avro objects, I'll change the examples accordingly.

Jeff, thanks for the code example. I think for the getting started guides we should keep it simple and stick to using one schema, but a separate page going over your example with multiple schemas would be very useful if you want to write one up. (Maybe flesh it out a little to go over how this works, cases where this does/doesn't work, etc. -- I actually don't know much about using multiple schemas myself :))
                

[jira] [Updated] (AVRO-1202) Add "Getting started" guide to documentation

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting updated AVRO-1202:
-------------------------------

    Attachment: AVRO-1202.patch

Skye, this looks great!  I made only one change: I replaced the hardcoded 1.7.2 version with &AvroVersion; so that the current release number is included and we don't have to update the documentation by hand for each release.
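
For readers unfamiliar with XML entities, the following sketch shows the general mechanism using only the JDK's XML parser. It is an illustration under assumptions, not the Avro docs build: the inline DOCTYPE declaration, the `<p>` element, and the `avro-&AvroVersion;.jar` text are hypothetical, and the real documentation source may declare the entity elsewhere. The point is only that the version is defined once and expanded wherever &AvroVersion; appears.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class VersionEntitySketch {

    // Parses a tiny XML document that declares AvroVersion as an internal
    // entity and returns the text with the entity reference expanded.
    static String expand(String version) {
        String xml =
            "<?xml version=\"1.0\"?>\n"
            + "<!DOCTYPE p [ <!ENTITY AvroVersion \"" + version + "\"> ]>\n"
            + "<p>avro-&AvroVersion;.jar</p>"; // hypothetical doc fragment
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            // getTextContent() includes the entity's replacement text.
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sketch document", e);
        }
    }

    public static void main(String[] args) {
        // Changing the version in one place changes every expanded reference.
        System.out.println(expand("1.7.2"));
        System.out.println(expand("1.7.3"));
    }
}
```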
                