Posted to dev@avro.apache.org by "Ron Bodkin (JIRA)" <ji...@apache.org> on 2010/09/17 19:56:37 UTC

[jira] Created: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Allow DataFileWriteTool to accept schema files as input
-------------------------------------------------------

                 Key: AVRO-670
                 URL: https://issues.apache.org/jira/browse/AVRO-670
             Project: Avro
          Issue Type: Improvement
            Reporter: Ron Bodkin


For non-trivial schemas, it's difficult to pass them inline as a command-line argument. I made a patch that uses two different arguments: instead of passing the schema as the first argument, you would now use either --schema-file file or --schema schema, followed by one other argument (the input JSON file).
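
The argument handling described above could look roughly like the following. This is a minimal, hand-rolled sketch and not the code from the attached patch: the class name SchemaArgSketch, the Parsed holder, and the rule that exactly one of the two options must be given are all assumptions made for illustration, and the real tool may parse its options differently (for example with an option-parsing library).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; NOT the code from the attached patch.
final class SchemaArgSketch {

  /** Schema JSON text plus the single remaining positional argument. */
  static final class Parsed {
    final String schemaJson;
    final String inputFile;

    Parsed(String schemaJson, String inputFile) {
      this.schemaJson = schemaJson;
      this.inputFile = inputFile;
    }
  }

  /** Resolves the schema from --schema or --schema-file and returns it with the input file. */
  static Parsed parse(String[] args) {
    String inline = null;
    String schemaFile = null;
    List<String> positional = new ArrayList<>();

    for (int i = 0; i < args.length; i++) {
      String arg = args[i];
      if (arg.equals("--schema") || arg.equals("--schema-file")) {
        if (i + 1 >= args.length) {
          throw new IllegalArgumentException(arg + " requires a value");
        }
        String value = args[++i]; // consume the option's value
        if (arg.equals("--schema")) {
          inline = value;
        } else {
          schemaFile = value;
        }
      } else {
        positional.add(arg);
      }
    }

    // Assumption: exactly one of the two options must be given.
    if ((inline == null) == (schemaFile == null)) {
      throw new IllegalArgumentException(
          "Specify exactly one of --schema or --schema-file");
    }
    if (positional.size() != 1) {
      throw new IllegalArgumentException("Expected one input JSON file");
    }

    String schemaJson = inline != null ? inline : readFile(schemaFile);
    return new Parsed(schemaJson, positional.get(0));
  }

  /** Reads the whole schema file as UTF-8 text. */
  private static String readFile(String path) {
    try {
      return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With this sketch, the arguments --schema '"string"' in.json and --schema-file schema.avsc in.json would both be accepted, while in.json alone would be rejected (schema.avsc and in.json are placeholder file names).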


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Zeyliger updated AVRO-670:
---------------------------------

          Status: Resolved  (was: Patch Available)
    Hadoop Flags: [Incompatible change, Reviewed]
    Release Note: The "fromjson" tool now requires either a --schema or --schema-file command-line argument to specify the schema.  Previously, the schema was to be specified as the first argument.
        Assignee: Ron Bodkin
      Resolution: Fixed

Hi Ron,

Thanks for your contribution!  (And congratulations on your first contribution to Avro.)  I've committed it.

I made two very minor changes:

I fixed one "checkstyle" warning ("ant test" runs checkstyle):

bq. [checkstyle] /data/6/philip/avro-svn/lang/java/src/java/org/apache/avro/tool/DataFileWriteTool.java:121:69: Redundant throws: 'FileNotFoundException' is subclass of 'IOException'.
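
For readers unfamiliar with this check, here is a small sketch of what it complains about. The class and method names are hypothetical and are not taken from DataFileWriteTool; the point is only that declaring both a subclass and its superclass in a throws clause is redundant, since FileNotFoundException extends IOException.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example; not the actual method at DataFileWriteTool.java:121.
final class RedundantThrowsSketch {

  // The flagged form would be:
  //   static InputStream open(String path) throws FileNotFoundException, IOException
  // Declaring IOException alone already covers FileNotFoundException.
  static InputStream open(String path) throws IOException {
    return new FileInputStream(path);
  }

  /** Callers can still catch the more specific exception when they need to. */
  static String describeFailure(String path) {
    try (InputStream in = open(path)) {
      return "opened";
    } catch (FileNotFoundException e) {
      return "not found";
    } catch (IOException e) {
      return "io error";
    }
  }
}
```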

I also added a "--schema" to src/test/bin/test_tools.sh in one place, to fix a test failure.

> Allow DataFileWriteTool to accept schema files as input
> -------------------------------------------------------
>
>                 Key: AVRO-670
>                 URL: https://issues.apache.org/jira/browse/AVRO-670
>             Project: Avro
>          Issue Type: Improvement
>            Reporter: Ron Bodkin
>            Assignee: Ron Bodkin
>             Fix For: 1.5.0
>
>         Attachments: AVRO-670.patch, datafilewritefile.patch
>
>
> For non-trivial schemas, it's difficult to pass them inline as a command line argument. I made a patch to use two different arguments: instead of having the first argument be the schema you would now use -schema-file file or -schema schema and then have one other argument (the input JSON file)



[jira] Updated: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Bodkin updated AVRO-670:
----------------------------

    Description: 
For non-trivial schemas, it's difficult to pass them inline as a command line argument. I made a patch to use two different arguments: instead of having the first argument be the schema you would now use -schema-file file or -schema schema and then have one other argument (the input JSON file)


  was:
For non-trivial schemas, it's difficult to pass them inline as a command line argument. I made a patch to use two different arguments: instead of having the first argument be the schema you would now use --schema-file file or --schema schema and then have one other argument (the input JSON file)





[jira] Updated: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Bodkin updated AVRO-670:
----------------------------

    Attachment: datafilewritefile.patch

The patch



[jira] Commented: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Philip Zeyliger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910712#action_12910712 ] 

Philip Zeyliger commented on AVRO-670:
--------------------------------------

Hi Ron,

The idea and implementation look good.  It's an incompatible change to the tools' command-line API, which I think is fine, but we should note it.

I couldn't get the patch to apply cleanly.  Typically (see https://cwiki.apache.org/AVRO/how-to-contribute.html) patches should be generated at the top level (so lang/java/... should be the path that's being patched).  Many folks name their patches AVRO-670.patch, too, though I'm not a stickler there.  I also recommend turning off Eclipse's auto-import.  I think our style guide discourages star imports (e.g., "+import joptsimple.*;" is not something your patch should have introduced).  Besides that, "patch" gave me a rejects file.  How did you generate your patch?

I was slightly surprised that 'lang/java/src/test/bin/test_tools.sh' (which is run by the ant target 'test-tools') doesn't exercise this code.  Would be good to make sure that the tests for this code don't need any modification.  (There's a java test somewhere, but it might not exercise the command-line parsing code; I haven't looked.)

Could you upload a new patch without the import changes and re-generated against the root of the repo?

Thanks!



[jira] Commented: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12910719#action_12910719 ] 

Ron Bodkin commented on AVRO-670:
---------------------------------

I am working on fixing the one test that actually does exercise this code (the change is indeed incompatible). I'll submit an updated patch that includes that fix and makes fewer changes to the imports. I just used svn diff to generate the patch.



[jira] Updated: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Bodkin updated AVRO-670:
----------------------------

    Attachment: AVRO-670.patch

Here's a revised patch generated from the top level, including updated tests that exercise the new arguments as well.




[jira] Updated: (AVRO-670) Allow DataFileWriteTool to accept schema files as input

Posted by "Ron Bodkin (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ron Bodkin updated AVRO-670:
----------------------------

           Status: Patch Available  (was: Open)
    Fix Version/s: 1.5.0
