Posted to dev@avro.apache.org by "Harsh J Chouraria (JIRA)" <ji...@apache.org> on 2010/10/24 16:52:19 UTC

[jira] Created: (AVRO-682) Expose the DataFile's metadata entirely

Expose the DataFile's metadata entirely
---------------------------------------

                 Key: AVRO-682
                 URL: https://issues.apache.org/jira/browse/AVRO-682
             Project: Avro
          Issue Type: Improvement
          Components: java
    Affects Versions: 1.4.1
         Environment: Linux, Java 1.6
            Reporter: Harsh J Chouraria
            Assignee: Harsh J Chouraria
            Priority: Minor
             Fix For: 1.5.0


Right now, the DataFileReader (actually DataFileStream) only lets one query a file's metadata by key. A user who does not know what metadata is stored in the file has no way to find out, because there is no list or map of everything present. Perhaps we should provide a way for the user to retrieve the full set of metadata keys and then query the values.

Attached is an initial patch that simply exposes the HashMap holding the metadata once the data file reader has been initialized.
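
To make the gap concrete, here is a minimal sketch in plain Java. It does not use Avro's classes: `MetaReaderSketch`, its methods, and the "owner" entry are hypothetical stand-ins that only model a reader whose metadata map is private. With lookup by key alone, the caller must already know the key; a key-listing accessor lets the caller discover the keys first.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a data file reader; not Avro's actual class.
public class MetaReaderSketch {
    private final Map<String, byte[]> meta = new LinkedHashMap<>();

    public MetaReaderSketch(Map<String, byte[]> headerMeta) {
        meta.putAll(headerMeta);
    }

    // Lookup by key: only useful if the caller already knows the key.
    public byte[] getMeta(String key) {
        return meta.get(key);
    }

    // Key discovery: returns a read-only snapshot of all metadata keys.
    public List<String> getMetaKeys() {
        return Collections.unmodifiableList(new ArrayList<>(meta.keySet()));
    }

    // Builds a reader over two made-up metadata entries.
    public static MetaReaderSketch sample() {
        Map<String, byte[]> header = new LinkedHashMap<>();
        header.put("avro.codec", "null".getBytes(StandardCharsets.UTF_8));
        header.put("owner", "example".getBytes(StandardCharsets.UTF_8));
        return new MetaReaderSketch(header);
    }

    public static void main(String[] args) {
        MetaReaderSketch reader = sample();
        // Discover the keys first, then query each value by key.
        for (String key : reader.getMetaKeys()) {
            String value = new String(reader.getMeta(key), StandardCharsets.UTF_8);
            System.out.println(key + " = " + value);
        }
    }
}
```

Returning only the keys, rather than the map itself, keeps the internal map private while still letting callers enumerate what is stored.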

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated AVRO-682:
-----------------------------------

    Attachment: avro.metadata.datafile.r2.diff

Alternative patch that exposes just a list of the metadata keys. I guess the list would also include some reserved Avro data file keys, which prevents me from writing a size or element-comparison test case for the method, since I do not know how many or which reserved keys get added (I think the codec is one).

But the list should be usable, I guess. People could ignore the "avro." keys. Or should we filter those out?
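
If filtering were wanted, a caller could do it on their side. The following is only a sketch under assumptions: `UserKeyFilter` and `userKeys` are hypothetical names, the key list is made up, and the only rule applied is the "avro." prefix mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of Avro.
public class UserKeyFilter {
    // Prefix of the reserved keys discussed above.
    public static final String RESERVED_PREFIX = "avro.";

    // Returns only the keys that do not start with the reserved prefix.
    public static List<String> userKeys(List<String> allKeys) {
        List<String> result = new ArrayList<>();
        for (String key : allKeys) {
            if (!key.startsWith(RESERVED_PREFIX)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up key list mixing reserved and user-defined keys.
        List<String> keys = Arrays.asList("avro.schema", "avro.codec", "owner", "created");
        System.out.println(userKeys(keys)); // prints [owner, created]
    }
}
```

A test that does not know the full set of reserved keys could assert on a filtered list like this, or check only that the expected user keys are contained in the full list.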


[jira] Updated: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated AVRO-682:
-----------------------------------

    Attachment: avro.metadata.datafile.r1.diff

Patch that adds a method so that the entire metadata map can be retrieved via DataFileReader.


[jira] Commented: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924643#action_12924643 ] 

Doug Cutting commented on AVRO-682:
-----------------------------------

This looks great.  One addition:  can we make the list unmodifiable?  After the values are all added, we can call
{code} 
  metaKeyList = Collections.unmodifiableList(metaKeyList);
{code}
Also, we should add some tests that call this new method.
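
As a standalone illustration of what that wrapping buys, here is a small sketch. `UnmodifiableKeysSketch`, `buildKeys`, and `rejectsAdd` are hypothetical names and the two keys are just sample data; only `Collections.unmodifiableList` comes from the suggestion above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of the patch.
public class UnmodifiableKeysSketch {
    // Builds the key list, then wraps it so callers cannot change it.
    public static List<String> buildKeys() {
        List<String> metaKeyList = new ArrayList<>();
        metaKeyList.add("avro.schema");
        metaKeyList.add("avro.codec");
        // After the values are all added, swap in a read-only view.
        metaKeyList = Collections.unmodifiableList(metaKeyList);
        return metaKeyList;
    }

    // Returns true if the list rejects an attempted mutation.
    public static boolean rejectsAdd(List<String> keys) {
        try {
            keys.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> keys = buildKeys();
        System.out.println(keys.size());      // prints 2
        System.out.println(rejectsAdd(keys)); // prints true
    }
}
```

The wrapper is a view rather than a copy, so it is only safe if the original mutable reference is dropped, as the reassignment does here.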


[jira] Updated: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated AVRO-682:
-----------------------------------

    Attachment: avro.metadata.datafile.r3.diff

Updated patch that makes the key list unmodifiable, plus one simple test case in TestDataFileMeta that checks whether the key list contains the key whose value is sought.

Any more tests required?


[jira] Updated: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Harsh J Chouraria (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J Chouraria updated AVRO-682:
-----------------------------------

    Attachment: avro.metadata.datafile.r1.diff

Oops, bad doc-comment. Fixed in this re-up.


[jira] Resolved: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Doug Cutting resolved AVRO-682.
-------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

I just committed this.  Thanks, Harsh.


[jira] Commented: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924644#action_12924644 ] 

Doug Cutting commented on AVRO-682:
-----------------------------------

Also, I think it's fine to expose the "avro." keys.


[jira] Commented: (AVRO-682) Expose the DataFile's metadata entirely

Posted by "Patrick Linehan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/AVRO-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925138#action_12925138 ] 

Patrick Linehan commented on AVRO-682:
--------------------------------------

Thanks, guys!  Looks great.
