Posted to commits@beam.apache.org by "Anton Kedin (JIRA)" <ji...@apache.org> on 2018/01/31 06:50:00 UTC
[jira] [Updated] (BEAM-3574) [SQL] Support schema qualifiers for field names

     [ https://issues.apache.org/jira/browse/BEAM-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anton Kedin updated BEAM-3574:
------------------------------
    Description: 
Currently there are utility methods in BeamRecord to get field values by name, e.g. BeamRecord.getFieldValue(String name). Internally they call fieldNamesArrayList.indexOf(fieldName) to find the index of the field name.

This works as long as there is only one field with that name in the record. But when joining two records you can end up with duplicate field names, with no way to distinguish them or to get a value from a specific field by name. We don't keep any metadata in BeamRecordType to help identify a field in this case.

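This can lead to obscure bugs: indexOf returns the first match, so a lookup by a duplicated name silently returns the value of the first such field.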
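
To illustrate the problem, here is a minimal standalone sketch. It does not use the real BeamRecord class; DuplicateFieldLookup and its getFieldValue helper are hypothetical stand-ins that only mimic the indexOf-based lookup described above, and the joined record with two "id" fields is made-up data.

```java
import java.util.Arrays;
import java.util.List;

class DuplicateFieldLookup {
    // Hypothetical stand-in for a name-based getter: resolves the name with
    // List.indexOf, which returns the index of the FIRST matching element.
    static Object getFieldValue(List<String> fieldNames, List<Object> values, String name) {
        int index = fieldNames.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("No such field: " + name);
        }
        return values.get(index);
    }

    public static void main(String[] args) {
        // Made-up result of joining two records that both carry an "id" field.
        List<String> names = Arrays.asList("id", "name", "id", "amount");
        List<Object> values = Arrays.<Object>asList(1, "alice", 42, 9.5);

        // The second "id" (value 42) cannot be reached by name.
        System.out.println(getFieldValue(names, values, "id")); // prints 1
    }
}
```

The lookup neither fails nor warns; the second "id" is simply unreachable by name.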

We probably should keep more detailed schema information attached to the fields, so that we could reference them using qualifiers like "[schemaA].[pcollectionB].[fieldC]".
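
One possible shape for this, as a rough sketch only: each field carries its schema and collection of origin, and a reference is matched from the right, so "id", "orders.id" and "schemaA.orders.id" are all accepted, but an ambiguous reference is rejected instead of silently picking the first match. QualifiedFieldLookup, QualifiedField and the plain dotted syntax (without the square brackets) are assumptions made for illustration, not existing Beam API.

```java
import java.util.Arrays;
import java.util.List;

class QualifiedFieldLookup {
    /** Hypothetical field descriptor that remembers where a field came from. */
    static final class QualifiedField {
        final String schema;
        final String collection;
        final String name;

        QualifiedField(String schema, String collection, String name) {
            this.schema = schema;
            this.collection = collection;
            this.name = name;
        }

        // Matches a dotted reference from the right: name, then collection, then schema.
        boolean matches(String reference) {
            String[] parts = reference.split("\\.", -1);
            String[] own = {schema, collection, name};
            if (parts.length > own.length) {
                return false;
            }
            for (int i = 1; i <= parts.length; i++) {
                if (!own[own.length - i].equals(parts[parts.length - i])) {
                    return false;
                }
            }
            return true;
        }
    }

    // Returns the index of the single field matching the reference;
    // throws if the reference matches no field or more than one.
    static int indexOf(List<QualifiedField> fields, String reference) {
        int found = -1;
        for (int i = 0; i < fields.size(); i++) {
            if (fields.get(i).matches(reference)) {
                if (found >= 0) {
                    throw new IllegalArgumentException("Ambiguous field reference: " + reference);
                }
                found = i;
            }
        }
        if (found < 0) {
            throw new IllegalArgumentException("No such field: " + reference);
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up schema of a join between "users" and "orders".
        List<QualifiedField> fields = Arrays.asList(
            new QualifiedField("schemaA", "users", "id"),
            new QualifiedField("schemaA", "users", "name"),
            new QualifiedField("schemaA", "orders", "id"),
            new QualifiedField("schemaA", "orders", "amount"));

        System.out.println(indexOf(fields, "orders.id"));        // prints 2
        System.out.println(indexOf(fields, "schemaA.users.id")); // prints 0
        System.out.println(indexOf(fields, "name"));             // prints 1
        // indexOf(fields, "id") would throw: the bare name is ambiguous.
    }
}
```

Failing on ambiguity is a design choice in this sketch; keeping the current first-match behavior for unqualified names would be the backward-compatible alternative.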

 

  was:
Currently there are utility methods in BeamRecord to get field values by name, e.g. BeamRecord.getFieldValue(String name). Internally they call fieldNamesArrayList.indexOf(fieldName) to find the index of the field name.

This works as long as there is only one field with such name in the record. But when joining 2 records you can end up with duplicate fields without any means of distinguishing them and getting a value from specific field by name. We don't keep any metadata in BeamRecordType to help identify a field in this case. 

It feels that this can lead to obscure bugs.

We probably should keep more detailed schema information attached to the fields, so that we could reference them using qualifiers like "[schemaA].[pcollectionB].[fieldC]".

 


> [SQL] Support schema qualifiers for field names
> -----------------------------------------------
>
>                 Key: BEAM-3574
>                 URL: https://issues.apache.org/jira/browse/BEAM-3574
>             Project: Beam
>          Issue Type: Bug
>          Components: dsl-sql
>            Reporter: Anton Kedin
>            Priority: Major
>
> Currently there are utility methods in BeamRecord to get field values by name, e.g. BeamRecord.getFieldValue(String name). Internally they call fieldNamesArrayList.indexOf(fieldName) to find the index of the field name.
> This works as long as there is only one field with that name in the record. But when joining two records you can end up with duplicate field names, with no way to distinguish them or to get a value from a specific field by name. We don't keep any metadata in BeamRecordType to help identify a field in this case.
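> This can lead to obscure bugs: indexOf returns the first match, so a lookup by a duplicated name silently returns the value of the first such field.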
> We probably should keep more detailed schema information attached to the fields, so that we could reference them using qualifiers like "[schemaA].[pcollectionB].[fieldC]".
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)