Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/12/19 22:19:00 UTC

[jira] [Updated] (DRILL-6046) Define semantics of vector metadata

     [ https://issues.apache.org/jira/browse/DRILL-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6046:
-------------------------------
    Description: 
Vectors provide metadata in the form of the {{MaterializedField}}. This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slow development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle.

Issues include:

* Map vectors, but not lists, can create contents given a list of children in the {{MaterializedField}} passed to the constructor.
* {{MaterializedField}} appears to want to be immutable, but it does allow its children to change. Unions also need to change their list of subtypes, but that list lives in the immutable {{MajorType}}, so a union rebuilds and replaces its {{MaterializedField}} whenever a new type is added. By contrast, maps do not replace the field; they just add children.
* Container vectors (maps, unions, lists) hold references to child {{MaterializedField}} instances. But, because unions replace their fields, parents fall out of sync: they still point to the old version from before the update. The metadata becomes inconsistent, so code cannot trust it.
* Lists and maps, but not unions, list their children in the field.
* Nullable types, but not repeated types, include internal vectors in their list of children. 
* When creating a map, as discussed above, the map creates children based on the field. But, the constructor clones the field so that the actual field in the map is not the one passed in. As a result, a parent vector, which holds a child map, points to the original map field, not the cloned one, leading to inconsistency if the child map later adds more fields.
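
The two stale-reference cases above (a union replacing its field, and a map cloning the field passed to its constructor) can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions: {{Field}}, {{MapLike}} and {{UnionLike}} are hypothetical stand-ins invented for this sketch, not Drill's actual {{MaterializedField}}, {{MapVector}} or {{UnionVector}} classes, and the immutable {{subtypes}} list merely plays the role that {{MajorType}} plays in the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of the problem; none of these classes are Drill's real API.
class StaleFieldSketch {

  // Stand-in for MaterializedField: immutable subtypes, mutable children.
  static final class Field {
    final String name;
    final List<String> subtypes;                    // immutable, like MajorType
    final List<Field> children = new ArrayList<>(); // mutable child list

    Field(String name, List<String> subtypes) {
      this.name = name;
      this.subtypes = Collections.unmodifiableList(new ArrayList<>(subtypes));
    }

    // Deep copy, mimicking a constructor that clones the field it is given.
    Field copy() {
      Field c = new Field(name, subtypes);
      for (Field child : children) {
        c.children.add(child.copy());
      }
      return c;
    }
  }

  // Map-like vector: clones the incoming field, then mutates the clone in place.
  static final class MapLike {
    private final Field field;
    MapLike(Field field) { this.field = field.copy(); }
    Field getField() { return field; }
    void addChild(String name) {
      field.children.add(new Field(name, Collections.<String>emptyList()));
    }
  }

  // Union-like vector: subtypes are immutable, so adding one replaces the field.
  static final class UnionLike {
    private Field field;
    UnionLike(Field field) { this.field = field; }
    Field getField() { return field; }
    void addSubtype(String typeName) {
      List<String> updated = new ArrayList<>(field.subtypes);
      updated.add(typeName);
      field = new Field(field.name, updated); // the old field object is abandoned
    }
  }

  // Does a parent that captured the union's field see a subtype added later?
  static boolean parentSeesUnionSubtype() {
    UnionLike union = new UnionLike(new Field("u", Collections.<String>emptyList()));
    Field parent = new Field("parent", Collections.<String>emptyList());
    parent.children.add(union.getField()); // parent keeps a reference to the current field
    union.addSubtype("INT");               // union swaps in a new field
    return parent.children.get(0).subtypes.contains("INT");
  }

  // Does a parent holding the original map field see a child added to the map?
  static boolean parentSeesMapChild() {
    Field mapField = new Field("m", Collections.<String>emptyList());
    Field parent = new Field("parent", Collections.<String>emptyList());
    parent.children.add(mapField);       // parent holds the original field
    MapLike map = new MapLike(mapField); // the vector holds a clone
    map.addChild("a");                   // only the clone is updated
    return !parent.children.get(0).children.isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(parentSeesUnionSubtype()); // prints false
    System.out.println(parentSeesMapChild());     // prints false
  }
}
```

In both cases the parent's view is stale for the same reason: the object it references is no longer the one the child vector updates. Whatever semantics are chosen, a single shared field instance per vector (mutated in place, or replaced through the parent) would avoid the divergence in this toy model.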

  was:
Vectors provide metadata in the form of the {{MaterializedField}}. This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slow development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle.

Issues include:

* Map vectors, but not lists, can create contents given a list of children in the {{MaterializedField}} passed to the constructor.
* {{MaterializedField}} appears to want to be immutable, but it does allow changing of children. Unions also want to change the list of subtypes, but that is in the immutable {{MajorType}}, causing unions to rebuild and replace its {{MaterializedField}} on addition of a new type. By contrast, maps do not replace the field, they just add children.
* Container vectors (maps, unions, lists) hold references to child {{MaterializedFields}}. But, because unions replace their fields, parents become out of sync since they point to the old, version before the update, causing inconsistent metadata, so that code cannot trust the metadata.
* Lists and maps, but not unions, list their children in the field.
* Nullable types, but not repeated types, include internal vectors in their list of children. 


> Define semantics of vector metadata
> -----------------------------------
>
>                 Key: DRILL-6046
>                 URL: https://issues.apache.org/jira/browse/DRILL-6046
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> Vectors provide metadata in the form of the {{MaterializedField}}. This class has evolved in an ad-hoc fashion over time, resulting in inconsistent behavior across vectors. The inconsistent behavior causes bugs and slow development because each vector follows different rules. Consistent behavior would, by contrast, lead to faster development and fewer bugs by reducing the number of variations that code must handle.
> Issues include:
> * Map vectors, but not lists, can create contents given a list of children in the {{MaterializedField}} passed to the constructor.
> * {{MaterializedField}} appears to want to be immutable, but it does allow its children to change. Unions also need to change their list of subtypes, but that list lives in the immutable {{MajorType}}, so a union rebuilds and replaces its {{MaterializedField}} whenever a new type is added. By contrast, maps do not replace the field; they just add children.
> * Container vectors (maps, unions, lists) hold references to child {{MaterializedField}} instances. But, because unions replace their fields, parents fall out of sync: they still point to the old version from before the update. The metadata becomes inconsistent, so code cannot trust it.
> * Lists and maps, but not unions, list their children in the field.
> * Nullable types, but not repeated types, include internal vectors in their list of children. 
> * When creating a map, as discussed above, the map creates children based on the field. But, the constructor clones the field so that the actual field in the map is not the one passed in. As a result, a parent vector, which holds a child map, points to the original map field, not the cloned one, leading to inconsistency if the child map later adds more fields.


