Posted to hcatalog-commits@incubator.apache.org by "Jakob Homan (JIRA)" <ji...@apache.org> on 2011/05/02 22:27:03 UTC

[jira] [Created] (HCATALOG-17) Shouldn't be able to add an HCatFieldSchema with the same name as existing

Shouldn't be able to add an HCatFieldSchema with the same name as existing
--------------------------------------------------------------------------

                 Key: HCATALOG-17
                 URL: https://issues.apache.org/jira/browse/HCATALOG-17
             Project: HCatalog
          Issue Type: Bug
            Reporter: Jakob Homan
            Assignee: Jakob Homan


(cloning from https://github.com/yahoo/howl/pull/6)
As noted in HowlSchema.java, one should not be able to append a field schema with the same name as an existing one. The code says that this requires Comparable, which is not correct, since we're not doing ordering. Technically, this requires a correct equals, but since we don't want multiple fields with the same name in a schema (particularly since we index it via the name in the accompanying map), the correct check is just based on the name. This patch adds the check and throws a HowlException if a duplicate name is appended.

Unit test to verify is included. 
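
A minimal sketch of the name-based check described above, not the actual patch. SchemaException, FieldSchema and SimpleSchema are hypothetical stand-ins for HowlException, HCatFieldSchema and the schema class; their constructors and method signatures are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for HowlException.
class SchemaException extends Exception {
    SchemaException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a field schema; only the name matters here.
class FieldSchema {
    private final String name;

    FieldSchema(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical stand-in for the schema class.
class SimpleSchema {
    private final List<FieldSchema> fields = new ArrayList<>();

    // Appends a field, rejecting any field whose name is already present.
    // Only the name is compared, so neither Comparable nor a full equals is needed.
    void append(FieldSchema field) throws SchemaException {
        if (field == null) {
            throw new SchemaException("Cannot append a null field schema");
        }
        for (FieldSchema existing : fields) {
            if (existing.getName().equals(field.getName())) {
                throw new SchemaException("Duplicate field name: " + field.getName());
            }
        }
        fields.add(field);
    }

    int size() {
        return fields.size();
    }
}
```

Appending "id", then "name", then "id" again throws on the third call and leaves the schema with two fields.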

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HCATALOG-17) Shouldn't be able to add an HCatFieldSchema with the same name as existing

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HCATALOG-17?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028391#comment-13028391 ] 

Alan Gates commented on HCATALOG-17:
------------------------------------

When checking whether the field name already exists, it's doing a linear search.  Since it will do this for every field you add, that means we have sum(1..n) comparison operations every time we build a schema.  Assuming memory isn't at a premium here (we should only be keeping one copy of the schema for each partition), we should have a separate hash that stores the field names so we can do n comparison operations instead.
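
To make the cost difference concrete, here is a small sketch that counts the work done by each approach while building a schema of n distinct names. DuplicateCheckCost and its methods are hypothetical names, not HCatalog code. Counting only comparisons against fields already present, the linear scan performs 0 + 1 + ... + (n-1) = n(n-1)/2 comparisons, which grows at the same quadratic rate as the sum(1..n) figure above, while the hash-based version performs one lookup per field.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that counts the duplicate-check work for each strategy.
class DuplicateCheckCost {

    // Each append scans every name added so far, so the total is quadratic in n.
    static long linearScanComparisons(List<String> names) {
        List<String> seen = new ArrayList<>();
        long comparisons = 0;
        for (String name : names) {
            for (String existing : seen) {
                comparisons++;
                if (existing.equals(name)) {
                    throw new IllegalArgumentException("Duplicate field name: " + name);
                }
            }
            seen.add(name);
        }
        return comparisons;
    }

    // Each append does a single hash lookup, so the total is linear in n.
    static long hashLookups(List<String> names) {
        Set<String> seen = new HashSet<>();
        long lookups = 0;
        for (String name : names) {
            lookups++;
            // Set.add returns false when the name is already present.
            if (!seen.add(name)) {
                throw new IllegalArgumentException("Duplicate field name: " + name);
            }
        }
        return lookups;
    }
}
```

For four distinct names, the linear scan makes 6 comparisons and the hash-based check makes 4 lookups; the gap widens quadratically as the schema grows.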


[jira] [Resolved] (HCATALOG-17) Shouldn't be able to add an HCatFieldSchema with the same name as existing

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HCATALOG-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates resolved HCATALOG-17.
--------------------------------

    Resolution: Fixed

HCAT-17-2.patch checked in.  Thanks, Jakob.


[jira] [Updated] (HCATALOG-17) Shouldn't be able to add an HCatFieldSchema with the same name as existing

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HCATALOG-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Homan updated HCATALOG-17:
--------------------------------

    Attachment: dupe_fields.patch

Same patch as on GitHub.  Verified against trunk.  All tests pass, except TestHiveCompatibility, which fails without the patch as well.


[jira] [Updated] (HCATALOG-17) Shouldn't be able to add an HCatFieldSchema with the same name as existing

Posted by "Jakob Homan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HCATALOG-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Homan updated HCATALOG-17:
--------------------------------

    Attachment: HCAT-17-2.patch

If we're optimizing for this, it's not necessary to keep a separate hash, since {{fieldPositionMap}} is keyed off of the field name and {{containsKey}} provides constant time results (modulo key collisions - I checked the source to verify).  Updated patch to use this.  

Also added symmetrical null check in the constructor (as in the append method) since we're accepting a {{List}} interface, which allows null values.  

Also removed a check for null on {{fieldSchemas}} in the append method since {{fieldSchemas}} is a final field and you can't successfully construct an instance of {{HCatSchema}} with this field null.
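
A rough sketch of the shape these three changes suggest, not the contents of HCAT-17-2.patch. IndexedSchema, NamedField and SchemaCheckException are hypothetical stand-ins for HCatSchema, HCatFieldSchema and the project's exception type. Routing the constructor through the append method is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the project's exception type.
class SchemaCheckException extends Exception {
    SchemaCheckException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a field schema; only the name matters here.
class NamedField {
    private final String name;

    NamedField(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical stand-in for the schema class. Declared final so that calling
// append from the constructor cannot reach an overridden method.
final class IndexedSchema {
    // Final and always assigned in the constructor, so append never sees it null.
    private final List<NamedField> fieldSchemas;
    // Keyed by field name; doubles as the duplicate-name index.
    private final Map<String, Integer> fieldPositionMap = new HashMap<>();

    IndexedSchema(List<NamedField> fields) throws SchemaCheckException {
        if (fields == null) {
            throw new SchemaCheckException("Field list must not be null");
        }
        this.fieldSchemas = new ArrayList<>();
        // A List may contain null elements, so each one goes through the
        // same null and duplicate checks as a later append.
        for (NamedField field : fields) {
            append(field);
        }
    }

    void append(NamedField field) throws SchemaCheckException {
        if (field == null) {
            throw new SchemaCheckException("Cannot append a null field schema");
        }
        String name = field.getName();
        // Expected constant-time lookup instead of scanning the list.
        if (fieldPositionMap.containsKey(name)) {
            throw new SchemaCheckException("Duplicate field name: " + name);
        }
        fieldSchemas.add(field);
        fieldPositionMap.put(name, fieldSchemas.size() - 1);
    }

    // Returns the position of the named field, or null if it is absent.
    Integer getPosition(String name) {
        return fieldPositionMap.get(name);
    }
}
```

Because the map already has to exist for name-based lookup, reusing it for the duplicate check costs no extra memory compared with a separate hash of names.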


[jira] [Updated] (HCATALOG-17) Shouldn't be able to add an HCatFieldSchema with the same name as existing

Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HCATALOG-17?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated HCATALOG-17:
-------------------------------

    Fix Version/s: 0.2
