Posted to dev@pig.apache.org by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org> on 2010/06/05 16:09:53 UTC

[jira] Created: (PIG-1440) Refactor org.apache.pig.data.DataType to use Enums instead of integer constants

Refactor org.apache.pig.data.DataType to use Enums instead of integer constants
-------------------------------------------------------------------------------

                 Key: PIG-1440
                 URL: https://issues.apache.org/jira/browse/PIG-1440
             Project: Pig
          Issue Type: Improvement
            Reporter: Gianmarco De Francisci Morales


Refactoring DataType to use Enums instead of integer constants would provide many benefits, including:

* Cleaner code
* Easier to iterate over Enums
* Easier to add new Enums without breaking backwards compatibility
* Can use EnumMaps to easily link values to Enums
* Better support for translation from Enums to Strings and vice versa

The int (or, in Pig's case, byte) Enum pattern has several drawbacks, as summarized here: http://java.sun.com/j2se/1.5.0/docs/guide/language/enums.html
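
To make the listed benefits concrete, here is a minimal sketch. The PigType enum and the PigTypeBenefits class are hypothetical names chosen for illustration, not part of Pig's API. It shows iteration over the constants, an EnumMap keyed by type, and String translation through name() and valueOf().

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical enum for illustration; not Pig's actual API.
enum PigType { BOOLEAN, INTEGER, LONG, CHARARRAY }

class PigTypeBenefits {
    // EnumMap links each constant to a value (here, a count of fields per type).
    static Map<PigType, Integer> countByType(List<PigType> fields) {
        Map<PigType, Integer> counts = new EnumMap<PigType, Integer>(PigType.class);
        for (PigType t : PigType.values()) {   // iterate over every constant
            counts.put(t, 0);
        }
        for (PigType f : fields) {
            counts.put(f, counts.get(f) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<PigType> schema =
            Arrays.asList(PigType.INTEGER, PigType.CHARARRAY, PigType.INTEGER);
        System.out.println(countByType(schema));

        // String <-> enum translation needs no hand-written lookup table.
        PigType parsed = PigType.valueOf("LONG");
        System.out.println(parsed.name());
    }
}
```

With byte constants, each of these steps would need a hand-maintained array or switch statement that has to be updated whenever a type is added.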

Drawbacks of the refactoring:

* We have to explicitly convert Enum values to bytes when serializing. This can be done in DataReaderWriter.
* Possibly higher overhead than simply using bytes.
* Refactoring might be difficult.
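
As a rough sketch of the serialization drawback: an enum can carry its own byte code, so the conversion is a field read on the way out and an array lookup on the way in. WireType, its byte values, and WireTypeDemo are assumptions made for illustration. This is not the DataReaderWriter code, and the byte values are placeholders.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical enum; the byte codes are placeholders for illustration.
enum WireType {
    BOOLEAN((byte) 5), INTEGER((byte) 10), LONG((byte) 15), CHARARRAY((byte) 55);

    // Lookup table indexed by byte code, filled once at class load.
    private static final WireType[] BY_CODE = new WireType[128];
    static {
        for (WireType t : values()) {
            BY_CODE[t.code] = t;
        }
    }

    private final byte code;

    WireType(byte code) { this.code = code; }

    byte toByte() { return code; }

    static WireType fromByte(byte b) {
        WireType t = (b >= 0) ? BY_CODE[b] : null;
        if (t == null) {
            throw new IllegalArgumentException("Unknown type code: " + b);
        }
        return t;
    }
}

class WireTypeDemo {
    // Writes the type as a single byte and reads it back.
    static WireType roundTrip(WireType t) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(t.toByte());
            out.flush();
            DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
            return WireType.fromByte(in.readByte());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The on-disk format stays one byte per type marker, so whether the extra lookup matters in practice would have to be measured rather than assumed.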


Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-1440) Refactor org.apache.pig.data.DataType to use Enums instead of integer constants

Posted by "Gianmarco De Francisci Morales (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gianmarco De Francisci Morales updated PIG-1440:
------------------------------------------------

    Priority: Minor  (was: Major)



[jira] Commented: (PIG-1440) Refactor org.apache.pig.data.DataType to use Enums instead of integer constants

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876076#action_12876076 ] 

Olga Natkovich commented on PIG-1440:
-------------------------------------

I think it is important to preserve backward compatibility in 0.8.0 to improve adoption. Our users at Yahoo had to go through a lot of pain to migrate their code to 0.7.0, and forcing user updates on every release is not a good idea in my opinion.

That being said, is it possible to achieve this without breaking compatibility? Can we add the enum side-by-side with the old constants and deprecate them for now?

The other question is whether the benefits of this change are worth the effort of refactoring. The code that uses the current constants is everywhere. There is also the question of what the conversion would cost in performance.
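
One way the side-by-side idea could look, as a sketch only: the old byte constants stay in place but are marked deprecated, a nested enum is added next to them, and bridge methods convert between the two. LegacyDataType, its nested Type enum, the describe methods, and the byte values are hypothetical names and placeholders, not the actual org.apache.pig.data.DataType.

```java
import java.util.Locale;

// Hypothetical stand-in for a class that keeps byte constants next to a new enum.
final class LegacyDataType {
    /** @deprecated use {@link Type#INTEGER} instead. */
    @Deprecated public static final byte INTEGER = 10;
    /** @deprecated use {@link Type#CHARARRAY} instead. */
    @Deprecated public static final byte CHARARRAY = 55;

    enum Type {
        INTEGER(LegacyDataType.INTEGER),
        CHARARRAY(LegacyDataType.CHARARRAY);

        private final byte legacyCode;

        Type(byte legacyCode) { this.legacyCode = legacyCode; }

        // Bridge back to the old constant for code that still expects a byte.
        byte toLegacy() { return legacyCode; }

        // Bridge from an old constant to the enum.
        static Type fromLegacy(byte code) {
            for (Type t : values()) {
                if (t.legacyCode == code) {
                    return t;
                }
            }
            throw new IllegalArgumentException("Unknown type code: " + code);
        }
    }

    // New enum-based entry point.
    static String describe(Type type) {
        return type.name().toLowerCase(Locale.ROOT);
    }

    /** @deprecated use {@link #describe(Type)} instead. */
    @Deprecated
    static String describe(byte code) {
        return describe(Type.fromLegacy(code));
    }

    private LegacyDataType() { }
}
```

Existing user code that passes bytes would keep compiling, with deprecation warnings, while new code could use the enum overloads. The old constants could then be removed in a later release.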



[jira] Commented: (PIG-1440) Refactor org.apache.pig.data.DataType to use Enums instead of integer constants

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876011#action_12876011 ] 

Dmitriy V. Ryaboy commented on PIG-1440:
----------------------------------------

I think Enums are great for this, and have wished many a time that the types were Enums while working with Pig.

I do want to point out, though, that this will affect a lot of user code -- any EvalFunc that specifies a schema, any LoadFunc that implements the metadata options, etc. Are we willing to break things for our users so soon after 0.7?


