Posted to dev@pig.apache.org by "Zhijie Shen (JIRA)" <ji...@apache.org> on 2011/07/24 16:32:09 UTC

[jira] [Updated] (PIG-1429) Add Boolean Data Type to Pig

     [ https://issues.apache.org/jira/browse/PIG-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated PIG-1429:
-----------------------------

    Attachment: PIG-1429_1.patch

Attached is a temporary, non-working patch to show my progress. I've added the boolean cast in many places in the runtime code, but I still need to check that I've covered every place. The test code hasn't been explored thoroughly yet. The type casting rules are as follows:

1. From Boolean to other types

      Integer Long Float Double DataByteArray String
True  1       1L   1.0F  1.0D   "true"        "true"
False 0       0L   0.0F  0.0D   "false"       "false"

2. From other types to Boolean

Integer       non-0 -> True    0 -> False
Long          non-0L -> True   0L -> False
Float         non-0.0F -> True 0.0F -> False
Double        non-0.0D -> True 0.0D -> False
DataByteArray "true" -> True   non-"true" -> False
String        "true" -> True   non-"true" -> False
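
To make the two tables above concrete, here is a minimal Java sketch of the same casting rules. It is not taken from the attached patch: the class name BooleanCastSketch and all method names are hypothetical, a plain byte[] stands in for Pig's DataByteArray, and the string comparison is assumed to be exact and case-sensitive because the rule above does not say otherwise. Null handling is also not covered by the rule, so fromBytes does not handle null input.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; hypothetical names, not code from the PIG-1429 patch.
final class BooleanCastSketch {
    private BooleanCastSketch() {}

    // 1. From Boolean to other types
    static int toInteger(boolean b) { return b ? 1 : 0; }
    static long toLong(boolean b) { return b ? 1L : 0L; }
    static float toFloat(boolean b) { return b ? 1.0F : 0.0F; }
    static double toDouble(boolean b) { return b ? 1.0D : 0.0D; }
    static String toCharArray(boolean b) { return b ? "true" : "false"; }

    // byte[] is an assumed stand-in for DataByteArray.
    static byte[] toBytes(boolean b) {
        return toCharArray(b).getBytes(StandardCharsets.UTF_8);
    }

    // 2. From other types to Boolean
    static boolean fromInteger(int i) { return i != 0; }
    static boolean fromLong(long l) { return l != 0L; }
    static boolean fromFloat(float f) { return f != 0.0F; }
    static boolean fromDouble(double d) { return d != 0.0D; }

    // Assumption: only the exact string "true" maps to True.
    static boolean fromCharArray(String s) { return "true".equals(s); }

    static boolean fromBytes(byte[] bytes) {
        return fromCharArray(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(toInteger(true));
        System.out.println(toCharArray(false));
        System.out.println(fromLong(42L));
        System.out.println(fromCharArray("yes"));
    }
}
```

Under these rules a Boolean survives a round trip through any of the other types (for example, fromDouble(toDouble(b)) gives back b), while the reverse direction loses information: every non-zero number and every string other than "true" collapses to a single Boolean value.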

I still have some questions about the code:

1. In PigPerformanceLoader.Caster.bytesToBag(byte[] b, ResourceFieldSchema fs), there is a piece of code as follows

switch (b[start]) {
case 105: t.set(0, bytesToInteger(copy)); break;
case 108: t.set(0, bytesToLong(copy)); break;
case 102: t.set(0, bytesToFloat(copy)); break;
case 100: t.set(0, bytesToDouble(copy)); break;
case 115: t.set(0, bytesToCharArray(copy)); break;
case 109: t.set(0, bytesToMap(copy)); break;
case 98: t.set(0, bytesToBag(copy, null)); break;
default: throw new RuntimeException("unknown type " + b[start]);
}

Where do these numbers come from? Which value should I assign to Boolean?

2. Similarly, in QueryLexer.tokens, what is the rule for assigning a value to a token? Again, what should be assigned to Boolean?
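
On question 1, one observation that may help: the case labels in that switch coincide with the ASCII codes of lowercase letters, and each letter matches the initial of the type being decoded (with 's' for the string/chararray case). The short Java sketch below only demonstrates that correspondence. The class and method names are hypothetical, and which marker Boolean should use is not settled anywhere in this thread, so none is proposed here.

```java
// Illustrative sketch only; MarkerCodes and letterFor are hypothetical names.
final class MarkerCodes {
    private MarkerCodes() {}

    // The case labels from the switch in bytesToBag, in their original order.
    static final int[] CODES = {105, 108, 102, 100, 115, 109, 98};

    // Interprets a numeric case label as an ASCII character.
    static char letterFor(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        // Prints each numeric label next to the character it encodes.
        for (int code : CODES) {
            System.out.println(code + " -> '" + letterFor(code) + "'");
        }
    }
}
```

Read this way, the labels are 'i', 'l', 'f', 'd', 's', 'm' and 'b' for the integer, long, float, double, chararray, map and bag branches. Note that 'b' (98) is already used by the bag branch, so a Boolean marker would need a different character.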

> Add Boolean Data Type to Pig
> ----------------------------
>
>                 Key: PIG-1429
>                 URL: https://issues.apache.org/jira/browse/PIG-1429
>             Project: Pig
>          Issue Type: New Feature
>          Components: data
>    Affects Versions: 0.7.0
>            Reporter: Russell Jurney
>            Assignee: Zhijie Shen
>              Labels: boolean, gsoc2011, pig, type
>         Attachments: PIG-1429_1.patch, working_boolean.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Pig needs a Boolean data type.  Pig-1097 is dependent on doing this.  
> I volunteer.  Is there anything beyond the work in src/org/apache/pig/data/ plus unit tests to make this work?  
> This is a candidate project for Google summer of code 2011. More information about the program can be found at http://wiki.apache.org/pig/GSoc2011

        

Re: [jira] [Updated] (PIG-1429) Add Boolean Data Type to Pig

Posted by Gianmarco <gi...@gmail.com>.
Hi Zhijie,

QueryLexer.tokens is generated automatically by ANTLR, so you should not
edit it by hand.
Instead, modify QueryLexer.g to include a Boolean token declaration.
You should also modify QueryParser.g and all the other grammar (.g) files
so that they accept boolean expressions.

Cheers,
--
Gianmarco De Francisci Morales


On Sun, Jul 24, 2011 at 16:32, Zhijie Shen (JIRA) <ji...@apache.org> wrote:

> 2. Similarly, in QueryLexer.tokens, what is the rule for assigning a value to
> a token? Again, what should be assigned to Boolean?