Posted to dev@pig.apache.org by "Zhijie Shen (JIRA)" <ji...@apache.org> on 2011/07/24 16:32:09 UTC
[jira] [Updated] (PIG-1429) Add Boolean Data Type to Pig
[ https://issues.apache.org/jira/browse/PIG-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhijie Shen updated PIG-1429:
-----------------------------
Attachment: PIG-1429_1.patch
Attached is a temporary, non-working patch to show my progress. I've added the boolean cast in many places in the runtime code, but I still need to check that I've touched every place. The test code hasn't been fully explored yet. The type casting rules are as follows:
1. From Boolean to other types

        Integer  Long  Float  Double  DataByteArray  String
True    1        1L    1.0F   1.0D    "true"         "true"
False   0        0L    0.0F   0.0D    "false"        "false"
2. From other types to Boolean

Integer        non-0     -> True    0          -> False
Long           non-0L    -> True    0L         -> False
Float          non-0.0F  -> True    0.0F       -> False
Double         non-0.0D  -> True    0.0D       -> False
DataByteArray  "true"    -> True    non-"true" -> False
String         "true"    -> True    non-"true" -> False
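The two rule tables above can be sketched in plain Java as follows. This is only an illustration, not code from the patch: the class and method names are hypothetical, DataByteArray is stood in for by a raw byte[] so the example has no Pig dependency, UTF-8 encoding is an assumption, and the string comparison is assumed to be an exact, case-sensitive match on "true" (the rules do not say either way).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; none of these names come from the Pig code base or the patch.
final class BooleanCastSketch {
    private BooleanCastSketch() {}

    // Rule 1: from Boolean to other types.
    static int toInteger(boolean b)       { return b ? 1 : 0; }
    static long toLong(boolean b)         { return b ? 1L : 0L; }
    static float toFloat(boolean b)       { return b ? 1.0F : 0.0F; }
    static double toDouble(boolean b)     { return b ? 1.0D : 0.0D; }
    static String toStringValue(boolean b) { return b ? "true" : "false"; }

    // Assumption: the byte form is the UTF-8 encoding of "true"/"false".
    static byte[] toBytes(boolean b) {
        return toStringValue(b).getBytes(StandardCharsets.UTF_8);
    }

    // Rule 2: from other types to Boolean (any non-zero number is True).
    static boolean fromInteger(int i)   { return i != 0; }
    static boolean fromLong(long l)     { return l != 0L; }
    static boolean fromFloat(float f)   { return f != 0.0F; }
    static boolean fromDouble(double d) { return d != 0.0D; }

    // Assumption: exact match only, so "TRUE" and null both map to False.
    static boolean fromString(String s) { return "true".equals(s); }

    // Assumption: bytes are decoded as UTF-8; null input is not handled here.
    static boolean fromBytes(byte[] raw) {
        return fromString(new String(raw, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(toInteger(true));    // Boolean -> Integer
        System.out.println(fromDouble(0.0D));   // Double -> Boolean
        System.out.println(fromString("yes"));  // non-"true" String -> Boolean
    }
}
```

The rules leave null inputs and floating-point edge cases open. In this sketch NaN compares as non-zero and so maps to True, while -0.0 compares equal to zero and maps to False.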
I still have some questions about the code:
1. In PigPerformanceLoader.Caster.bytesToBag(byte[] b, ResourceFieldSchema fs), there is a piece of code as follows:
    switch (b[start]) {
        case 105: t.set(0, bytesToInteger(copy)); break;
        case 108: t.set(0, bytesToLong(copy)); break;
        case 102: t.set(0, bytesToFloat(copy)); break;
        case 100: t.set(0, bytesToDouble(copy)); break;
        case 115: t.set(0, bytesToCharArray(copy)); break;
        case 109: t.set(0, bytesToMap(copy)); break;
        case 98: t.set(0, bytesToBag(copy, null)); break;
        default: throw new RuntimeException("unknown type " + b[start]);
    }
Where do these numbers come from? What value should I assign to Boolean?
2. Similarly, in QueryLexer.tokens, what is the rule for assigning a value to a token? Again, what value should be assigned to Boolean?
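One observation on the switch in question 1: the case labels coincide with the ASCII codes of lowercase letters. That suggests, though nothing in this thread confirms it, that each value is a one-character type marker. The sketch below just prints the character for each label; the class and method names are hypothetical.

```java
// Hypothetical helper for inspecting the case labels quoted in question 1.
final class TypeMarkerSketch {
    private TypeMarkerSketch() {}

    // Interprets a numeric case label as an ASCII character.
    static char asChar(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        // The seven labels from the quoted switch statement, in order.
        int[] labels = {105, 108, 102, 100, 115, 109, 98};
        for (int label : labels) {
            System.out.println(label + " -> '" + asChar(label) + "'");
        }
    }
}
```

Read this way, 105 is 'i', 108 is 'l', 102 is 'f', 100 is 'd', 115 is 's', 109 is 'm' and 98 is 'b'. Note that 'b' (98) is already used for bag in the quoted switch. Whether a marker for Boolean is needed, and which value it would take, is not settled here.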
> Add Boolean Data Type to Pig
> ----------------------------
>
> Key: PIG-1429
> URL: https://issues.apache.org/jira/browse/PIG-1429
> Project: Pig
> Issue Type: New Feature
> Components: data
> Affects Versions: 0.7.0
> Reporter: Russell Jurney
> Assignee: Zhijie Shen
> Labels: boolean, gsoc2011, pig, type
> Attachments: PIG-1429_1.patch, working_boolean.patch
>
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> Pig needs a Boolean data type. PIG-1097 depends on this.
> I volunteer. Is there anything beyond the work in src/org/apache/pig/data/ plus unit tests to make this work?
> This is a candidate project for Google Summer of Code 2011. More information about the program can be found at http://wiki.apache.org/pig/GSoc2011
Re: [jira] [Updated] (PIG-1429) Add Boolean Data Type to Pig
Posted by Gianmarco <gi...@gmail.com>.
Hi Zhijie,
QueryLexer.tokens is generated automatically by ANTLR, so you should not
edit it by hand.
Instead, modify QueryLexer.g to include a Boolean token declaration.
You should also modify QueryParser.g and all the other grammar (.g) files
to accept boolean expressions.
Cheers,
--
Gianmarco De Francisci Morales
On Sun, Jul 24, 2011 at 16:32, Zhijie Shen (JIRA) <ji...@apache.org> wrote:
>
> [
> https://issues.apache.org/jira/browse/PIG-1429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>
> Zhijie Shen updated PIG-1429:
> -----------------------------
>
> Attachment: PIG-1429_1.patch
>
> 2. Similarly, in QueryLexer.tokens, what is the rule for assigning a value to
> a token? Again, what value should be assigned to Boolean?
>
>