Posted to dev@hive.apache.org by "Hao Liu (JIRA)" <ji...@apache.org> on 2008/12/11 07:27:44 UTC
[jira] Created: (HIVE-163) support loading json data into hive
support loading json data into hive
-----------------------------------
Key: HIVE-163
URL: https://issues.apache.org/jira/browse/HIVE-163
Project: Hadoop Hive
Issue Type: New Feature
Components: Serializers/Deserializers
Reporter: Hao Liu
The JSON format is commonly used for transmitting structured data over a network, especially in Ajax web applications. People also choose the JSON format to store log data.
Support for loading and querying JSON-format data would be a desirable feature in Hive.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663207#action_12663207 ]
Joydeep Sen Sarma commented on HIVE-163:
----------------------------------------
There are a couple of debug printfs in the last patch posted on this JIRA:
+ System.out.println("ZSHAO: " + System.getenv("CLASSPATH"));
+ System.out.println("ExecDriver CMD:PC " + cmdLine);
?
> support loading json data into hive
> -----------------------------------
>
> Key: HIVE-163
> URL: https://issues.apache.org/jira/browse/HIVE-163
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Serializers/Deserializers
> Reporter: Hao Liu
> Assignee: Hao Liu
> Fix For: 0.2.0
>
> Attachments: HIVE-163.2.patch, HIVE-163.patch, json.jar
>
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> The JSON format is commonly used for transmitting structured data over a network, especially in Ajax web applications. People also choose the JSON format to store log data.
> Support for loading and querying JSON-format data would be a desirable feature in Hive.
[jira] Updated: (HIVE-163) support loading json data into hive
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-163:
--------------------------------
Fix Version/s: 0.3.0
(was: 0.6.0)
[jira] Resolved: (HIVE-163) support loading json data into hive
Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zheng Shao resolved HIVE-163.
-----------------------------
Resolution: Fixed
Fix Version/s: 0.2.0
Release Note: HIVE-163. JSON udf function added. (Hao Liu via zshao)
Hadoop Flags: [Reviewed]
Committed revision 733992. Thanks Hao!
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663212#action_12663212 ]
Zheng Shao commented on HIVE-163:
---------------------------------
The committed code does not contain the extra printlns.
Thanks for checking this Joy.
[jira] Updated: (HIVE-163) support loading json data into hive
Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zheng Shao updated HIVE-163:
----------------------------
Attachment: HIVE-163.3.patch
This is the right patch. HIVE-163.2.patch was an outdated one with additional printfs.
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663187#action_12663187 ]
Ashish Thusoo commented on HIVE-163:
------------------------------------
Hao just showed me that the license is there :). So +1 from my side as well.
[jira] Assigned: (HIVE-163) support loading json data into hive
Posted by "Hao Liu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hao Liu reassigned HIVE-163:
----------------------------
Assignee: Hao Liu
[jira] Updated: (HIVE-163) support loading json data into hive
Posted by "Hao Liu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hao Liu updated HIVE-163:
-------------------------
Attachment: HIVE-163.4.patch
Added the change to .classpath. Also, it looks like build-common.xml was not in HIVE-163.3.patch, so I added it to this one.
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663401#action_12663401 ]
Prasad Chakka commented on HIVE-163:
------------------------------------
.classpath should be updated with the new lib json.jar.
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663185#action_12663185 ]
Ashish Thusoo commented on HIVE-163:
------------------------------------
We also have to include the license text for json.jar. Can you include that in the patch? You should probably remove the README for json.jar.
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663573#action_12663573 ]
Prasad Chakka commented on HIVE-163:
------------------------------------
Hao, the build-common.xml change is not needed, since json.jar is not an auxiliary lib; it is just like any other third-party library, such as commons-lang.jar.
[jira] Updated: (HIVE-163) support loading json data into hive
Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zheng Shao updated HIVE-163:
----------------------------
Attachment: HIVE-163.2.patch
We had to unpack json.jar and put it into hive_exec.jar to work with hadoop 0.17.
[jira] Updated: (HIVE-163) support loading json data into hive
Posted by "Hao Liu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hao Liu updated HIVE-163:
-------------------------
Attachment: HIVE-163.patch
Added a patch to support a JSON UDF as suggested.
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663136#action_12663136 ]
Zheng Shao commented on HIVE-163:
---------------------------------
+1
[jira] Updated: (HIVE-163) support loading json data into hive
Posted by "Hao Liu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hao Liu updated HIVE-163:
-------------------------
Attachment: json.jar
json.jar from hadoop project. It should be included in ${hive.root}/lib
[jira] Commented: (HIVE-163) support loading json data into hive
Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655556#action_12655556 ]
Zheng Shao commented on HIVE-163:
---------------------------------
I would suggest first adding a JSONField function that extracts a JSON field:
{code}
class UDFJsonField {
  /**
   * Extracts a field of a JSON object.
   * @param jsonObject The main JSON object in string format
   * @param field The field expression, e.g. "[1]a.b[3].c"
   * @return the JSON object that represents the field in string format
   */
  String evaluate(String jsonObject, String field) {
    ...
  }
}
{code}
This function will support most of the requests.
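To make the field expression concrete, here is a rough sketch of how a string like "[1]a.b[3].c" might be split into navigation steps before walking the JSON object. The class name JsonFieldPath and the grammar are assumptions for illustration only (bracketed digits are taken to be array indices, dots to separate object keys); nothing here is taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of any HIVE-163 patch. */
final class JsonFieldPath {
    private JsonFieldPath() {}

    /**
     * Splits a field expression such as "[1]a.b[3].c" into steps.
     * Array steps keep their brackets ("[1]"); key steps are bare names ("a").
     * Assumed grammar: "[digits]" is an array index, "." separates keys.
     */
    static List<String> parse(String expr) {
        List<String> steps = new ArrayList<>();
        int i = 0;
        while (i < expr.length()) {
            char ch = expr.charAt(i);
            if (ch == '.') {
                i++; // a dot only separates steps
            } else if (ch == '[') {
                int close = expr.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("Unclosed '[' in: " + expr);
                }
                String digits = expr.substring(i + 1, close);
                if (digits.isEmpty() || !digits.chars().allMatch(Character::isDigit)) {
                    throw new IllegalArgumentException("Bad array index in: " + expr);
                }
                steps.add("[" + digits + "]");
                i = close + 1;
            } else {
                // Read a key name up to the next '.' or '['.
                int end = i;
                while (end < expr.length()
                        && expr.charAt(end) != '.'
                        && expr.charAt(end) != '[') {
                    end++;
                }
                steps.add(expr.substring(i, end));
                i = end;
            }
        }
        return steps;
    }
}
```

Under these assumptions, an evaluate implementation could apply each step in turn to the parsed JSON value: index into an array for a bracketed step, look up a key for a name step.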