Posted to dev@hive.apache.org by "Paul Yang (JIRA)" <ji...@apache.org> on 2009/11/05 22:05:32 UTC
[jira] Created: (HIVE-914) Speed up UDFJson
Speed up UDFJson
----------------
Key: HIVE-914
URL: https://issues.apache.org/jira/browse/HIVE-914
Project: Hadoop Hive
Issue Type: Improvement
Reporter: Paul Yang
Assignee: Paul Yang
In UDFJson, a new JSONObject is created for each call to evaluate(). Since evaluate() is likely to be called with multiple paths for the same JSON string, caching the JSONObject created from that string should improve performance.
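The caching idea described above can be sketched roughly as follows. This is a minimal illustration and not the code from the attached patches: the class name ParsedJsonMemo, the pluggable parser function, and the parse counter are all hypothetical, and the parser stands in for the JSONObject constructor so that the example needs nothing beyond the JDK. The map here is unbounded, which is only acceptable for a demonstration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper (not the actual UDFJson code): remembers the parsed
 * form of each JSON string so that repeated lookups against the same
 * string parse it only once.
 */
final class ParsedJsonMemo<T> {
    // Stand-in for the expensive step, e.g. constructing a JSONObject.
    private final Function<String, T> parser;
    // Unbounded on purpose for the sketch; a long-running task would need a size limit.
    private final Map<String, T> parsed = new HashMap<>();
    // Counts how many times the parser actually ran.
    private int parseCount = 0;

    ParsedJsonMemo(Function<String, T> parser) {
        this.parser = parser;
    }

    /** Returns the cached parse result, parsing only on the first request. */
    T get(String json) {
        T value = parsed.get(json);
        if (value == null) {
            // Cache miss (a parser that returns null would be re-run every time).
            value = parser.apply(json);
            parseCount++;
            parsed.put(json, value);
        }
        return value;
    }

    int parseCount() {
        return parseCount;
    }
}
```

With a memo like this, evaluating several paths against one JSON string would call the parser once rather than once per path.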
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-914) Speed up UDFJson
Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Yang updated HIVE-914:
---------------------------
Attachment: HIVE-914.2.patch
* Added missing 'static' identifier for the cache map
[jira] Updated: (HIVE-914) Speed up UDFJson
Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carl Steinbach updated HIVE-914:
--------------------------------
Fix Version/s: 0.5.0
Component/s: UDF
[jira] Resolved: (HIVE-914) Speed up UDFJson
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain resolved HIVE-914.
-----------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Committed. Thanks, Paul.
[jira] Updated: (HIVE-914) Speed up UDFJson
Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Yang updated HIVE-914:
---------------------------
Attachment: HIVE-914.1.patch
* Added an LRU cache for JSONObjects
* Reduced new object creation in extract_json_withindex()
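For readers unfamiliar with the technique, one common way to build an LRU cache in Java is to subclass LinkedHashMap in access order and override removeEldestEntry. The sketch below shows that general pattern only; whether the patch uses this approach is an assumption, and the class name LruCache and the capacity parameter are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical LRU cache (illustrative, not taken from the patch):
 * a LinkedHashMap kept in access order that drops its least recently
 * used entry once it grows past maxEntries.
 */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: get() and put() move an entry to the "most recent" end.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}
```

Note that LinkedHashMap is not thread-safe, and in access order even get() modifies the map's internal ordering, so a cache like this needs external synchronization if it is shared between threads.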
[jira] Updated: (HIVE-914) Speed up UDFJson
Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Yang updated HIVE-914:
---------------------------
Attachment: HIVE-914.3.patch
* Profiled through YourKit
* Added LRU caches for shared/time-consuming objects