Posted to dev@hive.apache.org by "Patrick Angeles (JIRA)" <ji...@apache.org> on 2010/01/05 06:12:55 UTC

[jira] Created: (HIVE-1027) Create UDFs for XPath expression evaluation

Create UDFs for XPath expression evaluation
-------------------------------------------

                 Key: HIVE-1027
                 URL: https://issues.apache.org/jira/browse/HIVE-1027
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
            Reporter: Patrick Angeles
            Priority: Minor


Create UDFs for evaluating XPath expressions against XML documents.

Examples:

> SELECT xpath_double ('<a><b class="odd">1</b><b class="even">2</b><b class="odd">4</b><c>8</c></a>', 'sum(a/b[@class="odd"])') FROM src LIMIT 1 ;
5.0
> SELECT xpath_string ('<a><b>b1</b><b>b2</b></a>', 'a/b[2]') FROM src LIMIT 1 ;
b2
> SELECT xpath ('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>', 'a/c/text()') FROM src LIMIT 1 ;
["c1","c2"]

Included functions are: xpath_short, xpath_int, xpath_long, xpath_float, xpath_double/xpath_number, xpath_string, xpath

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12796527#action_12796527 ] 

Patrick Angeles commented on HIVE-1027:
---------------------------------------

Code uses the built-in javax.xml.xpath library.
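
For readers unfamiliar with javax.xml.xpath, here is a minimal sketch of evaluating an expression against an XML string with that library. The class and method names (XPathBasics, evalString, evalNumber) are hypothetical and are not taken from the attached patch; only the javax.xml.xpath calls are standard JDK API.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not the code from the patch.
class XPathBasics {

    // Returns the XPath string value of expr evaluated against xml.
    static String evalString(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (String) xpath.evaluate(expr, source, XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            // Raised for an invalid expression or for XML that cannot be parsed.
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the XPath number value of expr evaluated against xml.
    static double evalNumber(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (Double) xpath.evaluate(expr, source, XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b class=\"odd\">1</b><b class=\"even\">2</b>"
                   + "<b class=\"odd\">4</b><c>8</c></a>";
        System.out.println(evalNumber(xml, "sum(a/b[@class=\"odd\"])"));          // prints 5.0
        System.out.println(evalString("<a><b>b1</b><b>b2</b></a>", "a/b[2]"));   // prints b2
    }
}
```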


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845196#action_12845196 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

Patrick, can you update the wiki page for these new UDFs?


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1027:
-----------------------------

    Attachment: HIVE-1027_3.patch

Attaching a patch that reverses the type casting changes in FunctionRegistry.java for Hadoop 0.17.2.1.


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797811#action_12797811 ] 

Patrick Angeles commented on HIVE-1027:
---------------------------------------

> Thanks for the detailed explanations. It seems we are supporting XPath 1.0 here. When you say "xpath() returns multiple nodes (list)", do you mean it returns a
> serialized XML string representing the list of nodes, such as <a>a1</a><a>a2</a> ...? In that case, do you have a test case for composing xpath() functions? For
> example, a subquery returns an XML string from the result of xpath(), and the outer query passes that string to another xpath*() function?

No, xpath() always returns a Hive array of strings. If the expression results in a non-text value (e.g., another XML node), the function will return an empty array. So really, there are only two uses for xpath(): to get a list of node text values or to get a list of attribute values. For example:

> select xpath('<a><b>b1</b><b>b2</b></a>','a/*') from src limit 1 ;
[]
> select xpath('<a><b>b1</b><b>b2</b></a>','a/*/text()') from src limit 1 ;   // note the text() at the end of the expression
["b1","b2"]
> select xpath('<a><b id="foo">b1</b><b id="bar">b2</b></a>','//@id') from src limit 1 ;  
["foo","bar"]

This behavior can be changed, but I feel that going down the path of returning nested results is suboptimal. I'm open to ideas, however.
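
The array-of-strings behavior described above could be approximated as follows. This is only a sketch under an assumption: that the result is evaluated as a node set and that only text and attribute nodes contribute values. The class XPathListSketch and its method are hypothetical; the actual patch may do this differently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration; not the code from the patch.
class XPathListSketch {

    // Returns the values of matching text and attribute nodes, in document order.
    // Element nodes (and any other node types) are skipped, so an expression
    // such as a/* yields an empty list.
    static List<String> evalList(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            NodeList nodes = (NodeList) xpath.evaluate(expr, source, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                short type = node.getNodeType();
                if (type == Node.TEXT_NODE || type == Node.ATTRIBUTE_NODE) {
                    values.add(node.getNodeValue());
                }
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b id=\"foo\">b1</b><b id=\"bar\">b2</b></a>";
        System.out.println(evalList(xml, "a/*"));         // prints []
        System.out.println(evalList(xml, "a/*/text()"));  // prints [b1, b2]
        System.out.println(evalList(xml, "//@id"));       // prints [foo, bar]
    }
}
```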

> For (4) I'm not sure whether we should interpret an empty list as an empty string etc. We can definitely define the mapping between the XML model and the relational model this way,
> but it doesn't distinguish the case where the xpath_string() result is an empty list from the case where it is a single node whose value is empty (e.g., <a/> vs. no <a>
> element).
Agreed. Unfortunately, the Java XPath API on which this is built returns an empty string in both cases. I can internally change it so it queries for a node instead of a string, then extracts the string from the node. I get the feeling that this is less performant, but I have no facts to back this up.
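
A sketch of the node-first alternative mentioned above: ask javax.xml.xpath for a single node and read its text, so that "no match" (null) can be told apart from "matched an empty element" (empty string). The names XPathNodeLookup and evalStringOrNull are hypothetical, and the null-on-no-match result is the behavior of the JDK's bundled XPath implementation as far as I know.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration; not the code from the patch.
class XPathNodeLookup {

    // Returns null when nothing matches, and the node's text content
    // (possibly the empty string) when a node does match.
    static String evalStringOrNull(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            Node node = (Node) xpath.evaluate(expr, source, XPathConstants.NODE);
            return node == null ? null : node.getTextContent();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evalStringOrNull("<r><a/></r>", "r/a") != null);  // element present but empty
        System.out.println(evalStringOrNull("<r/>", "r/a") == null);         // element absent
    }
}
```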

> Also all this information is better to be exposed to the wider community (not only developers) as well. Can you also add all these to the Hive's wiki page?
Absolutely... I will update the Hive Wiki once this is committed.


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Status: Patch Available  (was: Open)

Updated patch (this one includes show_functions.q.out).


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797520#action_12797520 ] 

Patrick Angeles commented on HIVE-1027:
---------------------------------------


1) In general XPath queries return a list of nodes. What are the semantics of xpath_double (e.g.) if the XPath evaluates to multiple nodes?

Only xpath() returns multiple nodes (list).

xpath_string() returns the text of the first matching node (and its subnodes, if any).
- xpath_string('<a>aa<b>b1</b><b>b2</b></a>','a') returns 'aab1b2'
- xpath_string('<a>aa<b>b1</b><b>b2</b></a>','b') returns 'b1'

xpath_double()/float() return the numeric value of the text of the first matching node, or NaN if the text value is not numeric.
xpath_int()/long()/short() return the numeric value of the text of the first matching node, or 0 if the text value is not numeric, or MAX_INT, MAX_LONG, MAX_SHORT respectively if the value overflows.
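
The int and long rules above match what plain Java gives when an XPath number (a double) is narrowed with a primitive cast: NaN becomes 0 and out-of-range values saturate at the type's maximum. The sketch below assumes that mechanism; whether the patch actually works this way is not shown in this thread. The class XPathNumbers is hypothetical, and short is deliberately left out.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration; not the code from the patch.
class XPathNumbers {

    // XPath number value of the result; for a node set this is the numeric
    // value of the first node in document order, or NaN if it is not numeric.
    static double evalDouble(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (Double) xpath.evaluate(expr, source, XPathConstants.NUMBER);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Java narrowing: NaN -> 0, values above the range -> Integer.MAX_VALUE.
    static int evalInt(String xml, String expr) {
        return (int) evalDouble(xml, expr);
    }

    // Java narrowing: NaN -> 0, values above the range -> Long.MAX_VALUE.
    static long evalLong(String xml, String expr) {
        return (long) evalDouble(xml, expr);
    }

    public static void main(String[] args) {
        System.out.println(evalDouble("<a><b>x</b></a>", "a/b"));                       // prints NaN
        System.out.println(evalInt("<a><b>x</b></a>", "a/b"));                          // prints 0
        System.out.println(evalInt("<a><b>7</b><b>9</b></a>", "a/b"));                  // prints 7
        System.out.println(evalInt("<a>99999999999</a>", "a") == Integer.MAX_VALUE);    // prints true
    }
}
```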

2) Is the XPath query parsed for every input row, or only parsed once?

The XPath expression is compiled and cached. It is reused if the next expression matches the previous one; otherwise, it is recompiled. So the XML is always parsed for every input row, but the XPath expression is precompiled and reused for the vast majority of use cases.
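
The caching scheme described above can be sketched as a single-entry cache keyed on the expression string. The class CachedXPath and its compileCount counter are hypothetical and exist only to make the reuse visible; they are not taken from the patch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration; not the code from the patch.
class CachedXPath {
    private final XPath xpath = XPathFactory.newInstance().newXPath();
    private String lastExpr;           // expression text of the cached entry
    private XPathExpression compiled;  // compiled form of lastExpr
    private int compileCount;          // how many times compile() was called

    String evalString(String xml, String expr) {
        try {
            // Recompile only when the expression differs from the previous call.
            if (!expr.equals(lastExpr)) {
                compiled = xpath.compile(expr);
                lastExpr = expr;
                compileCount++;
            }
            // The XML itself is still parsed on every call.
            InputSource source = new InputSource(new StringReader(xml));
            return (String) compiled.evaluate(source, XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    int compileCount() {
        return compileCount;
    }
}
```

With a constant expression and a different XML value per row, which is the common case in a query, compile() runs once for the whole scan.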

3a) Do you support DTD and XMLSchema?

Not sure how these would apply, as the Java XPath API is schema-agnostic (no validation is performed). However, malformed XML (e.g., '<a><b>1</b></aa>') will result in a runtime exception being thrown.

3b) What about namespace and backward axes in XPath?

Namespace is not currently supported, but could be easily added later.

Backward axes are supported:

> select xpath ('<a><b id="1"><c/></b><b id="2"><c/></b></a>','/descendant::c/ancestor::b/@id') from t1 limit 1 ;
["1","2"]

4) If XPath evaluates to empty list, do you return NULL or empty string (in case of xpath())?

When no match is found:
xpath() returns an empty list.
xpath_string() returns an empty string.
xpath_int(), float(), etc. will return 0.
xpath_boolean() will return false.


[jira] Assigned: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-1027:
--------------------------------

    Assignee: Patrick Angeles


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845366#action_12845366 ] 

Patrick Angeles commented on HIVE-1027:
---------------------------------------

Here it is. I created a 'master' UDF guide page that is linked to via a list item in the 'Hive Users Guide':

http://wiki.apache.org/hadoop/Hive/HiveUDFGuide

Side note: the wiki site is painfully slow :(


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Attachment: udf_xpath.patch


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Attachment: hive-1027-v2.patch


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845375#action_12845375 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

That's great. Thank you very much Patrick. 


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Status: Open  (was: Patch Available)


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1027:
-----------------------------

    Fix Version/s: 0.6.0


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797427#action_12797427 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

This is cool stuff. Just some questions:
1) In general XPath queries return a list of nodes. What are the semantics of xpath_double (e.g.) if the XPath evaluates to multiple nodes?
2) Is the XPath query parsed for every input row, or only parsed once?
3) Do you support DTD and XMLSchema? What about namespace and backward axes in XPath?
4) If XPath evaluates to empty list, do you return NULL or empty string (in case of xpath())?
 


[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12842751#action_12842751 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

Hi Patrick,

Could you regenerate the patch? It has some conflicts with the current trunk. I'll make sure to review it quickly and commit it after it is regenerated.

Thanks,
Ning


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1027:
-----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to trunk (0.6.0). Thanks for the contribution, Patrick!


[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-1027:
-----------------------------

    Status: Open  (was: Patch Available)

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>         Attachments: hive-1027.patch, udf_xpath.patch
>
>



[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797420#action_12797420 ] 

Namit Jain commented on HIVE-1027:
----------------------------------

+1

Looks good. Will commit if the tests pass.

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>         Attachments: hive-1027.patch, udf_xpath.patch
>
>



[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Status: Patch Available  (was: Open)

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: hive-1027-v2.patch, hive-1027.patch, udf_xpath.patch
>
>



[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12797769#action_12797769 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

Thanks for the detailed explanations. It seems we are supporting XPath 1.0 here. When you say "xpath() returns multiple nodes(list)", do you mean it returns a serialized XML string representing the list of nodes, such as <a>a1</a><a>a2</a> ...? If so, do you have a test case for composing xpath() functions? For example, a subquery returns an XML string as the result of xpath(), and the outer query passes that string to another xpath*() function.
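
To make the question concrete, here is a minimal sketch of one way a list-valued result could be produced with javax.xml.xpath, the library the patch is said to use. The class XPathListSketch and the helper evalList are hypothetical names for illustration and are not taken from the patch. The sketch collects the value of each matched node into a list of strings, which matches the ["c1","c2"] shape shown in the issue description rather than a serialized XML string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not the code from the HIVE-1027 patch.
class XPathListSketch {

    // Evaluates an XPath 1.0 expression as a node-set and returns the
    // node value of each match. Text and attribute nodes have a string
    // value; element nodes yield null from getNodeValue().
    static List<String> evalList(String xml, String path) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    path, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            // Malformed XML or an invalid expression ends up here.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>";
        System.out.println(evalList(xml, "a/c/text()")); // prints [c1, c2]
    }
}
```

With a result of this shape, composing calls would mean passing individual list elements (plain strings) to another function, not re-parsing a concatenated XML fragment.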

For (4), I'm not sure whether we should interpret an empty list as an empty string, etc. We can certainly define the mapping from the XML model to the relational model this way, but it doesn't distinguish the case where the xpath_string() result is an empty list from the case where it is a single node whose value is empty (e.g., <a/> vs. no <a> element).
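
The ambiguity raised above can be reproduced directly with javax.xml.xpath. The following is a sketch only: the class XPathEmptySketch and the helpers evalString and exists are hypothetical names, and the sample documents are made up for illustration. Under XPath 1.0 rules, converting an empty node-set to a string gives the empty string, so both inputs produce the same string result; a separate boolean evaluation is one possible way to tell them apart.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not the code from the HIVE-1027 patch.
class XPathEmptySketch {

    // Evaluates the expression and converts the result to a string,
    // following the XPath 1.0 string conversion rules.
    static String evalString(String xml, String path) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(path, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Evaluates the expression as a boolean: a node-set converts to
    // true only when it contains at least one node.
    static boolean exists(String xml, String path) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Boolean) xpath.evaluate(
                    path, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String withEmptyA = "<r><a/></r>"; // <a> present but empty
        String withoutA = "<r/>";          // no <a> element at all

        // Both string results are empty, so they cannot be told apart.
        System.out.println(evalString(withEmptyA, "r/a").isEmpty()); // true
        System.out.println(evalString(withoutA, "r/a").isEmpty());   // true

        // The boolean evaluation does distinguish the two cases.
        System.out.println(exists(withEmptyA, "r/a")); // true
        System.out.println(exists(withoutA, "r/a"));   // false
    }
}
```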

Also, all of this information should be exposed to the wider community (not only developers). Can you add it to the Hive wiki as well?

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>         Attachments: hive-1027.patch, udf_xpath.patch
>
>



[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Status: Patch Available  (was: Open)

Hi Ning,

Attaching a new patch against trunk.

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: hive-1027.patch, udf_xpath.patch
>
>



[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844123#action_12844123 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

Thanks Patrick! I will take a look and get back soon. 

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: hive-1027-v2.patch, hive-1027.patch, udf_xpath.patch
>
>



[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Attachment: hive-1027-v3.patch

Cleaned up patch.

The Eclipse code cleanup hook generated by 'ant eclipse-files' was automatically removing casts. This seems to affect only the build against Hadoop 0.17.2.1.

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: hive-1027-v2.patch, hive-1027-v3.patch, hive-1027.patch, udf_xpath.patch
>
>



[jira] Updated: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Angeles updated HIVE-1027:
----------------------------------

    Attachment: hive-1027.patch

Updated patch. It now includes show_functions.q.out.

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Priority: Minor
>         Attachments: hive-1027.patch, udf_xpath.patch
>
>



[jira] Commented: (HIVE-1027) Create UDFs for XPath expression evaluation

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844202#action_12844202 ] 

Ning Zhang commented on HIVE-1027:
----------------------------------

Patrick, the patch didn't build correctly with Hadoop 0.17.2.1 (ant -Dhadoop.version=0.17.2.1 clean package). I think the failure is caused by the change that removed type casts in FunctionRegistry.java (e.g., line 410). Can you take a look and fix that?

> Create UDFs for XPath expression evaluation
> -------------------------------------------
>
>                 Key: HIVE-1027
>                 URL: https://issues.apache.org/jira/browse/HIVE-1027
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Patrick Angeles
>            Assignee: Patrick Angeles
>            Priority: Minor
>             Fix For: 0.6.0
>
>         Attachments: hive-1027-v2.patch, hive-1027.patch, udf_xpath.patch
>
>
