Posted to dev@phoenix.apache.org by "Aakash Pradeep (JIRA)" <ji...@apache.org> on 2015/04/16 21:53:00 UTC

[jira] [Comment Edited] (PHOENIX-1710) Implement the json_extract_path_text built-in function

    [ https://issues.apache.org/jira/browse/PHOENIX-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498584#comment-14498584 ] 

Aakash Pradeep edited comment on PHOENIX-1710 at 4/16/15 7:52 PM:
------------------------------------------------------------------

IMHO we should make this method more general, so that it can read any data inside a JSON document. I looked at the corresponding function definitions in Postgres and Amazon Redshift (http://docs.aws.amazon.com/redshift/latest/dg/json-functions.html). What I observed is that this method can only read the JSON for a given key; it does not let us read JSON inside a JSON array. To read an array element we would have to design a separate method like json_extract_array_element_text('json string', pos), which will be very difficult for users when the JSON is complex.
Instead, we can define this method generically so that it can read any data inside a JSON document. What I mean is that we define a pattern for the JSON path that can express both keys and array indexes. I am proposing this pattern to specify an array index:

json_extract_path(json_column, ARRAY['key1', 'key2', '[<index>]', 'key_inside_array', '[<index>]'])

If a path element starts with '[' and ends with ']', it refers to an array index rather than a key.

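So to read the phone numbers of Jimmy's second contact from the example JSON below, we can write

json_extract_path_text(contacts_json, ARRAY['[1]','person','contacts','[2]','person','phone'])

and it should return "[9567686788, 8988986785]". Note that each entry in "contacts" wraps its fields in a "person" object, so the path needs 'person' again after '[2]'.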

Example JSON
---------------------

[
    {
        "person": {
            "name": "jimmy",
            "address": {
                "street": "1st market street",
                "city": "San Francisco",
                "zipcode": "945001"
            },
            "email": "test@apache.org",
            "phone": [
                9567686788,
                8988986785
            ],
            "contacts": [
                {
                    "person": {
                        "name": "sam",
                        "address": {
                            "street": "1st market street",
                            "city": "San Francisco",
                            "zipcode": "945001"
                        },
                        "email": "test@apache.org",
                        "phone": [
                            "956-XXX-YYYY",
                            "898-XXX-YYYY"
                        ]
                    }
                },
                {
                    "person": {
                        "name": "doug",
                        "address": {
                            "street": "1st market street",
                            "city": "San Francisco",
                            "zipcode": "945001"
                        },
                        "email": "test@apache.org",
                        "phone": [
                            9567686788,
                            8988986785
                        ]
                    }
                }
            ]
        }
    }
]
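
To make the proposed path convention concrete, here is a rough Python sketch of how such a path could be resolved. It is only an illustration, not Phoenix code: the helper name extract_path, the 1-based indexing (taken from the "second contact" wording), and returning None for a missing key or out-of-range index are all assumptions. The document is a trimmed copy of the example, with the redacted phone numbers quoted so that it parses.

```python
import json


def extract_path(doc, path_elems):
    """Walk a parsed JSON value along key and '[<index>]' path elements.

    Assumptions (not settled in the proposal): indexes are 1-based, and a
    missing key, a type mismatch, or an out-of-range index yields None.
    """
    node = doc
    for elem in path_elems:
        if elem.startswith("[") and elem.endswith("]"):
            # Bracketed element: the inner text is a 1-based array index.
            if not isinstance(node, list):
                return None
            try:
                pos = int(elem[1:-1])
            except ValueError:
                return None
            if pos < 1 or pos > len(node):
                return None
            node = node[pos - 1]
        else:
            # Plain element: an object key.
            if not isinstance(node, dict) or elem not in node:
                return None
            node = node[elem]
    return node


def json_extract_path_text(json_text, path_elems):
    """Hypothetical text-returning wrapper around extract_path."""
    found = extract_path(json.loads(json_text), path_elems)
    if found is None:
        return None
    # Strings come back as-is; everything else is serialized back to JSON.
    return found if isinstance(found, str) else json.dumps(found)


# Trimmed copy of the example document (assumption: redacted numbers quoted).
contacts_json = """
[{"person": {"name": "jimmy",
             "contacts": [
               {"person": {"name": "sam",
                           "phone": ["956-XXX-YYYY", "898-XXX-YYYY"]}},
               {"person": {"name": "doug",
                           "phone": [9567686788, 8988986785]}}]}}]
"""

path = ["[1]", "person", "contacts", "[2]", "person", "phone"]
print(json_extract_path_text(contacts_json, path))  # [9567686788, 8988986785]
```

One consequence of this convention worth deciding up front: a key whose name itself starts with '[' and ends with ']' cannot be addressed, so we may need an escape rule for that case.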


> Implement the json_extract_path_text built-in function
> ------------------------------------------------------
>
>                 Key: PHOENIX-1710
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1710
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: NIsala Niroshana
>
> Implement the json_extract_path_text function, modeled after the Postgres function. This function returns the JSON pointed to by the path elements argument. In Phoenix, it could be implemented like this:
> {code}
> VARCHAR json_extract_path_text (VARCHAR json, VARCHAR ARRAY path_elems)
> {code}
> For example:
> {code}
> SELECT json_extract_path_text(json_col, ARRAY['f4','f6']) FROM my_table;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)