You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/07/19 13:47:23 UTC

[GitHub] [druid] achimbab opened a new pull request #11467: Functions that extract a value from a JSON-encoded string.

achimbab opened a new pull request #11467:
URL: https://github.com/apache/druid/pull/11467


   This PR implements functions that extract a value from a JSON-encoded string.
   
   ### New functions
   
   | function | description |
   | --- | --- |
   | json_extract_string(expr,string...) | Extracts the String value from expr based on string arguments. |
   | json_extract_long(expr,string...) | Extracts the Long value from expr based on string arguments. |
   | json_extract_double(expr,string...) | Extracts the Double value from expr based on string arguments. |
   | jsonpath_extract_string(expr,json-path) | Extracts the String value from expr based on json-path. |
   | jsonpath_extract_long(expr,json-path) | Extracts the Long value from expr based on json-path. |
   | jsonpath_extract_double(expr,json-path) | Extracts the Double value from expr based on json-path. |
   
   ### Examples
   
   The `json_extract_*` family of functions extract a value from a JSON-encoded string.
   
   ```SQL
   SELECT json_extract_string('{"key1":"val1"}', 'key1')
   -- Extract a value from a nested object
   SELECT json_extract_string('{"key1":{"key2":"val2"}}', 'key1', 'key2')
   ```
   
   The `jsonpath_extract_*` family of functions support JSONPath expressions referring to a JSON structure in the same way as XPath expression.
   
   ```SQL
   SELECT jsonpath_extract_string('{"key":[{"k":"v1"},{"k":"v2"}]}', '$.key[0].k')
   ```
   
   ### FastJsonReader
   The `json_extract_*` family of functions use FastJsonReader.
   
   In order to improve extraction performance, I implemented light and fast parser called FastJsonReader.
   
   In my environment, it is faster than Jsonpath in the case of extracting a scalar value. However, Jsonpath offers lots of functionalities.
   
   ### Benchmark
   I did benchmark test with the query below.
   
   ```SQL
   select v, count(*) 
   from (
       select json_extract_string(json_column, 'key') as v 
       from test
   ) 
   group by v
   order by count(*) desc
   ```
   
   ####  Benchmark 1 - Single Node, Small dataset
   1 node / 6 core, 32G RAM, SSD
   
   Sample rows: 445,283
   
   ![image](https://user-images.githubusercontent.com/36371084/126165105-a31b53e2-0c4c-40c2-8fc0-64263a89a8da.png)
   
   #### Benchmark 2 - Cluster, Medium dataset
   15 Historical nodes / 32 core, 256G RAM, NVME
   
   Sample rows: 3247568
   
   ![image](https://user-images.githubusercontent.com/36371084/126166035-3076ae7d-645f-4099-8c58-5b8e8b34d808.png)
   
   
   #### Benchmark 3 - Cluster, Large dataset
   15 Historical nodes / 32 core, 256G RAM, NVME
   
   Sample rows: 9614347
   
   ![image](https://user-images.githubusercontent.com/36371084/126166027-a990a4bb-0250-4d0b-b0f5-3ef4216dccb5.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024


   I force-pushed just to modify a commit message for `cd8abc9`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-883070761


   closing and re-opening to try and trigger travis


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-916605008


   Thanks for the contribution @achimbab! The benchmark results look promising.
   
   I have to spend some time to understand the changes better and hope to get a review out next week. 
   
   In the mean time, could you resolve the merge conflicts that exist now please. Also I skimmed the changes and noticed new SQL functions - to test that they are wired up correctly, could you add a test to one of the `CalciteQueryTest` classes - these are what I use to write "integration tests" for SQL functions so we can ensure these functions continue to work with each new release of Druid.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-883070761


   closing and re-opening to try and trigger travis


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024


   I force-pushed just to modify a commit messages for #cd8abc9.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024


   I force-pushed just to modify commit messages.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s closed pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s closed pull request #11467:
URL: https://github.com/apache/druid/pull/11467


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s closed pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s closed pull request #11467:
URL: https://github.com/apache/druid/pull/11467


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-883070761


   closing and re-opening to try and trigger travis


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917430060


   @suneet-s
   - I resolved the merge conflicts. 
   - I added tests to CalciteQueryTest class.
   
   Thank you. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-894963542


   @asdf2014 @suneet-s 
   Can you please do code review?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-894963542


   @asdf2014 @suneet-s 
   Can you please review this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s closed pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s closed pull request #11467:
URL: https://github.com/apache/druid/pull/11467


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-894963542


   @asdf2014 @suneet-s 
   Can you please do a code review?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-928964627


   @suneet-s 
   I fixed `json_extract_*` functions to unescape strings.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-928964627


   @suneet-s 
   I fixed `json_extract_*` functions to unescape strings.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-929291877


   Thanks @achimbab I am reviewing this today!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-999219964


   @suneet-s 
   Hi, hope you are doing well. please let me know if there is something that I have to do in order to merge PR.
   Wish you a happy christmas.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024


   I force-pushed just to modify a commit messages for the commit of `cd8abc9`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.

Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-929291877


   Thanks @achimbab I am reviewing this today!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org