You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/07/19 13:47:23 UTC
[GitHub] [druid] achimbab opened a new pull request #11467: Functions that extract a value from a JSON-encoded string.
achimbab opened a new pull request #11467:
URL: https://github.com/apache/druid/pull/11467
This PR implements functions that extract a value from a JSON-encoded string.
### New functions
| function | description |
| --- | --- |
| json_extract_string(expr,string...) | Extracts the String value from expr based on string arguments. |
| json_extract_long(expr,string...) | Extracts the Long value from expr based on string arguments. |
| json_extract_double(expr,string...) | Extracts the Double value from expr based on string arguments. |
| jsonpath_extract_string(expr,json-path) | Extracts the String value from expr based on json-path. |
| jsonpath_extract_long(expr,json-path) | Extracts the Long value from expr based on json-path. |
| jsonpath_extract_double(expr,json-path) | Extracts the Double value from expr based on json-path. |
### Examples
The `json_extract_*` family of functions extract a value from a JSON-encoded string.
```SQL
SELECT json_extract_string('{"key1":"val1"}', 'key1')
-- Extract a value from a nested object
SELECT json_extract_string('{"key1":{"key2":"val2"}}', 'key1', 'key2')
```
The `jsonpath_extract_*` family of functions support JSONPath expressions referring to a JSON structure in the same way as XPath expression.
```SQL
SELECT jsonpath_extract_string('{"key":[{"k":"v1"},{"k":"v2"}]}', '$.key[0].k')
```
### FastJsonReader
The `json_extract_*` family of functions use FastJsonReader.
In order to improve extraction performance, I implemented light and fast parser called FastJsonReader.
In my environment, it is faster than Jsonpath in the case of extracting a scalar value. However, Jsonpath offers lots of functionalities.
### Benchmark
I did benchmark test with the query below.
```SQL
select v, count(*)
from (
select json_extract_string(json_column, 'key') as v
from test
)
group by v
order by count(*) desc
```
#### Benchmark 1 - Single Node, Small dataset
1 node / 6 core, 32G RAM, SSD
Sample rows: 445,283
![image](https://user-images.githubusercontent.com/36371084/126165105-a31b53e2-0c4c-40c2-8fc0-64263a89a8da.png)
#### Benchmark 2 - Cluster, Medium dataset
15 Historical nodes / 32 core, 256G RAM, NVME
Sample rows: 3247568
![image](https://user-images.githubusercontent.com/36371084/126166035-3076ae7d-645f-4099-8c58-5b8e8b34d808.png)
#### Benchmark 3 - Cluster, Large dataset
15 Historical nodes / 32 core, 256G RAM, NVME
Sample rows: 9614347
![image](https://user-images.githubusercontent.com/36371084/126166027-a990a4bb-0250-4d0b-b0f5-3ef4216dccb5.png)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024
I force-pushed just to modify a commit message for `cd8abc9`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-883070761
closing and re-opening to try and trigger travis
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-916605008
Thanks for the contribution @achimbab! The benchmark results look promising.
I have to spend some time to understand the changes better and hope to get a review out next week.
In the mean time, could you resolve the merge conflicts that exist now please. Also I skimmed the changes and noticed new SQL functions - to test that they are wired up correctly, could you add a test to one of the `CalciteQueryTest` classes - these are what I use to write "integration tests" for SQL functions so we can ensure these functions continue to work with each new release of Druid.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-883070761
closing and re-opening to try and trigger travis
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024
I force-pushed just to modify a commit messages for #cd8abc9.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024
I force-pushed just to modify commit messages.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s closed pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s closed pull request #11467:
URL: https://github.com/apache/druid/pull/11467
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s closed pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s closed pull request #11467:
URL: https://github.com/apache/druid/pull/11467
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-883070761
closing and re-opening to try and trigger travis
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917430060
@suneet-s
- I resolved the merge conflicts.
- I added tests to CalciteQueryTest class.
Thank you.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-894963542
@asdf2014 @suneet-s
Can you please do code review?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-894963542
@asdf2014 @suneet-s
Can you please review this PR?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s closed pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s closed pull request #11467:
URL: https://github.com/apache/druid/pull/11467
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-894963542
@asdf2014 @suneet-s
Can you please do a code review?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-928964627
@suneet-s
I fixed `json_extract_*` functions to unescape strings.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-928964627
@suneet-s
I fixed `json_extract_*` functions to unescape strings.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-929291877
Thanks @achimbab I am reviewing this today!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-999219964
@suneet-s
Hi, hope you are doing well. please let me know if there is something that I have to do in order to merge PR.
Wish you a happy christmas.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] achimbab edited a comment on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
achimbab edited a comment on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-917428024
I force-pushed just to modify a commit messages for the commit of `cd8abc9`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org
[GitHub] [druid] suneet-s commented on pull request #11467: Functions that extract a value from a JSON-encoded string.
Posted by GitBox <gi...@apache.org>.
suneet-s commented on pull request #11467:
URL: https://github.com/apache/druid/pull/11467#issuecomment-929291877
Thanks @achimbab I am reviewing this today!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org