Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2021/12/01 02:00:16 UTC

Apache Pinot Daily Email Digest (2021-11-30)

### _#general_

  
 **@kiril.lstpd:** @kiril.lstpd has joined the channel  
 **@sainikeshk:** @sainikeshk has joined the channel  
 **@ashish:** json path expressions in query are not working. select
payload.push_id from githubEvents limit 10 throws unknown column name
exception. But using json functions works. @amrish.k.lal and @steotia - this
is using json batch quickstart setup.  
**@amrish.k.lal:** Thanks, will take a look.  
**@g.kishore:** I don't think that is tested well, and it can result in
unexpected behavior when there is a nested structure. Please use json
functions if possible  
**@ashish:** ok, thanks  
**@ashish:** We are going to be using a flat structure - so if json path
expressions work consistently for those, we are good.  
**@amrish.k.lal:** There are some examples in our test file that could help
you decide:  
**@amrish.k.lal:** Seems like this broke recently due to . (?) @xiangfu0 FYI.  
**@xiangfu0:** got it, then we can loosen the identifier check for json type
expressions with a dot  
**@amrish.k.lal:** brackets `[` `]` are also valid currently, along with dot
`.`. In future, we hope to expand it to take a full json path expression
`SELECT tablename.columnname.jsonpath FROM myTable` provided that
`tablename.columnname` is a column of type `JSON`.  
**@xiangfu0:** hmm, why `tablename.columnname` not just `columnname`?  
**@amrish.k.lal:** From what I recall, `tablename.columnname` was already
supported in queries before we expanded identifiers to JSON notation.  
**@xiangfu0:** I feel those column information should be standardized before
the check?  
**@xiangfu0:** it is  
**@xiangfu0:** since we are doing multiple rounds of checks/rewrites, maybe
we should rewrite the identifier by checking the table name and just keeping
the column name  
**@amrish.k.lal:** Yes, after that `JsonStatementOptimizer` does a check to
ensure that the dot notation is applied to a JSON column. I can add a fix
for this if you like? I think we would need to add a couple of integration
tests as well to make sure this doesn't break again. The change should have
been caught by a test case :slightly_smiling_face: otherwise it's easy to
miss this again.  
**@jackie.jxt:** We should add some json queries into the integration test.
That should be able to catch such regressions  
**@ashish:** Thanks  
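
For reference, the two query forms discussed in this thread can be sketched as follows. This is only an illustration based on the `githubEvents` quickstart table named above: the `'LONG'` result type and the `AS push_id` alias are assumptions, not something confirmed in the conversation.

```sql
-- Dot-notation form from the thread; reported above to fail with an
-- "unknown column name" exception at the time of this digest.
SELECT payload.push_id
FROM githubEvents
LIMIT 10

-- JSON-function form, which was reported to work.
-- Assumption: push_id is numeric, so 'LONG' is used as the result type.
SELECT JSON_EXTRACT_SCALAR(payload, '$.push_id', 'LONG') AS push_id
FROM githubEvents
LIMIT 10
```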
 **@ashish:** One question on mergeRollup tasks - do they honor segment
partitioning strategy? For example, I have setup segmentPrunerTypes with some
dimension based partitioning. When merge rollup tasks produce the merged
segment, will the merged segments contain data that honors the dimension based
partitioning?  
**@snlee:** For tables with custom partitioning enabled, merge/rollup tasks
will try to pick segments from the same partition on a best-effort basis (we
honor segments with 1 partition). If some segments have data from more than 1
partition, we fall back to the original behavior.  
**@ashish:** Thanks  

###  _#random_

  
 **@kiril.lstpd:** @kiril.lstpd has joined the channel  
 **@sainikeshk:** @sainikeshk has joined the channel  
 **@bagi.priyank:** @bagi.priyank has left the channel  

###  _#troubleshooting_

  
 **@anusha.munukuntla:** Hello Team, I see that the new version 0.9.0 has been
released. I am trying to enable authentication but I am unable to. Could
someone please guide me? Is there any documentation available for that?
Thanks in advance.  
**@mayanks:** Hi @anusha.munukuntla you can refer to:  
**@very312:** Hello team, my team wants to make a table which has 3
timestamp cols and a few other string cols. While constructing the table
schema, when we added two timestamp cols (unixtime milli) to the dimension
spec, the topic couldn't consume the event. Could you please tell us why?  
**@mayanks:** Put all 3 time stamps in the datetimefieldspec. Specify the
actual time column in table config.  
**@very312:** What do you mean by “specify the actual time column in table
config?” Does it mean we need to assign certain time like below?  
**@very312:** I’ve just tried the schema below, but it still doesn’t work.  
**@mark.needham:** can you share the error message that you're seeing?  
**@mayanks:** As Mark mentioned, if you can share more details on what
issue/error you run into, that would be helpful.  
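
As a rough sketch of the suggestion above (all three timestamps in the date-time field specs), a schema might look like the following. The schema name and every column name here are hypothetical, and the epoch-millisecond format is assumed from the "unixtime milli" description.

```json
{
  "schemaName": "myEvents",
  "dimensionFieldSpecs": [
    { "name": "userId", "dataType": "STRING" }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "eventTime",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "createdTime",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    },
    {
      "name": "updatedTime",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
```

Under these assumptions, "the actual time column in table config" would presumably mean setting `timeColumnName` under `segmentsConfig` to one of these columns, for example `eventTime`.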
 **@valentin:** Hello, I was wondering if it is planned to add the `LIKE`
operator to `JSON_MATCH`? I’m currently using
```REGEXP_LIKE(JSONEXTRACTSCALAR("labels", '$.demande_intention', 'STRING'),
'terminal')``` but it’s very slow, even with a small number of scanned
documents (21). Maybe having it directly in `JSON_MATCH` could speed up this
operation? ```JSON_MATCH("labels", 'demande_intention LIKE ''terminal''')```
Thank you  
**@atri.sharma:** Can you open an issue about it and tag me, please?  
**@valentin:** done  
**@richard892:** on this in particular, there are a lot of performance
problems in the JsonPath library, but I've fixed a lot of them. For instance,
since you mentioned that your labels are sparse, JsonPath throws an exception
when it can't produce a value. Unfortunately the changes haven't been released
yet.  
**@richard892:** Fixing this won't be as effective as a specialised index, but
it would be interesting to see how much it helps  
**@atri.sharma:** Yeah, I am only looking at supporting LIKE within JSON_MATCH  
**@atri.sharma:** Even for completeness' sake  
**@valentin:** For exceptions, I’ve seen this, I use a `JSON_MATCH(labels,
'demande_intention IS NOT NULL')` before trying to extract  
**@richard892:** well, the exception is still thrown but caught  
**@valentin:** Do you have any idea when you will be able to implement the
LIKE and release it?  
**@atri.sharma:** I can't promise a timeline, but should be a part of the next
Pinot release for sure  
**@valentin:** And you think using the LIKE with JSON_MATCH will be more
performant than using JSONEXTRACTSCALAR ?  
 **@nair.a:** Hi Team, I was trying out the "comparisonColumn" config of
upsertConfig, but it seems the table config is not accepting it. After
updating the table config, the config is still the same, as shown below:
`"upsertConfig": { "mode": "FULL" },` Can someone please help?  
 **@kiril.lstpd:** @kiril.lstpd has joined the channel  
 **@sainikeshk:** @sainikeshk has joined the channel  

###  _#getting-started_

  
 **@shubhendu.goswami:** @shubhendu.goswami has joined the channel  