Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2020/07/24 02:00:06 UTC

Apache Pinot Daily Email Digest (2020-07-23)

<h3><u>#general</u></h3><br><strong>@ssmgood: </strong>@ssmgood has joined the channel<br><strong>@blcksrx: </strong>Hey guys! I'm honored to announce that *Apache Pinot* can now be used with the *Python* SQL DB-API!
Please check this out:
<https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMeVicqEQpQl1k010IWh14BeZqRSPbbK4Ouq9qJtsyjBOm3uT_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTyK8AHHfac81Dbv1pNZTgrG5NCX-2FnPr7fASj4khSR-2BcNSWjpFxmQ0JML44AuuiEBI8eM4XXqsSjcVYeCRPJliI75XyNsnTcxcuu7W2-2BOeLxRYwNMqtIFKXayBHB9GR5NJ6q6e5HniYv796o58hTLG3UmsNPZSvwrSR6B4JQE7kUV2M72hbdAXctfBt-2Bz85b4qs-3D><br><strong>@damianoporta: </strong>Hello everybody! I finally got my small cluster up and running, thank you all for the support! :slightly_smiling_face: I am doing a final test to understand whether I need to add one more node. However, to make one thing a little clearer, I would like to know if we can "organize" data inside a Pinot Server by a specific column. For those of you who know Citus, I am referring to the distribution key for shards. Basically, if a specific column is often used in the GROUP BY clause, how can we store documents that share the same value of that column on the same server? I think this is important because, for example, in my custom aggregation function I need to sort the documents of each segment (in `aggregateGroupBySV()`) before working on them (I am trying to do something similar to what *window functions* do). I know that a Server holds multiple segments and that the order of documents within segments can be random, but if all the documents for a specific key were on the same server, I could avoid sorting everything again in `extractFinalResult()`, which is called at the Broker level. 
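To make the trade-off concrete, here is a minimal Python sketch (not Pinot code). The function names `aggregate_segment`, `merge_groups`, and `finalize`, and the sample rows, are hypothetical stand-ins for the per-segment, merge, and final-result steps; the order-dependent computation (last value minus first value per key) is only an illustration. It assumes all rows for a key live on one server, in which case each server can finalize its own keys and the Broker only has to combine disjoint results.

```python
from collections import defaultdict
from functools import reduce

def aggregate_segment(rows):
    """Per-segment step: collect (timestamp, value) pairs per group key, unsorted."""
    groups = defaultdict(list)
    for key, ts, value in rows:
        groups[key].append((ts, value))
    return groups

def merge_groups(a, b):
    """Merge step: concatenate the intermediate rows of two partial results."""
    merged = defaultdict(list)
    for part in (a, b):
        for key, rows in part.items():
            merged[key].extend(rows)
    return merged

def finalize(groups):
    """Final step: sort each group by timestamp, then compute last - first value."""
    result = {}
    for key, rows in groups.items():
        ordered = sorted(rows)
        result[key] = ordered[-1][1] - ordered[0][1]
    return result

# Hypothetical data: each server holds two segments, and every key
# lives on exactly one server (the co-location assumption).
server_a = [[("a", 3, 30), ("a", 1, 10)], [("a", 2, 20)]]
server_b = [[("b", 2, 5)], [("b", 1, 1), ("b", 3, 9)]]

def server_result(segments):
    """Aggregate, merge, and finalize entirely on one server."""
    return finalize(reduce(merge_groups, map(aggregate_segment, segments)))

# Server-level finalization: the broker only unions disjoint key sets.
server_level = {**server_result(server_a), **server_result(server_b)}

# Broker-level finalization: the broker merges every segment, then sorts.
all_segments = server_a + server_b
broker_level = finalize(reduce(merge_groups, map(aggregate_segment, all_segments)))

print(server_level == broker_level)
```

Both paths give the same answer here only because no key is split across servers; without that guarantee, the sort in `finalize` has to run where all partial results for a key finally meet.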
I know there is a `merge()` method used to merge the results of each segment. If I could do something after that merge, I could shift the whole computation to the Server level instead of the Broker, which I think is important; otherwise the Broker has to work with all the results from each Server and then sort and compute (in my case).<br><strong>@axitkhurana: </strong>@axitkhurana has joined the channel<br><strong>@mailtobuchi: </strong>Simple question: Does the broker send the list of segments to be queried to the Server along with the query? I think not, but I want to double-check.<br><h3><u>#random</u></h3><br><strong>@ssmgood: </strong>@ssmgood has joined the channel<br><strong>@axitkhurana: </strong>@axitkhurana has joined the channel<br><h3><u>#troubleshooting</u></h3><br><strong>@ronak: </strong>@ronak has joined the channel<br><strong>@yash.agarwal: </strong>What is the correct schema for a date column? I am using the following,
```{
      "name": "sls_d",
      "dataType": "STRING",
      "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd",
      "granularity": "1:DAYS"
}```
 but I am getting
```Caused by: java.lang.IllegalArgumentException: Invalid format: "null"
	at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187) ~[pinot-all.jar:0.4.0-8355d2e0e489a8d127f2e32793671fba505628a8]```<br><h3><u>#pinot-dev</u></h3><br><strong>@axitkhurana: </strong>@axitkhurana has joined the channel<br><strong>@jlli: </strong>Hi @npawar, I have a PR to fix the issue of incorrectly fetching the value of a multi-value column. Could you review it? <https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMSfW2QiSG4bkQpnpkSL7FiK3MHb8libOHmhAW89nP5XKS49vQ7VScMU5EvCZ5zt1AQ-3D-3DQAgj_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTyK8AHHfac81Dbv1pNZTgrGqBrkdHzo1-2Fph-2BPlHukltnXk-2BvPDv9nXohsCGRfV-2BDoI9ZIGaP0LatsKHRu18oTw1bbMLaKDyyYK0P-2FfouMuB1KjF0ZhNEMyeDOZKfwqtUF3TNr7TitfN9FmY5gZTT1vacr6Hy2GpZQeMs4xPAJ9BPJP6V5It4t8546HCeqblKqs-3D><br><strong>@npawar: </strong>@jlli, is it possible to fix the Avro files so that the field you’re interested in is not in a GenericRecord?<br><h3><u>#presto-pinot-streaming</u></h3><br><strong>@jackie.jxt: </strong>@elon.azoulay Did you get a chance to address the comments in the PR?<br><strong>@jackie.jxt: </strong>Can you please give me push access to your fork branch so that I can also work on it?<br><strong>@elon.azoulay: </strong>Will be pushing shortly, sure!<br><strong>@g.kishore: </strong>I got it to compile<br><strong>@g.kishore: </strong>It was just IntelliJ acting weird<br>