Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2020/10/06 02:00:17 UTC

Apache Pinot Daily Email Digest (2020-10-05)

#general

@suwanda.hendry: @suwanda.hendry has joined the channel
@wojciech: @wojciech has joined the channel
@wojciech: Hi guys, I'm looking into adopting Pinot in our organisation and it looks like a good fit! The problem I'm trying to solve is moving away from BigQuery for daily Superset dashboards and using Pinot in user-facing apps. I have pre-cubed data coming from Spark/Snappy into Kafka and want to use it as a source in Pinot. The only problem is that the data is "append-only" (it comes from Debezium), so all create and update records arrive in one stream (say a user bets on sport: his ticket is created with state "accepted", then the ticket changes state to winning or non-winning and we get another record in Kafka). In BQ we use `row_number() over (partition by ticket_number order by source.lsn DESC)`, which numbers rows with 1 as the newest, and then we filter for row number = 1 in a sub-query (docs: <https://u17000708.ct.sendgrid.net/ls/click?upn=1BiFF0-2FtVRazUn1cLzaiMcdN0SE1Hov52ehRdpfp8kg0j45-2FdY7mbsE7aR7ReWZwOuiPOcpRge6ySQ8I1rUVtV5K8lIfo8fNHhbuaP0AR29o6Ck3gdYeh-2B8Ts-2FLkBLfIPdWXHlKmqfnLXs76rr350g-3D-3DhU6y_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTy3PRQIaB0EnVZB7xdS07OqeG5wdxRjTXAK23nn9wk2kDL4CpBKW-2Bl06N4cQ-2FK07UMoYE0n4ygqrsd0Eg8LmdH-2BW6StaWyfrtbA20CAHwuNKwsBBGdV-2F9kJZW-2FjQL0cfmrFSIhFTfR25owF8dkH4EWFvzLS8uXudoVhFg4aLjyzz5cgrRa1sD0owQ274-2BKuets-3D>). How would you solve this in Pinot? I didn't find windowing/analytic functions. Thanks in advance!
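Pinot (as of this digest) has no window or analytic functions, so the `row_number ... = 1` pattern cannot be expressed at query time. A hedged sketch of a common alternative, not an answer given in the thread: Pinot's upsert mode, released after 0.5.0 (in 0.6.0), dedups at ingestion time so queries see only the latest record per primary key. Table, column, and Kafka names below are illustrative assumptions, not from the thread.

Schema sketch, declaring the primary key:

{
  "schemaName": "tickets",
  "dimensionFieldSpecs": [
    {"name": "ticket_number", "dataType": "LONG"},
    {"name": "state", "dataType": "STRING"}
  ],
  "dateTimeFieldSpecs": [
    {"name": "updated_at", "dataType": "LONG",
     "format": "1:MILLISECONDS:EPOCH", "granularity": "1:MILLISECONDS"}
  ],
  "primaryKeyColumns": ["ticket_number"]
}

Realtime table config sketch, enabling upserts:

{
  "tableName": "tickets",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "tickets",
    "timeColumnName": "updated_at",
    "replicasPerPartition": "1",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "30"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.consumer.type": "lowlevel",
      "stream.kafka.topic.name": "tickets",
      "stream.kafka.broker.list": "localhost:9092",
      "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
      "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
    }
  },
  "routing": {"instanceSelectorType": "strictReplicaGroup"},
  "upsertConfig": {"mode": "FULL"},
  "tenants": {},
  "metadata": {}
}

Upserts need a realtime table on the low-level consumer, the Kafka topic partitioned by the primary key, and strictReplicaGroup routing; Pinot then serves the row with the greatest time-column value per key (so ordering follows `updated_at` here, not `source.lsn`). A query such as `SELECT state FROM tickets WHERE ticket_number = 42` then returns only the current state, with no sub-query.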
@mail2samarthk: @mail2samarthk has joined the channel
@tirumalap: hello there, I'm getting the following exception:
`Caused by: java.lang.IllegalArgumentException: Parameter 'Bucket' must not be null`
I am using 0.5.0, with this job spec:

executionFrameworkSpec: {name: standalone,
 segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner,
 segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner,
 segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner}
includeFileNamePattern: glob:**/*.parquet
inputDirURI: 's3://edp-pinot-data/nem13/'
jobType: SegmentCreationAndUriPush
outputDirURI: 's3://edp-pinot-segments/nem13/segments'
overwriteOutput: true
pinotClusterSpecs:
- {controllerURI: '<https://u17000708.ct.sendgrid.net/ls/click?upn=iSrCRfgZvz-2BV64a3Rv7HYX9MrNSia07OjpICnslY2pE-3DuDJr_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTy3PRQIaB0EnVZB7xdS07OqYGTCJ4T9luciUg-2B3DE1Ja-2Ff31HIhteXQWcV6tDx4e4MAJPGW7xaTk-2FiVUQo7CiKkw9vrcLa2uELROtHxn1Z7WtU2oYOUGkxwLbpaZYy1V4akgwi-2FkJh-2Bu2KZl4Jf8a8-2BBXRG1bQwcD1B7jWO0LVwVETPqVAMBqrkY1xGKZP-2FBvE-3D>'}
pinotFSSpecs:
- {className: org.apache.pinot.spi.filesystem.LocalPinotFS, configs: null, scheme: file}
- className: org.apache.pinot.plugin.filesystem.S3PinotFS
 configs: {region: ap-southeast-2}
 scheme: s3
pushJobSpec: {pushAttempts: 1, pushParallelism: 1, pushRetryIntervalMillis: 1000,
 segmentUriPrefix: 's3://edp-pinot-segments', segmentUriSuffix: null}
recordReaderSpec: {className: org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader,
 configClassName: null, configs: null, dataFormat: parquet}
segmentNameGeneratorSpec: null
tableSpec: {schemaURI: '<https://u17000708.ct.sendgrid.net/ls/click?upn=iSrCRfgZvz-2BV64a3Rv7HYe8MIpHtV99AuKK5kkWoXnDMXkKpb8u85O5wU3AKHTs7xm8PebeNrgSNczjw-2BOYtcA-3D-3D1lHs_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTy3PRQIaB0EnVZB7xdS07OqyWIOjB1qUxrB30-2BnwLzRT-2Bza6w1oLorXpPagbozBPCiEyY4PXlTLIQ5jo-2FyzmUhTVJnE6nacHPlQWRwcs-2FBZejSeqLi9Lp2y890aDFuoCGTAlvyVuLuDS5G9Kn-2B1KMXmj31k2REOV1G7zsh2l4uw4oPR1jw8kLHVxd3Z3hJa6AE-3D>', tableConfigURI: '<https://u17000708.ct.sendgrid.net/ls/click?upn=iSrCRfgZvz-2BV64a3Rv7HYe8MIpHtV99AuKK5kkWoXnBQFxSnLNHnmx-2ByJtTVxkrVl_PE_vGLQYiKGfBLXsUt3KGBrxeq6BCTMpPOLROqAvDqBeTy3PRQIaB0EnVZB7xdS07OqBd59genGpBMi8RWfUBiubGRcVSEktEYwKQJACFnbQdh5hIEkMW2bPMlAj-2B1CpykO-2FTsXMisNeszc7o1VHq09cyKJU8z3L0z9IBqcu-2FlBUXMsCSeKDInhOmzf8LsztlazLRIK-2Bgp7zf0DT3ZKRzf-2Fc6Q3ZNT5my2RJxvIdgSH5Ik-3D>',
 tableName: nem13}

Am I missing anything? Please help!!!

#random

@suwanda.hendry: @suwanda.hendry has joined the channel
@wojciech: @wojciech has joined the channel
@mail2samarthk: @mail2samarthk has joined the channel
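On the `Parameter 'Bucket' must not be null` error in #general above, a hedged note (an assumption about the cause, not a fix confirmed in the thread): the job spec registers S3PinotFS only for the ingestion job, but with `jobType: SegmentCreationAndUriPush` the controller later fetches each segment from the pushed `s3://` URI itself, and if no S3 filesystem is registered on the controller the AWS SDK can end up with a null bucket. A minimal sketch of the controller-side settings documented for S3 deep storage, reusing the region and output path from the job spec above:

# controller.conf -- register S3PinotFS on the controller (sketch; property
# names from the Pinot S3 deep-storage docs, values mirror the job spec)
controller.data.dir=s3://edp-pinot-segments/nem13/segments
controller.local.temp.dir=/tmp/pinot-tmp-data
pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.controller.storage.factory.s3.region=ap-southeast-2
pinot.controller.segment.fetcher.protocols=file,http,s3
pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher

Servers need the matching `pinot.server.storage.factory.*` and `pinot.server.segment.fetcher.*` settings. If S3 is already configured on the controller, another candidate is the segment URI itself, e.g. a `segmentUriPrefix`/`segmentUriSuffix` combination that yields an `s3://` URI without a bucket.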