Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2022/05/14 11:19:19 UTC

Apache Pinot Daily Email Digest (2022-05-14)

### _#general_

  
 **@ysuo:** Hi team, I have a column with the property below. When I query this
field, part of its value is replaced with `…..>>>ignored size:5051`. Does that
mean the maxLength is smaller than the actual size of the string content? Or
does Pinot store the whole string but not show the full content when queried?
`{ "name": "content", "dataType": "STRING", "maxLength": 17825792 }`  
**@mayanks:** Default max length is 512 for strings. You can change it in the
schema  
**@ysuo:** Actually I have changed it to 17825792.  
**@ysuo:** Sorry, my bad. The data in Kafka is the same.:sweat_smile:  
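For reference, a minimal schema sketch for raising the limit (the column matches the one in the thread; treat the rest as illustrative):
```
{
  "dimensionFieldSpecs": [
    {
      "name": "content",
      "dataType": "STRING",
      "maxLength": 17825792
    }
  ]
}
```
Note that `maxLength` is enforced when records are ingested, so rows ingested while the old 512-character default was in effect generally stay truncated until they are re-ingested.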
 **@zhixun.hong:** @zhixun.hong has joined the channel  
 **@mapshen:** Hi, according to the docs: > with `unnestFields`, a record with
a nested collection will unnest into multiple records. If we have two fields to
unnest, and each field is an array of 4 elements, does that mean we would get
4 * 4 = 16 records?  
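If Pinot unnests each listed field independently, the result is a cross product, so 4 × 4 = 16 records. A hypothetical sketch with 2-element arrays (field names invented):
```
input record:           {"id": 1, "a": [1, 2], "b": ["x", "y"]}

unnesting "a" and "b":  {"id": 1, "a": 1, "b": "x"}
                        {"id": 1, "a": 1, "b": "y"}
                        {"id": 1, "a": 2, "b": "x"}
                        {"id": 1, "a": 2, "b": "y"}
```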
 **@nizar.hejazi:** Created a GitHub issue proposing that selection queries
return null (instead of the default value) in the response if a config is
set. Please review and let me know your input.  
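For context on the proposal: Pinot can already track nulls per column when null handling is enabled at the table level, which is what would make a "return null instead of the default value" response option possible. A minimal sketch of that existing switch:
```
{
  "tableIndexConfig": {
    "nullHandlingEnabled": true
  }
}
```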
 **@rbobbala:** @rbobbala has joined the channel  

###  _#random_

  
 **@zhixun.hong:** @zhixun.hong has joined the channel  
 **@rbobbala:** @rbobbala has joined the channel  

###  _#troubleshooting_

  
 **@saumya2700:** One of the realtime tables is skipping records. We haven't
found any OOM-related issues in the logs, and the external view and segment
health are both good. The only log line that looks different from the others
for this table is below. Is it indicating something fishy, or is it just info?
```
Processed requestId=6212,table=tSCalibrationAttempt_REALTIME,segments(queried/processed/matched/consuming)=4/2/0/2,schedulerWaitMs=1,reqDeserMs=0,totalExecMs=0,resSerMs=0,totalTimeMs=1,minConsumingFreshnessMs=1652072636087,broker=Broker_pinot-broker-0.pinot-broker-headless.pinot.svc.cluster.local_8099,numDocsScanned=0,scanInFilter=0,scanPostFilter=0,sched=FCFS,threadCpuTimeNs(total/thread/sysActivity/resSer)=0/0/0/0
```
**@kharekartik:** Skipping records as in they don't show up in queries?  
 **@zhixun.hong:** @zhixun.hong has joined the channel  
 **@mapshen:** Hi there, we have a field MSGDATETIME with values like
"2022-05-13T18:21:25.444Z" and are trying to declare it as a time column in
the Pinot schema. However, with the following configuration: >
`"dateTimeFieldSpecs": [` > `{` > `"name": "MSGDATETIME",` > `"dataType":
"STRING",` > `"format":
"1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.fff'Z'",` >
`"granularity": "1:MILLISECONDS"` > `}` > `],` Pinot reports: > invalid
datetime format: 1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.fff'Z'
What would be the right way to write this schema?  
**@mapshen:** nvm, figured out why: I was mixing the 'f' notation with the 'S'
notation.  
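For anyone hitting the same error: Pinot's SIMPLE_DATE_FORMAT uses Java date/time patterns, where fraction-of-second is `S` (the `f` notation comes from .NET-style format strings). A sketch of the corrected spec for values like `2022-05-13T18:21:25.444Z`:
```
"dateTimeFieldSpecs": [
  {
    "name": "MSGDATETIME",
    "dataType": "STRING",
    "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
    "granularity": "1:MILLISECONDS"
  }
]
```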
 **@rbobbala:** @rbobbala has joined the channel  
 **@snlee:** Has anyone faced this issue with LZ4 compression? ```
net.jpountz.lz4.LZ4Exception: Malformed input at 13
    at net.jpountz.lz4.LZ4JavaUnsafeSafeDecompressor.decompress(LZ4JavaUnsafeSafeDecompressor.java:180) ~[lz4-java-1.7.1.jar:?]
    at net.jpountz.lz4.LZ4SafeDecompressor.decompress(LZ4SafeDecompressor.java:145) ~[lz4-java-1.7.1.jar:?]
    at org.apache.pinot.segment.local.io.compression.LZ4Decompressor.decompress(LZ4Decompressor.java:42) ~[pinot-segment-local-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.segment.local.segment.index.readers.forward.BaseChunkSVForwardIndexReader.decompressChunk(BaseChunkSVForwardIndexReader.java:137) ~[pinot-segment-local-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.segment.local.segment.index.readers.forward.BaseChunkSVForwardIndexReader.getChunkBuffer(BaseChunkSVForwardIndexReader.java:118) ~[pinot-segment-local-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.segment.local.segment.index.readers.forward.VarByteChunkSVForwardIndexReader.getStringCompressed(VarByteChunkSVForwardIndexReader.java:72) ~[pinot-segment-local-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.segment.local.segment.index.readers.forward.VarByteChunkSVForwardIndexReader.getString(VarByteChunkSVForwardIndexReader.java:61) ~[pinot-segment-local-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.segment.local.segment.index.readers.forward.VarByteChunkSVForwardIndexReader.getString(VarByteChunkSVForwardIndexReader.java:35) ~[pinot-segment-local-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.dociditerators.SVScanDocIdIterator$StringMatcher.doesValueMatch(SVScanDocIdIterator.java:176) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.dociditerators.SVScanDocIdIterator.applyAnd(SVScanDocIdIterator.java:88) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.docidsets.AndDocIdSet.iterator(AndDocIdSet.java:128) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.DocIdSetOperator.getNextBlock(DocIdSetOperator.java:67) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.DocIdSetOperator.getNextBlock(DocIdSetOperator.java:38) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.ProjectionOperator.getNextBlock(ProjectionOperator.java:61) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.ProjectionOperator.getNextBlock(ProjectionOperator.java:33) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.transform.PassThroughTransformOperator.getNextBlock(PassThroughTransformOperator.java:48) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.transform.PassThroughTransformOperator.getNextBlock(PassThroughTransformOperator.java:31) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.query.AggregationGroupByOrderByOperator.getNextBlock(AggregationGroupByOrderByOperator.java:107) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.query.AggregationGroupByOrderByOperator.getNextBlock(AggregationGroupByOrderByOperator.java:46) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.combine.GroupByOrderByCombineOperator.processSegments(GroupByOrderByCombineOperator.java:137) ~[pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:100) [pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40) [pinot-core-0.10.0-dev-471.jar:0.10.0-dev-471-91c2ebbf297c4bf3fecb5f98413e9f00e324e2dc]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) [guava-30.1.1-jre.jar:?]
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) [guava-30.1.1-jre.jar:?]
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) [guava-30.1.1-jre.jar:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:834) [?:?]
```
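No resolution in the thread, but for context: LZ4 is a per-column raw forward index codec selected via `fieldConfigList` in the table config, roughly along these lines (column name hypothetical):
```
"fieldConfigList": [
  {
    "name": "someRawColumn",
    "encodingType": "RAW",
    "compressionCodec": "LZ4"
  }
]
```
A "Malformed input" error from the decompressor means the stored chunk bytes do not decode as valid LZ4, which usually points at a corrupted segment or a writer/reader codec mismatch rather than a problem with the query itself.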
 **@ysuo:** Hi team, one of my tables shows a dead status. I found that one of
its segments is in ERROR state, and queries show that segment as unavailable.
I tried resetting the segment and the call responded successfully, but the
segment is still in ERROR state and is still unavailable when querying the
table. Is there something I can do?  
**@ysuo:**  
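For reference, the reset call in question is the controller's segment reset API, which disables and re-enables the segment so the server reloads it; a sketch (host, table, and segment names illustrative):
```
curl -X POST "http://localhost:9000/segments/myTable_REALTIME/mySegment_0_0/reset"
```
If the server hits the same failure while reloading, the segment lands back in ERROR state, so the server log around the reload is the place to look for the root cause.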

###  _#thirdeye-pinot_

  
 **@sandeep908:** @sandeep908 has joined the channel  
 **@zhixun.hong:** @zhixun.hong has joined the channel  
 **@zhixun.hong:** Hello, I'm Luy, and I'm working on integrating data from a
Pinot data source.  
 **@zhixun.hong:** I have already set up the data source in Pinot, but I can't
see it in ThirdEye.  
 **@zhixun.hong:** How can I import Pinot data into ThirdEye?  
 **@luys8611:** @luys8611 has joined the channel  
 **@luys8611:** @luys8611 has left the channel  
 **@zhixun.hong:** Can anyone help me with this?  
 **@g.kishore:** @pyne.suvodeep ^^  
 **@pyne.suvodeep:** Hey @zhixun.hong, did you try the getting-started guide I
shared with you earlier? After that, add the Pinot data source.  
**@zhixun.hong:** I tried that, but it seems to be the wrong link. I can't find
the thirdeye subfolder, and there are differences between the repo and the docs.  
 **@zhixun.hong:** I found this repo.  
**@zhixun.hong:** Is it working code?  
 **@zhixun.hong:** I was able to install Pinot in Docker and add a table in the
Pinot dataset manager. I can also run the ThirdEye frontend, and now I want to
import Pinot data.  
 **@zhixun.hong:**  
 **@zhixun.hong:** But I can't find this config file in the git repo above.  
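For what it's worth, the classic (pre-rewrite) ThirdEye read its Pinot connection from a `data-sources-config.yml`. The sketch below is from memory of that layout; every value, including the class name, is an assumption and should be checked against the version of the repo you are running:
```
dataSourceConfigs:
  - className: com.linkedin.thirdeye.datasource.pinot.PinotThirdEyeDataSource
    properties:
      zookeeperUrl: localhost:2181
      clusterName: PinotCluster
      controllerConnectionScheme: http
      controllerHost: localhost
      controllerPort: 9000
```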

###  _#getting-started_

  
 **@zhixun.hong:** @zhixun.hong has joined the channel  
 **@rbobbala:** @rbobbala has joined the channel  

###  _#introductions_

  
 **@zhixun.hong:** @zhixun.hong has joined the channel  
 **@rbobbala:** @rbobbala has joined the channel  