Posted to dev@pinot.apache.org by Pinot Slack Email Digest <ap...@gmail.com> on 2022/04/06 02:00:29 UTC

Apache Pinot Daily Email Digest (2022-04-05)

### _#general_

  
 **@diana.arnos:** Hey there, which metric can I use to check the consumption
lag from the servers in comparison with the topic they are consuming from?  
**@npawar:** No metric as of now. There's a PR in progress which will display
the difference and potentially become a metric. In the meantime, the API being
modified (/consumingSegmentsInfo) already displays the currentOffset per
partition.  
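For reference, the per-partition offsets mentioned above can be fetched directly from the controller; a minimal sketch, assuming the default controller port and a placeholder table name (check your controller's Swagger UI for the exact path):

```
# Ask the controller for consuming-segment info for a realtime table;
# the response includes the current offset per partition.
curl -s "http://localhost:9000/tables/myTable/consumingSegmentsInfo"
```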
 **@ryantle1028:** Hello everyone, I'm trying to configure TLS/SSL on my Apache
Pinot cluster and it's still not working. Can anyone share how to enable
TLS/SSL on the cluster components (controller, broker, server, minion...) and
how to connect to Apache Pinot over TLS/SSL?  
**@dlavoie:** Hello, this doc should be pretty extensive:  
**@ryantle1028:** I already tried the parameters from that document; it doesn't
seem to work, so maybe I'm doing something wrong.  
**@ryantle1028:** Do you have an example of how to configure and connect to
Pinot with TLS/SSL? @dlavoie  
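As a rough illustration of the kind of settings the TLS/SSL guide covers for the controller (the property names below are recalled from the docs and should be verified for your Pinot version; paths and passwords are placeholders):

```
# Illustrative controller TLS settings only -- verify the exact keys against
# the official TLS/SSL guide for the version you run.
controller.access.protocols=https
controller.access.protocols.https.port=9443
controller.tls.keystore.path=/opt/pinot/tls/keystore.p12
controller.tls.keystore.password=changeit
controller.tls.truststore.path=/opt/pinot/tls/truststore.p12
controller.tls.truststore.password=changeit
```

Analogous TLS properties exist for the broker, server, and minion configs.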
 **@moinuddinmbd:** @moinuddinmbd has joined the channel  
 **@varun.mukundhan640:** Hey folks, just an observation: the Pinot Go client
doesn't have any provision to specify an HTTP/query timeout. It uses a client
that keeps the connection open indefinitely. Any way to circumvent this?  
 **@francois:** Hi. Starting to get my hands dirty with code to implement my
GDPR purge process. I'm struggling a bit with the build time and the lack of a
way to test. Is there any way to get a faster build to test with? I've used the
following Maven command `mvn install package -DskipTests -Pbin-dist` and it
takes more than 20 minutes to build :confused: Any way to make it faster?  
**@g.kishore:** we should have a profile to skip building all the plugins  
**@francois:** I've first tried just building pinot-minion (helps a bit
:slightly_smiling_face:), but then it's not included in the
jar-with-dependencies :confused:  
**@npawar:** you could try adding `-T 4`. Are you testing incrementally with a
full build every time? There are integration tests and quickstarts that can
help you test quickly in your IDE without doing the mvn build  
**@francois:** Yes, incrementally testing :/ I will try with the IDE
quickstart.  
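As a concrete sketch of the suggestions above, a narrower parallel build can help; `-pl`, `-am`, `-T`, and `-DskipTests` are standard Maven options, and the module name is taken from the thread:

```
# Build only the pinot-minion module plus the modules it depends on,
# skip tests, and use 4 parallel build threads.
mvn install -pl pinot-minion -am -DskipTests -T 4
```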
 **@tonya:** Hey folks! Just a reminder that in 30 minutes we're going to
start the meetup with @dunithd. Come join us if you can!  
**@drojas:** Hi there. Is there a way to transform a string field to a
double/float upon ingestion? I'm looking at the supported transform functions
and do not see a function that supports this  
**@npawar:** what sort of transformation? Could you give an example? If your
column is STRING datatype in the source but is actually numeric, and all you
want is for it to be stored as double/float in Pinot, that should already
work afaik. All you'd have to do is set the right dataType in the Pinot schema.
Pinot has a DataTypeTransformer which is automatically applied during
ingestion  
**@drojas:** Ah it does look like this is the case. Yeah, my column is a
STRING datatype but is actually a numeric value. Looks like the conversion is
applied automatically during ingestion. Thanks!  
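In other words, the conversion is driven by the schema alone; a minimal sketch of such a field spec (the column name is hypothetical):

```
{
  "metricFieldSpecs": [
    { "name": "price", "dataType": "DOUBLE" }
  ]
}
```

With this in place, string values such as "3.8" in the source are stored as DOUBLE after ingestion.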
 **@brianjzhan:** @brianjzhan has joined the channel  
 **@jdelmerico676:** @jdelmerico676 has joined the channel  
 **@gigi.dba:** @gigi.dba has joined the channel  
 **@jove:** @jove has joined the channel  
 **@hanseulnam:** @hanseulnam has joined the channel  
 **@zliu:** @zliu has joined the channel  

###  _#random_

  
 **@moinuddinmbd:** @moinuddinmbd has joined the channel  
 **@brianjzhan:** @brianjzhan has joined the channel  
 **@jdelmerico676:** @jdelmerico676 has joined the channel  
 **@gigi.dba:** @gigi.dba has joined the channel  
 **@jove:** @jove has joined the channel  
 **@hanseulnam:** @hanseulnam has joined the channel  
 **@zliu:** @zliu has joined the channel  

###  _#troubleshooting_

  
 **@diana.arnos:** Hello :wave: I have 3 of my 4 servers stuck with this kind
of message: ```Find unloaded segment: <tableName>__0__35__20220404T0749Z,
table: <tableName>_REALTIME, expected: ONLINE, actual: CONSUMING Sleeping 1
second waiting for all segments loaded for partial-upsert table:
<tableName>_REALTIME``` Which endpoint should I use to try to sort this out?
The `reload` one does not work, for the segment is still consuming and the
`reset` always fails, for it can't stop a consuming segment for some reason.
Would it be okay to just delete this segment? Would the Controller know it
needs to be consumed again?  
**@npawar:** deleting the consuming segment won't help. It'll get stuck.  
**@npawar:** @jackie.jxt ^ any idea about this error?  
**@jackie.jxt:** @diana.arnos Have you tried restarting the servers?  
**@jackie.jxt:** Not sure how you run into this scenario. Somehow the
consuming segment is already committed, but the failed servers haven't started
consumption yet  
 **@lars-kristian_svenoy:** Hello everyone. Is there any way to specify that
we do not want any indexes for a field? We are struggling with a very large
text blob, which seems to be stored in the indexes folder on the servers. We
want the data to only reside on our deep store, and not be stored on disk at
all. I’ve tried adding the field to the noDictionaryColumns and setting the
fieldConfig encodingType to RAW, but it still seems to be creating a forward
index which is stored on disk. Any ideas?  
**@lars-kristian_svenoy:** Here's output from `ls -laH`:
```
drwxr-xr-x 2 root root      4096 Mar  7 15:57 .
drwxr-xr-x 3 root root      4096 Mar  7 15:57 ..
-rw-r--r-- 1 root root 276774030 Mar  7 15:57 columns.psf
-rw-r--r-- 1 root root        16 Mar  7 15:57 creation.meta
-rw-r--r-- 1 root root      2617 Mar  7 15:57 index_map
-rw-r--r-- 1 root root     16669 Mar  7 15:57 metadata.properties
```  
**@lars-kristian_svenoy:** And looking at the index_map:
```
large_text_field.forward_index.startOffset = 17683196
large_text_field.forward_index.size = 255005500
```  
**@lars-kristian_svenoy:** From metadata.properties:
```
column.large_text_field.cardinality = -2147483648
column.large_text_field.totalDocs = 3629495
column.large_text_field.dataType = STRING
column.large_text_field.bitsPerElement = 31
column.large_text_field.lengthOfEachEntry = 0
column.large_text_field.columnType = DIMENSION
column.large_text_field.isSorted = false
column.large_text_field.hasNullValue = false
column.large_text_field.hasDictionary = false
column.large_text_field.textIndexType = NONE
column.large_text_field.hasInvertedIndex = true
column.large_text_field.hasFSTIndex = false
column.large_text_field.hasJsonIndex = false
column.large_text_field.isSingleValues = true
column.large_text_field.maxNumberOfMultiValues = 0
column.large_text_field.totalNumberOfEntries = 3629495
column.large_text_field.isAutoGenerated = false
column.large_text_field.defaultNullValue = null
```  
**@lars-kristian_svenoy:** Any help greatly appreciated  
**@richard892:** the forward index is just the storage for the column  
**@lars-kristian_svenoy:** Why is it storing it on disk?  
**@richard892:** it's called a forward index because it implicitly maps
"forward" from the value to the docId  
**@lars-kristian_svenoy:** So there is no way to further reduce the disk space
usage for that dimension field, except for enabling compression?  
**@richard892:** it should be compressed by default  
**@lars-kristian_svenoy:** Yeah that’s right, snappy compressed  
**@lars-kristian_svenoy:** The problem is it’s a very large base64 string  
**@lars-kristian_svenoy:** So my disks are exploding  
**@richard892:** we changed the default to LZ4 (LZ4_WITH_LENGTH is better)
because it decompresses faster and has a better ratio  
**@lars-kristian_svenoy:** I’m trying to figure out if there is any way for me
to prevent it from being stored on disk at all, and only retrieved from deep
store  
**@richard892:** yes, let me look into it  
**@lars-kristian_svenoy:** Thank you :slightly_smiling_face:  
**@richard892:** @npawar is the expert on this  
**@lars-kristian_svenoy:** :+1: Thank you. I’ll wait for more info on this..
In the meantime, looking to upgrade to 0.10  
**@lars-kristian_svenoy:** Perhaps I need to look into making this specific
field external to pinot  
**@lars-kristian_svenoy:** These text blobs are ~2-5 MB in size  
**@lars-kristian_svenoy:** So they really bloat my segments  
**@richard892:** there are some primitives in tiered storage for separating
indexes from data, but I'm not sure where the boundaries are  
**@lars-kristian_svenoy:** Tiered storage is not OSS right?  
**@richard892:** by the way, I've found in the past that base64 encoding can
confuse compression algorithms like LZ4 and Snappy because it scrambles data
across the byte boundaries those algorithms exploit  
**@richard892:** you can probably get a big improvement by not base64 encoding
and changing it to BYTES  
**@richard892:** then apply LZ4_WITH_LENGTH and use V4 raw index  
**@richard892:** > Tiered storage is not OSS right? no, we have a proprietary
implementation but the primitives to support that are open source  
**@lars-kristian_svenoy:** That makes sense. I could attempt doing that too  
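A rough sketch of the table-config direction described above, using the column name from this thread; whether `LZ4_WITH_LENGTH` and the V4 raw writer are available, and the exact property names, depend on the Pinot version, so treat this as illustrative:

```
"fieldConfigList": [
  {
    "name": "large_text_field",
    "encodingType": "RAW",
    "compressionCodec": "LZ4",
    "properties": {
      "rawIndexWriterVersion": "4"
    }
  }
]
```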
 **@moinuddinmbd:** @moinuddinmbd has joined the channel  
 **@saumya2700:** Hi everyone, Pinot is showing strange behavior after adding a
second broker: it skips data 4 times out of 10, even from the Pinot query
console. The server logs show no errors, and because we have 6-7 realtime
tables the logs fill up very quickly, so we aren't able to track down a
particular message. Is there any way we can define the groupId for the consumer
in the table config?  
**@npawar:** what exactly happens when you say Pinot is skipping data? are you
getting results different than what you expected or empty results?  
 **@eduardo.cusa:** Hello guys, we're using the `ingestFromFile` endpoint to
ingest data, but after some minutes the table is empty again. Do we need to set
up a backend?  
**@mark.needham:** hmmm, did you set a low retention time or something?  
**@mark.needham:** not really sure why else the data would disappear  
**@mayanks:** Yeah, sounds like you set really low retention, or don’t have
the right unit for time.  
**@eduardo.cusa:**
```
{
  "tableName": "ads31",
  "tableType": "OFFLINE",
  "segmentsConfig": {
    "replication": 1,
    "timeColumnName": "device_timestamp",
    "timeType": "MILLISECONDS",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": 365
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "tableIndexConfig": {
    "loadMode": "MMAP"
  },
  "ingestionConfig": {
    "batchIngestionConfig": {
      "segmentIngestionType": "APPEND",
      "segmentIngestionFrequency": "DAILY"
    }
  },
  "metadata": {}
}
```  
**@mayanks:** Ok, another possibility is that your data is more than 1 year
old  
**@mayanks:** Maybe try removing retention, if this is just for testing.  
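Since the table keys retention off `device_timestamp` in epoch milliseconds with a 365-day window, one quick sanity check (GNU `date`; the timestamp value is a placeholder) is to confirm the ingested rows are newer than that window:

```
# Convert an example device_timestamp (epoch millis) to a human-readable date;
# segments whose time range is older than 365 days are dropped by the retention manager.
date -d @$((1649116800000 / 1000))
```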
 **@david.cyze:** I'm trying to get Pinot running on a Linux VM, outside of
docker, with the quick start "transcript" data. Then, I want to query the data
using the Presto connector. Last week, the docs recommended version 0.9.3.
There were two issues I needed to resolve to get this to work:
• The `timestamp` column in the schema needed to be renamed (I chose
`timestamparoo`), because Presto queries interpreted `timestamp` as a casting
function as opposed to a column
• The `timeFieldSpec` field in the table schema needed to change to
`dateTimeFieldSpec`
After making these changes, I could ingest and query (mostly) fine. The docs
have since recommended changing to 0.10.0, which I have tried doing. However,
now when I run `./bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile
~/pinot-tutorial/transcript/batch-job-spec.yml`, I get an exception related to
the timestamp column:
```
Exception while collecting stats for column:timestamparoo in row:{
  "fieldToValueMap" : {
    "studentID" : 200,
    "firstName" : "Lucy",
    "lastName" : "Smith",
    "score" : 3.8,
    "gender" : "Female",
    "subject" : "Maths",
    "timestamparoo" : null
  },
  "nullValueFields" : [ ]
}
or.collect(LongColumnPreIndexStatsCollector.java:50) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
at org.apache.pinot.segment.local.segment.creator.impl.stats.SegmentPreIndexStatsCollectorImpl.collectRow(SegmentPreIndexStatsCollectorImpl.java:96) ~[pinot-all-0.10.0-jar-with-dependencies.jar:0.10.0-30c4635bfeee88f88aa9c9f63b93bcd4a650607f]
```
It seems Pinot isn't parsing the values for this column from the CSV. Why would
that be? (More supporting files in thread)  
**@david.cyze:** job spec:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/home/vagrant/pinot-tutorial/transcript/rawData/'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/home/vagrant/pinot-tutorial/transcript/segments/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
  configs:
    fileFormat: 'default'
tableSpec:
  tableName: 'transcript'
pinotClusterSpecs:
  - controllerURI: ''
```
csv:
```
studentID,firstName,lastName,gender,subject,score,timestamparoo
200,Lucy,Smith,Female,Maths,3.8,1570863600000
200,Lucy,Smith,Female,English,3.5,1571036400000
201,Bob,King,Male,Maths,3.2,1571900400000
202,Nick,Young,Male,Physics,3.6,1572418800000
```
Table schema:
```
{
  "schemaName": "transcript",
  "dimensionFieldSpecs": [
    { "name": "studentID", "dataType": "INT" },
    { "name": "firstName", "dataType": "STRING" },
    { "name": "lastName", "dataType": "STRING" },
    { "name": "gender", "dataType": "STRING" },
    { "name": "subject", "dataType": "STRING" }
  ],
  "metricFieldSpecs": [
    { "name": "score", "dataType": "FLOAT" }
  ],
  "dateTimeFieldSpecs": [
    {
      "name": "timestamparoo",
      "dataType": "LONG",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:MILLISECONDS"
    }
  ]
}
```
table config:
```
{
  "tableName": "transcript",
  "segmentsConfig": {
    "timeColumnName": "timestamparoo",
    "timeType": "MILLISECONDS",
    "replication": "2",
    "schemaName": "transcript"
  },
  "tableIndexConfig": {
    "invertedIndexColumns": [],
    "loadMode": "MMAP"
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "tableType": "OFFLINE",
  "metadata": {}
}
```  
**@mark.needham:** so you get that error with 0.10.0, but not with 0.9.3?  
**@david.cyze:** Correct  
**@ken:** Wasn’t there some change in 0.10 for handling null values? I see
`timestamparoo" : null` in the record that was rejected.  
**@ken:** Or is that just what gets displayed with a timestamp field that
can’t be parsed?  
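A quick, hedged way to test the "not parsed from the CSV" theory is to check that every row has the same field count as the header (plain awk; the file path is a placeholder under the job spec's `inputDirURI`):

```
# Print any data row whose comma-separated field count differs from the header's.
awk -F',' 'NR==1 {n=NF; next} NF!=n {print NR": "$0}' /home/vagrant/pinot-tutorial/transcript/rawData/transcript.csv
```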
 **@brianjzhan:** @brianjzhan has joined the channel  
 **@drojas:** Hi. I'm experimenting with Pinot in Kubernetes and used the
Pinot helm chart in the . The problem I am facing is that upon real-time
ingestion from my Kafka topic, the Pinot servers and table segments get into a
bad state after ingesting a few million records. The Pinot server pods
encounter a JRE fatal error and restart. The Broker reports `Failed to find
servers hosting segment: <segmentName> for table: <tableName> (all
ONLINE/CONSUMING instances: [] and OFFLINE instances: [] are disabled,
counting segment as unavailable)` Could this be due to an under-provisioned
Pinot cluster? Something else?  
**@mayanks:** My guess is that your servers are running out of memory. Would
need a bit more info, let me ping you.  
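If memory is indeed the issue, the knobs to look at are the server JVM options and pod resources in the Helm chart's values.yaml; a rough, illustrative sketch only (key names and sizes must be checked against the chart version in use):

```
# Illustrative sizing only -- verify key names against the Pinot Helm chart's values.yaml.
server:
  jvmOpts: "-Xms4G -Xmx4G -XX:MaxDirectMemorySize=8G"
  resources:
    requests:
      memory: "16Gi"
```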
 **@pankaj:** Hey, I am observing a peculiar behavior in my Pinot setup. We
have 3 offline servers, each with about 366 segments. When I run a single
query, all the servers respond fairly quickly and results come back within
100ms or so. When I run the same query in parallel, say about 100 times, one of
the servers shows a large scheduling delay (as printed by the server logs) but
its execution time is much smaller. Due to this, the overall queries now take
really long, up to 4 seconds or so. Has anyone seen this behavior? I have tried
attaching a profiler to see what is happening, and it indicates that the server
showing more scheduling delay is busier and doing more work than the others. I
am not able to figure out why this is the case. Any insights?  
**@mayanks:** This means you are under-provisioned for that load. But before
adding more resources, you probably want to check if you have optimized
indexing and other features like partition/sort/replica-groups etc. These will
help with throughput.  
**@pankaj:** Why would the behavior be different across different servers?  
**@pankaj:** That is what is throwing me off. If all of them showed scheduling
delays, I could understand it being under-provisioned  
**@mayanks:** How is the data partitioned? Is that server doing a lot more
work for that query? For example, it hosts most of the data that needs to be
scanned/processed for the query?  
**@mayanks:** You can check server logs on stats about how much processing it
had to do  
**@ken:** Also I’m assuming you aren’t running the controller or broker
processes on these servers, right? Only the Pinot server process?  
**@pankaj:** My setup has 3 k8s nodes; each node runs 1 offline server, 1
realtime server, 1 Kafka instance, 1 broker, and 1 controller. There is no
ingestion going on at this time.  
**@pankaj:** @mayanks what do I check for in the server logs? I am typically
looking at this line only:
```
2022/04/05 19:31:41.008 INFO [QueryScheduler] [pqr-2] Processed requestId=18389,table=kf_metrics_REALTIME,segments(queried/processed/matched/consuming)=53/53/53/-1,schedulerWaitMs=6171,reqDeserMs=1,totalExecMs=549,resSerMs=2,totalTimeMs=6723,minConsumingFreshnessMs=-1,broker=Broker_pinot-broker-0.pinot-broker-headless.kfuse.svc.cluster.local_8099,numDocsScanned=1724964,scanInFilter=32807,scanPostFilter=5174892,sched=fcfs,threadCpuTimeNs=0
```  
**@pankaj:** Data should be evenly distributed and based on segment count it
seems so;  
**@mayanks:** That is the correct log to look at. For the same requestId, you
can compare the numDocsScanned, totalExecMs and schedulerWait times. One of
these numbers has to be much higher for the slow server. And depending on
which one it is, we will know what’s wrong.  
**@mayanks:** Also, I take it that these are identical nodes in terms of
cpu/mem/jvm args etc?  
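As a hedged sketch of that comparison (the log file path is a placeholder; the field names come from the `QueryScheduler` line pasted above):

```
# On each server, pull the stats for the same requestId and compare
# schedulerWaitMs, totalExecMs, and numDocsScanned across servers.
grep 'requestId=18389,' pinotServer.log \
  | grep -oE 'schedulerWaitMs=[0-9]+|totalExecMs=[0-9]+|numDocsScanned=[0-9]+'
```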
 **@jdelmerico676:** @jdelmerico676 has joined the channel  
 **@gigi.dba:** @gigi.dba has joined the channel  
 **@jove:** @jove has joined the channel  
 **@hanseulnam:** @hanseulnam has joined the channel  
 **@zliu:** @zliu has joined the channel  

###  _#pinot-dev_

  
 **@varun.j:** @varun.j has joined the channel  
 **@francois:** @francois has joined the channel  
 **@moinuddinmbd:** @moinuddinmbd has joined the channel  

###  _#getting-started_

  
 **@moinuddinmbd:** @moinuddinmbd has joined the channel  
 **@brianjzhan:** @brianjzhan has joined the channel  
 **@jdelmerico676:** @jdelmerico676 has joined the channel  
 **@gigi.dba:** @gigi.dba has joined the channel  
 **@jove:** @jove has joined the channel  
 **@hanseulnam:** @hanseulnam has joined the channel  
 **@zliu:** @zliu has joined the channel  

###  _#releases_

  
 **@moinuddinmbd:** @moinuddinmbd has joined the channel  
 **@hanseulnam:** @hanseulnam has joined the channel  