Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/01/29 02:00:15 UTC

Apache Pinot Daily Email Digest (2021-01-28)

### _#general_

  
 **@jorgarcia1994:** @jorgarcia1994 has joined the channel  
 **@g.kishore:** "Intro to Pinot" session by @chinmay.cerebro - starting now.  
**@wrbriggs:** I have to leave early, but I’m looking forward to seeing the
rest of the presentation via the recording - thank you @chinmay.cerebro for
providing all this information to the community, and @karinwolok1 for
organizing this!  

###  _#random_

  
 **@jorgarcia1994:** @jorgarcia1994 has joined the channel  
 **@asmamnoor96:** hello guys, I am new to Pinot, any suggestions on where to start?  
**@fx19880617:** hi, welcome! You can start from the Pinot documentation for the Pinot intro:  
**@fx19880617:** then try getting Pinot running in your own environment:  
**@asmamnoor96:** okay.. thanx :slightly_smiling_face:  
 **@apadhy:** Hey guys, I am planning to do a POC on Pinot for real-time analytics over Kafka. I wanted to understand: does it support joins across multiple Kafka topics, and how efficient is it compared to Flink at this time?  
**@wrbriggs:** Pinot by itself does not handle joining Kafka streams at
ingestion time. It does support defining a dimension table, and using a
`lookup` UDF to do star-schema style dimension lookups. It cannot do standard
SQL-style `JOIN` operations between multiple tables, but PrestoDB can use
Pinot as a back-end to accomplish full ANSI-SQL joins:  
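The `lookup` UDF mentioned above takes the dimension table name, the column to fetch, the dimension table's key column, and the fact-side key value. A minimal sketch, assuming hypothetical tables `orders` (fact) and `customers` (a Pinot dimension table keyed on `customerId`) - the names are for illustration only:

```sql
-- Hypothetical tables; "customers" must be configured as a dimension table
-- with "customerId" as its primary key.
SELECT orderId,
       amount,
       LOOKUP('customers', 'customerName', 'customerId', customerId) AS customerName
FROM orders
LIMIT 10
```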
**@wrbriggs:** Also, you probably will be better off asking questions like
this in <#CDRCA57FC|general>, as this <#CDRJ5UE21|random> channel is intended
more for non-technical discussion.  
**@wrbriggs:** if your use case requires stream -> stream joins on unbounded
inputs (e.g., two infinite Kafka streams), IMO you would be better off doing
that work in Spark, Flink, or Kafka Streams, where you have better control
over the time window for data retention, the join logic, and any subsequent
transformations or projections of the join result, and then ingesting the
output into Pinot for query-time analysis. I'm not affiliated with the Pinot
project, so take my opinion with a whole handful of salt - people much
smarter than I am might have better solutions for you
:slightly_smiling_face:  
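To make the windowing point concrete, here is a toy, in-memory sketch of a key-based stream-stream join with a time window - this is not Pinot, Flink, or Kafka Streams code, just an illustration of why the retention window controls which events can pair up:

```python
from collections import defaultdict

def windowed_join(left, right, window_ms):
    """Toy inner join of two event streams on key: events pair up only if
    their timestamps are within window_ms of each other.
    Each event is a (timestamp_ms, key, value) tuple."""
    # Index the right stream by key for quick lookup.
    right_by_key = defaultdict(list)
    for ts, key, value in right:
        right_by_key[key].append((ts, value))
    results = []
    for ts, key, lval in left:
        for rts, rval in right_by_key[key]:
            if abs(ts - rts) <= window_ms:
                results.append((key, lval, rval))
    return results

clicks = [(1000, "u1", "click_home"), (9000, "u2", "click_cart")]
views = [(1500, "u1", "view_promo"), (20000, "u2", "view_promo")]
print(windowed_join(clicks, views, window_ms=5000))
# -> [('u1', 'click_home', 'view_promo')]  (u2's events are 11s apart, outside the window)
```

Real stream processors keep the indexed state bounded by evicting events older than the window, which is exactly the retention control mentioned above.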
**@g.kishore:** @wrbriggs is right!  
**@ssubrama:** @apadhy another possible solution is to use samza before
ingesting to pinot  
**@wrbriggs:** Sorry @ssubrama! I like Samza, I swear! I've never used it in a
production use case so I always forget to suggest it :(  
**@ssubrama:** no issues. At LinkedIn, many use cases use Samza to process
data first before ingesting into Pinot  

###  _#troubleshooting_

  
 **@jorgarcia1994:** @jorgarcia1994 has joined the channel  
 **@suraj:** Our brokers have been running into direct memory allocation OOM
errors. We have allocated 128M. Noticed that the brokers don't crash but catch
the exception and log it. The only symptom we see is query timeouts. Would
like to understand: *a) what is the direct memory used for? b) any guidelines
to size it?*  
**@g.kishore:** it's used by Netty; 128M is too little if you are moving a lot
of data between server and broker  
**@g.kishore:** increase it to 1G  
**@suraj:** thanks  
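For reference, the direct-memory cap discussed above is the JVM's `-XX:MaxDirectMemorySize` flag. A minimal sketch of raising it for a broker, assuming a deployment whose startup script honors `JAVA_OPTS` - the exact env var, heap sizes, and invocation below are illustrative and may differ in your setup:

```shell
# Raise the broker's direct-memory cap from 128M to the suggested 1G.
# JAVA_OPTS and the StartBroker arguments are illustrative; adapt to
# your own deployment scripts.
export JAVA_OPTS="-Xms4G -Xmx4G -XX:MaxDirectMemorySize=1G"
bin/pinot-admin.sh StartBroker -zkAddress localhost:2181
```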
 **@suraj:**
```
2021/01/22 00:15:36.706 ERROR [DataTableHandler] [nioEventLoopGroup-2-3] Caught exception while handling response from server: pinot-server-3_R
java.lang.OutOfMemoryError: Direct buffer memory
	at java.nio.Bits.reserveMemory(Bits.java:175) ~[?:?]
	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118) ~[?:?]
	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317) ~[?:?]
	at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:758) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.PoolArena$DirectArena.newUnpooledChunk(PoolArena.java:748) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.PoolArena.allocateHuge(PoolArena.java:260) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.PoolArena.allocate(PoolArena.java:232) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.PoolArena.reallocate(PoolArena.java:397) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:119) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.AbstractByteBuf.ensureWritable0(AbstractByteBuf.java:310) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:281) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1118) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1111) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1102) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:96) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281) ~[pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [pinot-all-0.6.0-jar-with-dependencies.jar:0.6.0-bb646baceafcd9b849a1ecdec7a11203c7027e21]
	at java.lang.Thread.run(Thread.java:834) [?:?]
```

###  _#pinot-dev_

  
 **@amrish.k.lal:** I get this error while running Quickstart.java (straight
out of the box). Has anything changed here?
```
TotalProcessed time for event: MessageChange took: 18 ms
Exception in thread "main" java.lang.RuntimeException: Failed to create IngestionJobRunner instance for class - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:137)
	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113)
	at org.apache.pinot.tools.BootstrapTableTool.bootstrapOfflineTable(BootstrapTableTool.java:189)
	at org.apache.pinot.tools.BootstrapTableTool.execute(BootstrapTableTool.java:99)
	at org.apache.pinot.tools.admin.command.QuickstartRunner.bootstrapTable(QuickstartRunner.java:207)
	at org.apache.pinot.tools.Quickstart.execute(Quickstart.java:180)
	at org.apache.pinot.tools.Quickstart.main(Quickstart.java:223)
Caused by: java.lang.ClassNotFoundException: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at org.apache.pinot.spi.plugin.PluginClassLoader.loadClass(PluginClassLoader.java:80)
	at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:293)
	at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:264)
	at org.apache.pinot.spi.plugin.PluginManager.createInstance(PluginManager.java:245)
	at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:135)
	... 6 more
```
 **@amrish.k.lal:** @fx19880617 I am wondering if this ^^ is related to  ?  
 **@fx19880617:** shouldn’t be, where are you running this quickstart?  
 **@amrish.k.lal:** In IntelliJ, right click and run  
 **@fx19880617:** are you running it in the pinot-distribution directory?  
 **@fx19880617:** hmm  
 **@fx19880617:** are you on master branch?  
 **@amrish.k.lal:** yes and without any changes.  
 **@fx19880617:** hmm  
 **@fx19880617:** have you tried building Pinot using Maven?  
 **@fx19880617:** I feel we may need to put pinot-batch-ingestion-standalone
in as a runtime dependency so that first-time users can run it through the IDE  
**@amrish.k.lal:** I find Quickstart useful for debugging; that's why I was
running it from IntelliJ.  
 **@amrish.k.lal:** Yes, I am able to build successfully with `mvn clean
install package -DskipTests -Pbin-dist -DdownloadSources -DdownloadJavadocs`  
 **@fx19880617:** ok  
 **@fx19880617:** can you try one thing, add
```
<dependency>
  <groupId>org.apache.pinot</groupId>
  <artifactId>pinot-batch-ingestion-standalone</artifactId>
  <version>${project.version}</version>
  <scope>runtime</scope>
</dependency>
```
into pinot-tools/pom.xml  
**@amrish.k.lal:** ok  
**@amrish.k.lal:** I still get the same error after selecting Quickstart.java,
right click and run/debug Quickstart.main()  
**@amrish.k.lal:** Is there any other way to run Quickstart under debugger?  
**@fx19880617:** can you try removing the `<scope>runtime</scope>`  
**@fx19880617:** and refreshing the module  
**@fx19880617:** when my IntelliJ builds this module  
**@fx19880617:** it copies all the resources  
**@fx19880617:** have you tried building the pinot module?  
**@amrish.k.lal:** Trying it...  
**@amrish.k.lal:** Ahh, works now :slightly_smiling_face:  
**@amrish.k.lal:** I removed `<scope>runtime</scope>` and reloaded the pinot-tools
project. Works fine after that :slightly_smiling_face:  
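For anyone hitting the same `ClassNotFoundException`, the resolution described in this thread amounts to the following block in `pinot-tools/pom.xml` (note the absence of `<scope>runtime</scope>`):

```xml
<!-- pinot-batch-ingestion-standalone on the default (compile) scope, so the
     IDE puts the plugin on Quickstart's run classpath -->
<dependency>
  <groupId>org.apache.pinot</groupId>
  <artifactId>pinot-batch-ingestion-standalone</artifactId>
  <version>${project.version}</version>
</dependency>
```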
**@amrish.k.lal:** Thanks.  

###  _#announcements_

  
 **@jorgarcia1994:** @jorgarcia1994 has joined the channel  
 **@g.kishore:** "Intro to Pinot" session by @chinmay.cerebro - starting now.  
**@chinmay.cerebro:** @chinmay.cerebro has joined the channel  

###  _#pinot-docs_

  
 **@ken:** I was looking into the `cleanUpOutputDir` flag - it doesn’t seem to
be documented yet. Seems like this really means
`deleteOutputSegmentAfterPush`, yes?  
 **@ken:** I was also looking into the `overwriteOutput` flag, which also
doesn’t seem to be documented. It seems to have different meanings in the
code…in the Hadoop & Spark batch ingest code, it’s checking whether the
staging dir/output/ directory already contains the segment tar file before
it’s copied from local, which seems odd since this directory is created at the
start of the job, and deleted at the end, so there shouldn’t be collisions. In
standalone batch ingest it’s checking whether the actual output file exists,
which seems correct. And in the `SegmentGenerationAndPushTaskGenerator` it’s
documented as “overwriteOutput - Optional, delete the output segment directory
if set to true”. I’m guessing the Hadoop/Spark code is a bug, and the flag
should be getting checked when the staging/segmentTar/ files are being copied
to the output dir, and the documentation in
`SegmentGenerationAndPushTaskGenerator` is wrong.  

###  _#getting-started_

  
 **@jorgarcia1994:** @jorgarcia1994 has joined the channel  
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pinot.apache.org
For additional commands, e-mail: dev-help@pinot.apache.org