Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/06/05 02:00:17 UTC

Apache Pinot Daily Email Digest (2021-06-04)

### _#general_

  
 **@bryagd:** @bryagd has joined the channel  
 **@mercyshans:** @mercyshans has joined the channel  

###  _#random_

  
 **@bryagd:** @bryagd has joined the channel  
 **@mercyshans:** @mercyshans has joined the channel  

###  _#troubleshooting_

  
 **@bryagd:** @bryagd has joined the channel  
 **@patidar.rahul8392:** While loading data from HDFS into a Pinot table I'm getting this exception.  
`[r-2 apache-pinot-incubating-0.7.1-bin]$ hadoop jar ${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand -jobSpecFile /home/rah/executionFrameworkSpec.yaml`  
`Exception in thread "main" java.io.FileNotFoundException: /tmp/hadoop-unjar7575411926296177023/shaded/com/google/common/collect/ImmutableSetMultimap$EntrySet.class (No space left on device)`  
`at java.io.FileOutputStream.open0(Native Method)`  
`at java.io.FileOutputStream.open(FileOutputStream.java:270)`  
`at java.io.FileOutputStream.<init>(FileOutputStream.java:213)`  
`at java.io.FileOutputStream.<init>(FileOutputStream.java:162)`  
`at org.apache.hadoop.util.RunJar.unJar(RunJar.java:110)`  
`at org.apache.hadoop.util.RunJar.unJar(RunJar.java:85)`  
`at org.apache.hadoop.util.RunJar.run(RunJar.java:221)`  
`at org.apache.hadoop.util.RunJar.main(RunJar.java:148)`  
Someone kindly suggest.  
**@jmeyer:** Maybe something to do with `No space left on device` ?  
**@patidar.rahul8392:** Yes @jmeyer, it's related to a space issue in the /tmp
directory. Pinot is creating plugin files there, but I am not specifying /tmp
anywhere in my configuration. Is there any way to change this tmp location? I
want to move these files to a different location where I have enough space; in
the /tmp directory we don't have much space available  
**@patidar.rahul8392:** These are the files pinot is creating at /tmp location  
**@patidar.rahul8392:** Or is there any way to clean this /tmp location before
loading data for the next file? I am able to load data for 3 files, but when I
try for the 4th day I am getting this space issue  
**@jmeyer:** From what I read, the default volume size for Docker containers is
10 GB; there must be a setting to increase that value. Alternatively, you
could explicitly mount a volume which, iirc, would only be limited by its host
storage  
**@jmeyer:** But the opinion of someone more familiar with Pinot / Docker
would be appreciated :smile:  
**@jmeyer:** (if you're using docker at all, actually)  
**@patidar.rahul8392:** OK @jmeyer, I'm not using Docker  
**@patidar.rahul8392:** Actually, only 943 MB of space is available here, so I
am wondering if there is any way in Pinot to create these /tmp folder files at
another location.  
**@laxman:** @patidar.rahul8392: you can set `hadoop.tmp.dir` to the location
you want. The default value of this property is `/tmp/...`  
**@patidar.rahul8392:** OK, in which file do I need to set this? Or is it
something I can set directly with `export` in the terminal?  
**@patidar.rahul8392:**?  
**@laxman:** You have to set this in `core-site.xml`. Alternatively, Hadoop
configs can be set as Java properties as well: `-Dkey=value`  
**@patidar.rahul8392:** OK @laxman, here I am trying to load only a 2 KB file
and the space available is 943 MB, but it's still giving the space issue?  
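
For reference, the `core-site.xml` suggestion above might look like the following fragment. This is a minimal sketch under assumptions: the path `/data/hadoop-tmp` is a hypothetical placeholder for any directory on a volume with enough free space, and whether the `hadoop jar` unpacking step honors `hadoop.tmp.dir` (rather than the JVM's `java.io.tmpdir`) may depend on the Hadoop version, so it is worth verifying on the cluster in question.

```xml
<!-- core-site.xml (fragment; merge into the existing <configuration> element) -->
<configuration>
  <property>
    <!-- Base directory for Hadoop's temporary files -->
    <name>hadoop.tmp.dir</name>
    <!-- Hypothetical path (assumption): replace with a directory that has enough free space -->
    <value>/data/hadoop-tmp</value>
  </property>
</configuration>
```
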
 **@jmeyer:** Hello :slightly_smiling_face: I've just changed the topic from
which a REALTIME table is consuming, and some new messages are being published
on that topic. However, it looks like Pinot isn't consuming them. I can see a
segment with `"segment.realtime.status": "IN_PROGRESS"` / "CONSUMING". Also,
I'm not seeing any related logs. On the previous topic, consumption was OK
:heavy_check_mark:  
**@ssubrama:** You cannot change the topic on a live table. You need to drop
the table and recreate it.  
**@jmeyer:** Should we find a way to clarify this? Maybe deny the table
update? + documentation (if it does not already exist ^^)  
**@mayanks:** Agree on both. @jmeyer could you add it to FAQ?  
**@mayanks:** And add an issue for preventing the table update?  
**@jmeyer:** @mayanks Yep :slightly_smiling_face:  
**@mayanks:** :thankyou:  
**@jmeyer:** Issue:  
**@jmeyer:** PR on docs:  Hard to compare before / after given the binary docs
format. I guess it's much easier with Gitbook :slightly_smiling_face:  
**@mayanks:** :thankyou:  
 **@jmeyer:** Hello again. Not a Pinot-only question, but I'm sure most of you
had to deal with this issue, so here I go. Given a limited Kafka retention, how
do you handle recreating a table with past data that is no longer available in
Kafka? Basically, what is the "workflow" that you use to repopulate a Pinot
table from past data?  
**@mayanks:** Typically, folks ETL the Kafka data into a source-of-truth store
like HDFS. You can then backfill via an offline pipeline. However, this pattern
is applicable to hybrid tables  
**@mayanks:** For realtime-only tables, you are going to be limited by Kafka
retention  
**@jmeyer:** > For realtime-only tables, you are going to be limited by Kafka
retention  
Probably not a smart solution, but dumping past data (from some object store)
into Kafka and from there into the REALTIME table could be a solution too, I
guess? Otherwise, does using a hybrid table add a lot of complexity /
limitations?  
**@ssubrama:** @jmeyer you can also use  
**@jmeyer:** Thanks @ssubrama I'll give it a good read :slightly_smiling_face:  
**@ken:** Thanks @ssubrama - I didn’t know about this built-in support for
auto-offlining old data.  
 **@mercyshans:** @mercyshans has joined the channel  
 **@mercyshans:** Hi, I am wondering if this is the comprehensive list of
supported transformation functions. I am looking for other functions like
coalesce. Is there an upcoming feature list?  

###  _#getting-started_

  
 **@santosh.reddy:** @santosh.reddy has joined the channel  

###  _#releases_

  
 **@santosh.reddy:** @santosh.reddy has joined the channel  