Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2021/01/20 02:00:15 UTC

Apache Pinot Daily Email Digest (2021-01-19)

### _#general_

  
 **@sanjay.1506:** @sanjay.1506 has joined the channel  
 **@smcloete:** @smcloete has joined the channel  
 **@karinwolok1:** Welcome new Pinot community users! :wave: :wine_glass:
@ranemihir45 @james.zhao619 @parvezht @moonbow @humengyuk18 @sandeep
@robmandinga @dmettem @huwfoley @qiaoliang310 @davideberdin @pavel @gstbimo
@sanjay.1506 @smcloete!! Would love to know what brought you here! :smile:  
**@pavel:** Thank you Karin :wine_glass: At my company we were looking to
replace Elasticsearch and decided on Pinot! I am the DevOps guy here, so I was
looking for a community where I could learn more about operating a Pinot
cluster, and all that that entails :slightly_smiling_face:  
**@davideberdin:** Thank you Karin for the warm welcome :) I discovered Pinot
recently and I liked how easy it was to set up and use! So, perhaps
foolishly, I decided to make a Kubernetes Operator for Pinot :) you can
find it here  
**@kennybastani:** @pavel Awesome. Would love to chat sometime and hear more
about the DevOps journey. I'll DM.  

###  _#random_

  
 **@sanjay.1506:** @sanjay.1506 has joined the channel  
 **@smcloete:** @smcloete has joined the channel  

###  _#feat-text-search_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#feat-presto-connector_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#group-by-refactor_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#pinot-power-bi_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#troubleshooting_

  
 **@humengyuk18:** Has anyone had experience setting up Pinot deep storage
with HDFS? I have Pinot connected to HDFS, but the controller fails with an
error saying the data dir is not a directory. Below is the stack trace:  
**@humengyuk18:**
```
java.lang.RuntimeException: Caught exception while initializing ControllerFilePathProvider
    at org.apache.pinot.controller.ControllerStarter.initControllerFilePathProvider(ControllerStarter.java:489) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.ControllerStarter.setUpPinotController(ControllerStarter.java:330) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.ControllerStarter.start(ControllerStarter.java:287) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:116) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:91) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:234) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startPinotService(StartServiceManagerCommand.java:286) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.startBootstrapServices(StartServiceManagerCommand.java:233) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.execute(StartServiceManagerCommand.java:183) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.command.StartControllerCommand.execute(StartControllerCommand.java:130) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:164) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:184) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
Caused by: org.apache.pinot.controller.api.resources.InvalidControllerConfigException: Caught exception while initializing file upload path provider
    at org.apache.pinot.controller.api.resources.ControllerFilePathProvider.<init>(ControllerFilePathProvider.java:107) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.api.resources.ControllerFilePathProvider.init(ControllerFilePathProvider.java:49) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.ControllerStarter.initControllerFilePathProvider(ControllerStarter.java:487) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    ... 11 more
Caused by: java.lang.IllegalStateException: Data directory:  must be a directory
    at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.api.resources.ControllerFilePathProvider.<init>(ControllerFilePathProvider.java:73) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.api.resources.ControllerFilePathProvider.init(ControllerFilePathProvider.java:49) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    at org.apache.pinot.controller.ControllerStarter.initControllerFilePathProvider(ControllerStarter.java:487) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-8085fb76f8a70eb5bc6326850fcc735fe955d98b]
    ... 11 more
```
**@fx19880617:** @tingchen @yupeng any idea ?  
**@fx19880617:** did you specify the pinot fs config in pinot controller ?  
**@humengyuk18:** I filed this issue and am trying to fix it.  
**@fx19880617:**  
**@humengyuk18:** @fx19880617 I left a minor comment.  
**@fx19880617:** fixed  
 **@humengyuk18:** The directory really does exist in HDFS. I looked at the
source code, and FileStatus seems to always return false? setPath doesn’t have
any effect on isDir  
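
A toy illustration of the behaviour described above may help. This is only a sketch under assumptions: `ToyFileStatus`, `is_directory_broken`, `is_directory_fixed` and `lookup` are hypothetical names, not Hadoop or Pinot code. It shows why a freshly constructed status object whose path is set afterwards would report "not a directory" for every path, and how asking the file system for real metadata differs.

```python
class ToyFileStatus:
    """Toy stand-in for a file-status record; not Hadoop's actual class."""

    def __init__(self, path=None, is_dir=False):
        self.path = path
        # Defaults to False until real metadata is supplied.
        self.is_dir = is_dir

    def set_path(self, path):
        # Only the path changes; nothing is looked up on the file system.
        self.path = path


def is_directory_broken(path):
    """Build an empty status and set its path: the flag never changes."""
    status = ToyFileStatus()
    status.set_path(path)
    return status.is_dir  # False for every path


def is_directory_fixed(path, lookup):
    """Ask the file system (hypothetical `lookup` callable) for metadata."""
    return lookup(path).is_dir
```

With a `lookup` that returns real metadata, `is_directory_fixed` can report `True` for an existing directory, whereas `is_directory_broken` cannot.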
 **@sanjay.1506:** @sanjay.1506 has joined the channel  
 **@smcloete:** @smcloete has joined the channel  
 **@neer.shay:** Hi, I am having issues ingesting data and would appreciate
some assistance. I have a K8s setup and the data I am trying to ingest is on a
different machine in the cluster (no virtual volume unfortunately). Here is
what my ingestion spec looks like:
```
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: ''
includeFileNamePattern: 'glob:**/*.parquet'
outputDirURI: '/tmp/pinot-quick-start/segments/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'parquet'
  className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReaderConfig'
tableSpec:
  tableName: 'table_name'
  schemaURI: ''
  tableConfigURI: ''
pinotClusterSpecs:
  - controllerURI: ''
```
After trying to run the ingestion
(`./pinot-admin.sh LaunchDataIngestionJob -jobSpecFile ~/spec.yml`), I get the
following exception:  
**@neer.shay:**
```
java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:144) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:123) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:164) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:184) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
Caused by: java.lang.IllegalStateException: PinotFS for scheme: http has not been initialized
    at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.spi.filesystem.PinotFSFactory.create(PinotFSFactory.java:80) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.run(SegmentGenerationJobRunner.java:125) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:142) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    ... 4 more
Exception caught: java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:144) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:123) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:164) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:184) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
Caused by: java.lang.IllegalStateException: PinotFS for scheme: http has not been initialized
    at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.spi.filesystem.PinotFSFactory.create(PinotFSFactory.java:80) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.run(SegmentGenerationJobRunner.java:125) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:142) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-b592c8c4787f58a21be6c59be1fced94c6ab2ebd]
    ... 4 more
```
 **@neer.shay:** Is it even possible to ingest via this setup? Thanks in
advance for the help!  
 **@davideberdin:** @davideberdin has left the channel  
 **@g.kishore:** ```java.lang.IllegalStateException: PinotFS for scheme: http
has not been initialized```  
 **@g.kishore:** ```inputDirURI: ''```  
**@g.kishore:** can that file be accessed via http?  
 **@g.kishore:** we don't have httpFS  
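
The error is consistent with the job only knowing the schemes listed under `pinotFSSpecs` (here just `file`). A minimal pre-flight sketch, assuming that behaviour and assuming a URI without a scheme is treated as a local path; `missing_schemes` and the example URL are hypothetical and not part of Pinot:

```python
from urllib.parse import urlparse


def missing_schemes(job_spec):
    """Return URI schemes used by the job spec that no pinotFSSpecs entry covers."""
    configured = {fs["scheme"] for fs in job_spec.get("pinotFSSpecs", [])}
    # Assumption: a URI without a scheme is a local path, handled as "file".
    configured.add("file")
    used = set()
    for key in ("inputDirURI", "outputDirURI"):
        uri = job_spec.get(key)
        if uri:
            used.add(urlparse(uri).scheme or "file")
    return sorted(used - configured)


# The job spec as a plain dict; the input URL is a made-up placeholder.
spec = {
    "inputDirURI": "http://example.invalid/data/",
    "outputDirURI": "/tmp/pinot-quick-start/segments/",
    "pinotFSSpecs": [
        {"scheme": "file", "className": "org.apache.pinot.spi.filesystem.LocalPinotFS"},
    ],
}
print(missing_schemes(spec))  # ['http']
```

Any scheme this reports would need either a matching file system entry or a different way of getting the data to the job.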
 **@fx19880617:** I think you can use wget to download the file locally, point
the input URI at the local directory, and then delete the raw file afterwards
@neer.shay  
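
The download, ingest, delete sequence suggested here could be scripted roughly as follows. This is a sketch rather than a recommended tool: it uses Python's standard library instead of wget, and `staged_locally` is a hypothetical helper name. It assumes each URL ends in a usable file name.

```python
import shutil
import tempfile
import urllib.request
from contextlib import contextmanager
from pathlib import Path
from urllib.parse import urlparse


@contextmanager
def staged_locally(urls):
    """Download each URL into a temporary directory, yield it, then delete it."""
    staging = Path(tempfile.mkdtemp(prefix="pinot-input-"))
    try:
        for url in urls:
            # Name the local copy after the last path component of the URL.
            dest = staging / Path(urlparse(url).path).name
            with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
                shutil.copyfileobj(response, out)
        yield staging
    finally:
        # Remove the raw files once the caller is done with them.
        shutil.rmtree(staging, ignore_errors=True)
```

Inside the `with staged_locally([...]) as local_dir:` block, `inputDirURI` would point at `local_dir` while `./pinot-admin.sh LaunchDataIngestionJob -jobSpecFile ~/spec.yml` runs; the directory is removed when the block exits.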
 **@g.kishore:** We should probably have httpFS implementation  
**@fx19880617:** hmm, if we treat http as fs, then we may only implement the
copy operation, which is probably fine  
 **@fx19880617:** we can make it a default fs as well  
 **@g.kishore:** Yes, copy to local is the only thing we can support for now  
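
To make the "copy only" idea concrete, here is a rough sketch of a read-only HTTP file system plus a scheme registry. Everything in it is an assumption for illustration: `CopyOnlyHttpFS`, `register` and `create` are hypothetical names and do not reflect the actual PinotFS interface or any planned implementation.

```python
import shutil
import urllib.request
from pathlib import Path


class CopyOnlyHttpFS:
    """Hypothetical read-only file system: only copy_to_local is supported."""

    def copy_to_local(self, src_uri, dst_path):
        dst = Path(dst_path)
        dst.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(src_uri) as response, open(dst, "wb") as out:
            shutil.copyfileobj(response, out)
        return dst

    def delete(self, uri):
        raise NotImplementedError("delete is not supported over plain HTTP")

    def list_files(self, uri):
        raise NotImplementedError("listing is not supported over plain HTTP")


# Hypothetical scheme registry: a lookup for an unregistered scheme fails,
# loosely mirroring the "has not been initialized" error in this thread.
REGISTRY = {}


def register(scheme, fs):
    REGISTRY[scheme] = fs


def create(scheme):
    if scheme not in REGISTRY:
        raise LookupError(f"FS for scheme: {scheme} has not been initialized")
    return REGISTRY[scheme]
```

Registering such a class for `http` by default would let a job fetch input files, while operations that plain HTTP cannot express stay explicitly unsupported.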
 **@neer.shay:** thanks for the feedback @g.kishore & @fx19880617. The file is
accessible via http but I guess I'll find another method to get it ingested  
**@fx19880617:** sure, I feel that an HTTP server is one way to access data. Is
it backed by HDFS or S3-style storage?  

###  _#pinot-s3_

  
 **@renatoj.marroquin:** @renatoj.marroquin has joined the channel  

###  _#pinot-k8s-operator_

  
 **@davideberdin:** thanks :slightly_smiling_face: I really appreciate it  

###  _#aggregators_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#pinot-dev_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#community_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#announcements_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#pinot-docs_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#pinot-perf-tuning_

  
 **@smcloete:** @smcloete has joined the channel  

###  _#getting-started_

  
 **@smcloete:** @smcloete has joined the channel  