Posted to user@predictionio.apache.org by Shane Johnson <sh...@liftiq.com> on 2018/08/24 14:17:07 UTC

Setting up PredictionIO workflow in IntelliJ?

Hi team,

Is anyone out there using IntelliJ to manage their PIO projects? I have
been using Zeppelin and Atom but wanted to try managing my workflow
through IntelliJ.

I followed this setup https://predictionio.apache.org/resources/intellij/
but am getting the following error when trying to run the pio train
application.



[INFO] [Engine] Extracting datasource params...
[INFO] [Engine] Datasource params: (,DataSourceParams(Some(5)))
[INFO] [Engine] Extracting preparator params...
[WARN] [WorkflowUtils$] Non-empty parameters supplied to
org.template.liftscoring.Preparator, but its constructor does not accept
any arguments. Stubbing with empty parameters.
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[WARN] [WorkflowUtils$] Non-empty parameters supplied to
org.template.liftscoring.Serving, but its constructor does not accept any
arguments. Stubbing with empty parameters.
[INFO] [Engine] Serving params: (,Empty)
Exception in thread "main" org.apache.predictionio.data.storage.StorageClientException: Data source ELASTICSEARCH was not properly initialized.
    at org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:316)
    at org.apache.predictionio.data.storage.Storage$$anonfun$10.apply(Storage.scala:316)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:315)
    at org.apache.predictionio.data.storage.Storage$.getDataObjectFromRepo(Storage.scala:300)
    at org.apache.predictionio.data.storage.Storage$.getMetaDataEngineInstances(Storage.scala:402)
    at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:248)
    at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
[ERROR] [Storage$] Error initializing storage client for source ELASTICSEARCH.
java.lang.ClassNotFoundException: elasticsearch.StorageClient
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.predictionio.data.storage.Storage$.getClient(Storage.scala:257)
    at org.apache.predictionio.data.storage.Storage$.org$apache$predictionio$data$storage$Storage$$updateS2CM(Storage.scala:283)
    at org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
    at org.apache.predictionio.data.storage.Storage$$anonfun$sourcesToClientMeta$1.apply(Storage.scala:244)
    at scala.collection.mutable.MapLike$class.getOrElseUpdate(MapLike.scala:194)
    at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
    at org.apache.predictionio.data.storage.Storage$.sourcesToClientMeta(Storage.scala:244)
    at org.apache.predictionio.data.storage.Storage$.getDataObject(Storage.scala:315)
    at org.apache.predictionio.data.storage.Storage$.getDataObjectFromRepo(Storage.scala:300)
    at org.apache.predictionio.data.storage.Storage$.getMetaDataEngineInstances(Storage.scala:402)
    at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:248)
    at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
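
The ClassNotFoundException at the bottom of the trace is the real failure: the
storage layer turns the source's TYPE setting ("elasticsearch") into a
StorageClient class name and loads it with Class.forName, so the Elasticsearch
storage assembly has to be on the classpath of whatever JVM runs the workflow.
The following is only a simplified sketch of that reflective lookup, not the
actual Storage.scala code; the fallback name it tries is the
elasticsearch.StorageClient shown in the log.

// Simplified sketch of the reflective lookup that fails above (illustration
// only, not the actual Storage.scala source).
object StorageClientLookupSketch {
  def loadStorageClient(sourceType: String): Class[_] = {
    val candidates = Seq(
      s"org.apache.predictionio.data.storage.$sourceType.StorageClient",
      s"$sourceType.StorageClient" // the fallback name shown in the log above
    )
    candidates
      .flatMap { name =>
        try Some(Class.forName(name))
        catch { case _: ClassNotFoundException => None }
      }
      .headOption
      .getOrElse(throw new ClassNotFoundException(
        s"no StorageClient for source type '$sourceType' on the classpath"))
  }

  def main(args: Array[String]): Unit =
    // Succeeds only when a jar providing the Elasticsearch StorageClient, such
    // as pio-data-elasticsearch-assembly-0.12.1.jar, is on this JVM's
    // classpath; otherwise it fails just like the trace above.
    println(loadStorageClient("elasticsearch"))
}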

When I run pio status in the terminal I get the following, which makes me
think the services are all running and that the error has something to do
with certain PredictionIO dependencies not being loaded within IntelliJ. I
have added the pio-runtime-jars library. However, I did notice that
spark-assembly-2.1.1-hadoop2.4.0.jar was not in the Apache Spark
installation directory under assembly or lib.

spark-assembly-2.1.1-hadoop2.4.0.jar

[INFO] [Management$] Inspecting PredictionIO...
[INFO] [Management$] PredictionIO 0.12.1 is installed at
/Users/shanejohnson/Desktop/Apps/apache-predictionio-0.12.1/PredictionIO-0.12.1
[INFO] [Management$] Inspecting Apache Spark...
[INFO] [Management$] Apache Spark is installed at
/Users/shanejohnson/Desktop/Apps/apache-predictionio-0.12.1/PredictionIO-0.12.1/vendors/spark-2.1.1-bin-hadoop2.6
[INFO] [Management$] Apache Spark 2.1.1 detected (meets minimum requirement
of 1.3.0)
[INFO] [Management$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
[INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
[INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
[INFO] [Storage$] Test writing to Event Store (App Id 0)...
[INFO] [HBLEvents] The table pio_event:events_0 doesn't exist yet. Creating
now...
[INFO] [HBLEvents] Removing table pio_event:events_0...
[INFO] [Management$] Your system is all ready to go.
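
Something to keep in mind here (an assumption about the setup rather than
anything from the guide): pio status goes through the pio launcher script,
which sources pio-env.sh and builds its own classpath from $PIO_HOME/lib,
while an IntelliJ run configuration is a separate JVM that only sees the
classpath and environment variables defined in that run configuration. A
throwaway diagnostic like the one below, run with the same run configuration
as the train step, would show whether the Elasticsearch assembly and the
PIO_STORAGE_* variables actually reach that JVM.

// Throwaway diagnostic (not part of the template): prints any classpath entry
// that looks like an Elasticsearch jar and every PIO_STORAGE_* variable seen
// by this JVM.
object RunConfigDiagnostics {
  def main(args: Array[String]): Unit = {
    sys.props("java.class.path")
      .split(java.io.File.pathSeparator)
      .filter(_.toLowerCase.contains("elasticsearch"))
      .foreach(entry => println(s"classpath entry: $entry"))

    sys.env
      .filterKeys(_.startsWith("PIO_STORAGE_"))
      .toSeq.sortBy(_._1)
      .foreach { case (key, value) => println(s"$key=$value") }
  }
}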


Here are my ENV variables in IntelliJ and in pio-env.sh (where I have a
working environment):

IntelliJ: (screenshots of the run configuration's environment variables were
attached here but are not included in this plain-text archive)

PIO Local Setup: All services are running correctly. I have also built the
project before working out of IntelliJ.

# SPARK_HOME: Apache Spark is a hard dependency and must be configured.
# SPARK_HOME=$PIO_HOME/vendors/spark-2.0.2-bin-hadoop2.7
SPARK_HOME=$PIO_HOME/vendors/spark-2.1.1-bin-hadoop2.6

POSTGRES_JDBC_DRIVER=$PIO_HOME/lib/postgresql-42.0.0.jar
MYSQL_JDBC_DRIVER=$PIO_HOME/lib/mysql-connector-java-5.1.41.jar

# ES_CONF_DIR: You must configure this if you have advanced configuration for
#              your Elasticsearch setup.
# ES_CONF_DIR=/opt/elasticsearch

# HADOOP_CONF_DIR: You must configure this if you intend to run PredictionIO
#                  with Hadoop 2.
# HADOOP_CONF_DIR=/opt/hadoop

# HBASE_CONF_DIR: You must configure this if you intend to run PredictionIO
#                 with HBase on a remote cluster.
# HBASE_CONF_DIR=$PIO_HOME/vendors/hbase-1.0.0/conf

# Filesystem paths where PredictionIO uses as block storage.
PIO_FS_BASEDIR=$HOME/.pio_store
PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp

# PredictionIO Storage Configuration
#
# This section controls programs that make use of PredictionIO's built-in
# storage facilities. Default values are shown below.
#
# For more information on storage configuration please refer to
# http://predictionio.apache.org/system/anotherdatastore/

# Storage Repositories

# Default is to use PostgreSQL
PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH

PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE

PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS

# Storage Data Sources

# PostgreSQL Default Settings
# Please change "pio" to your database name in PIO_STORAGE_SOURCES_PGSQL_URL
# Please change PIO_STORAGE_SOURCES_PGSQL_USERNAME and
# PIO_STORAGE_SOURCES_PGSQL_PASSWORD accordingly
PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio

# MySQL Example
# PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
# PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://localhost/pio
# PIO_STORAGE_SOURCES_MYSQL_USERNAME=pio
# PIO_STORAGE_SOURCES_MYSQL_PASSWORD=pio

# Elasticsearch Example
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-5.5.2
# Optional basic HTTP auth
# PIO_STORAGE_SOURCES_ELASTICSEARCH_USERNAME=my-name
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PASSWORD=my-secret
# Elasticsearch 1.x Example
# PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
# PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=<elasticsearch_cluster_name>
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
# PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
# PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-1.7.6

# Local File System Example
PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models

# HBase Example
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase-1.2.6
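
Everything in this file ends up as plain environment variables, so the same
PIO_STORAGE_* entries have to be present in the environment of whichever JVM
runs the workflow, including an IntelliJ run configuration, for the
ELASTICSEARCH metadata source to initialize. The sketch below only illustrates
that mapping; it is not PredictionIO's actual configuration parser.

// Illustration only: collect the PIO_STORAGE_SOURCES_<NAME>_* variables for
// one source from the environment, the way pio-env.sh exposes them above.
object StorageEnvSketch {
  def sourceSettings(sourceName: String): Map[String, String] = {
    val prefix = s"PIO_STORAGE_SOURCES_${sourceName}_"
    sys.env.collect {
      case (key, value) if key.startsWith(prefix) =>
        key.stripPrefix(prefix).toLowerCase -> value
    }
  }

  def main(args: Array[String]): Unit = {
    val es = sourceSettings("ELASTICSEARCH")
    // With the settings above this should contain at least type, hosts, ports
    // and schemes; an empty map means this process never saw the variables.
    println(es)
  }
}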

Thanks in advance for your help.

Best,

*Shane Johnson | LIFT IQ*
*Founder | CEO*

*www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
<sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
<https://twitter.com/SWaldenJ> |  Facebook
<https://www.facebook.com/shane.johnson.71653>

Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Shane Johnson <sh...@liftiq.com>.
Thank you, Donald. I was able to follow this
http://predictionio.apache.org/resources/intellij/ and get everything
working in IntelliJ. Thank you for the support!

*Shane Johnson | LIFT IQ*
*Founder | CEO*

*www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
<sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
<https://twitter.com/SWaldenJ> |  Facebook
<https://www.facebook.com/shane.johnson.71653>



On Sun, Aug 26, 2018 at 6:39 PM, Donald Szeto <do...@apache.org> wrote:

> An updated guide for IntelliJ is now available at
> http://predictionio.apache.org/resources/intellij/.
>
> On Sat, Aug 25, 2018 at 10:15 AM Donald Szeto <do...@apache.org> wrote:
>
>> I took a look at your other screenshots. I see that in your main run
>> configuration, `Use classpath of module` is set to `template-scala-parallel-liftscoring`.
>> That means the JARs that get added to `pio-runtime-jars` should actually go
>> to `template-scala-parallel-liftscoring` module. The new guide that I am
>> working on will remove `pio-runtime-jars` completely.
>>
>> Let me know if that works for you.
>>
>> On Sat, Aug 25, 2018 at 7:02 AM Shane Johnson <sh...@liftiq.com> wrote:
>>
>>> Thanks Donald,
>>>
>>> I added in the dependency in the pio-runtime-jars but did not see any
>>> changes. Still the same error, any thoughts? I also verified that pio
>>> status was working and that I could run the engine from terminal with pio
>>> build, train and deploy.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> *Shane Johnson | LIFT IQ*
>>> *Founder | CEO*
>>>
>>> *www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
>>> <sh...@liftiq.com>*
>>> mobile: (801) 360-3350
>>> LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
>>> <https://twitter.com/SWaldenJ> |  Facebook
>>> <https://www.facebook.com/shane.johnson.71653>
>>>
>>>
>>>
>>> On Fri, Aug 24, 2018 at 4:27 PM, Donald Szeto <do...@apache.org> wrote:
>>>
>>>> Hey Shane,
>>>>
>>>> While in the process of brushing up the guide, a quick solution to your
>>>> problem is to also add the pio-data-elasticsearch-assembly-0.12.1.jar
>>>> in your $PIO_HOME/lib/spark directory to pio-runtime-jars. Let me know if
>>>> that works for you.
>>>>
>>>> The most recent version of IntelliJ has made a lot of steps in the
>>>> guide obsolete. It will be updated shortly.
>>>>
>>>> Regards,
>>>> Donald
>>>>
>>>> On Fri, Aug 24, 2018 at 12:05 PM Donald Szeto <do...@apache.org>
>>>> wrote:
>>>>
>>>>> I’m in the process of reviving that section. Stay tuned.
>>>>>
>>>>> On Fri, Aug 24, 2018 at 11:38 AM Shane Johnson <sh...@liftiq.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks Pat.
>>>>>>
>>>>>> *Shane Johnson | LIFT IQ*
>>>>>> *Founder | CEO*
>>>>>>
>>>>>> *www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
>>>>>> <sh...@liftiq.com>*
>>>>>> mobile: (801) 360-3350
>>>>>> LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
>>>>>> <https://twitter.com/SWaldenJ> |  Facebook
>>>>>> <https://www.facebook.com/shane.johnson.71653>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 24, 2018 at 12:07 PM, Pat Ferrel <pa...@occamsmachete.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Those instructions no longer work and should be removed from the
>>>>>>> docs. I don’t know anyone who has been able to use the debugger on
>>>>>>> templates for that Apache releases. Using it on the PIO code is possible
>>>>>>> but not templates afaik.

Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Donald Szeto <do...@apache.org>.
An updated guide for IntelliJ is now available at
http://predictionio.apache.org/resources/intellij/.


Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Donald Szeto <do...@apache.org>.
I took a look at your other screenshots. I see that in your main run
configuration, `Use classpath of module` is set to
`template-scala-parallel-liftscoring`. That means the JARs that get added
to `pio-runtime-jars` should actually go to
`template-scala-parallel-liftscoring` module. The new guide that I am
working on will remove `pio-runtime-jars` completely.

Let me know if that works for you.
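
One possible alternative to maintaining a separate pio-runtime-jars library
(this is just a sketch of one approach, not the method in the updated guide):
point the template module's build.sbt at the PredictionIO lib directories with
unmanagedJars, so the assemblies, including
pio-data-elasticsearch-assembly-0.12.1.jar, land on the module classpath that
the run configuration uses. The fallback path below is the install location
mentioned earlier in the thread; adjust it to your own PIO_HOME.

// Hypothetical build.sbt fragment for the template module (sbt 0.13 syntax).
val pioHome = file(sys.env.getOrElse("PIO_HOME",
  "/Users/shanejohnson/Desktop/Apps/apache-predictionio-0.12.1/PredictionIO-0.12.1"))

// Put the PredictionIO assembly jars on the compile classpath so the module
// that the run configuration uses can resolve the storage backends.
unmanagedJars in Compile ++= ((pioHome / "lib") * "*.jar").classpath
unmanagedJars in Compile ++= ((pioHome / "lib" / "spark") * "*.jar").classpath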


Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Shane Johnson <sh...@liftiq.com>.
Thanks Donald,

I added the dependency to pio-runtime-jars but did not see any change.
Still the same error; any thoughts? I also verified that pio status was
working and that I could run the engine from the terminal with pio build,
train, and deploy.

(screenshots of the run configuration were attached here but are not
included in this plain-text archive)

*Shane Johnson | LIFT IQ*
*Founder | CEO*

*www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
<sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
<https://twitter.com/SWaldenJ> |  Facebook
<https://www.facebook.com/shane.johnson.71653>



On Fri, Aug 24, 2018 at 4:27 PM, Donald Szeto <do...@apache.org> wrote:

> Hey Shane,
>
> While in the process of brushing up the guide, a quick solution to your
> problem is to also add the pio-data-elasticsearch-assembly-0.12.1.jar in
> your $PIO_HOME/lib/spark directory to pio-runtime-jars. Let me know if that
> works for you.
>
> The most recent version of IntelliJ has made a lot of steps in the guide
> obsolete. It will be updated shortly.
>
> Regards,
> Donald
>
> On Fri, Aug 24, 2018 at 12:05 PM Donald Szeto <do...@apache.org> wrote:
>
>> I’m in the process of reviving that section. Stay tuned.
>>
>> On Fri, Aug 24, 2018 at 11:38 AM Shane Johnson <sh...@liftiq.com> wrote:
>>
>>> Thanks Pat.
>>>
>>> *Shane Johnson | LIFT IQ*
>>> *Founder | CEO*
>>>
>>> *www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
>>> <sh...@liftiq.com>*
>>> mobile: (801) 360-3350
>>> LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
>>> <https://twitter.com/SWaldenJ> |  Facebook
>>> <https://www.facebook.com/shane.johnson.71653>
>>>
>>>
>>>
>>> On Fri, Aug 24, 2018 at 12:07 PM, Pat Ferrel <pa...@occamsmachete.com>
>>> wrote:
>>>
>>>> Those instructions no longer work and should be removed from the docs.
>>>> I don’t know anyone who has been able to use the debugger on templates for
>>>> that Apache releases. Using it on the PIO code is possible but not
>>>> templates afaik.
>>>>
>>>>
>>>> From: Shane Johnson <sh...@liftiq.com> <sh...@liftiq.com>
>>>> Reply: user@predictionio.apache.org <us...@predictionio.apache.org>
>>>> <us...@predictionio.apache.org>
>>>> Date: August 24, 2018 at 7:17:07 AM
>>>> To: user@predictionio.apache.org <us...@predictionio.apache.org>
>>>> <us...@predictionio.apache.org>
>>>> Subject:  Setting up PredictionIO workflow in IntelliJ?
>>>>
>>>> Hi team,
>>>>
>>>> Is anyone out there using IntelliJ to manage their PIO projects. I have
>>>> been using Zeppelin and Atom but wanted to try to manage my workflow
>>>> through IntelliJ.
>>>>
>>>> I followed this setup https://predictionio.
>>>> apache.org/resources/intellij/ but am getting the following error when
>>>> trying to run the pio train application.
>>>>
>>>>
>>>>
>>>> [INFO] [Engine] Extracting datasource params...
>>>> [INFO] [Engine] Datasource params: (,DataSourceParams(Some(5)))
>>>> [INFO] [Engine] Extracting preparator params...
>>>> [WARN] [WorkflowUtils$] Non-empty parameters supplied to
>>>> org.template.liftscoring.Preparator, but its constructor does not
>>>> accept any arguments. Stubbing with empty parameters.
>>>> [INFO] [Engine] Preparator params: (,Empty)
>>>> [INFO] [Engine] Extracting serving params...
>>>> [WARN] [WorkflowUtils$] Non-empty parameters supplied to
>>>> org.template.liftscoring.Serving, but its constructor does not accept
>>>> any arguments. Stubbing with empty parameters.
>>>> [INFO] [Engine] Serving params: (,Empty)
>>>> Exception in thread "main" org.apache.predictionio.data.storage.StorageClientException:
>>>> Data source ELASTICSEARCH was not properly initialized.
>>>> at org.apache.predictionio.data.storage.Storage$$anonfun$10.
>>>> apply(Storage.scala:316)
>>>> at org.apache.predictionio.data.storage.Storage$$anonfun$10.
>>>> apply(Storage.scala:316)
>>>> at scala.Option.getOrElse(Option.scala:121)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> getDataObject(Storage.scala:315)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> getDataObjectFromRepo(Storage.scala:300)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> getMetaDataEngineInstances(Storage.scala:402)
>>>> at org.apache.predictionio.workflow.CreateWorkflow$.main(
>>>> CreateWorkflow.scala:248)
>>>> at org.apache.predictionio.workflow.CreateWorkflow.main(
>>>> CreateWorkflow.scala)
>>>> [ERROR] [Storage$] Error initializing storage client for source
>>>> ELASTICSEARCH.
>>>> java.lang.ClassNotFoundException: elasticsearch.StorageClient
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> at java.lang.Class.forName0(Native Method)
>>>> at java.lang.Class.forName(Class.java:264)
>>>> at org.apache.predictionio.data.storage.Storage$.getClient(
>>>> Storage.scala:257)
>>>> at org.apache.predictionio.data.storage.Storage$.org$apache$
>>>> predictionio$data$storage$Storage$$updateS2CM(Storage.scala:283)
>>>> at org.apache.predictionio.data.storage.Storage$$anonfun$
>>>> sourcesToClientMeta$1.apply(Storage.scala:244)
>>>> at org.apache.predictionio.data.storage.Storage$$anonfun$
>>>> sourcesToClientMeta$1.apply(Storage.scala:244)
>>>> at scala.collection.mutable.MapLike$class.getOrElseUpdate(
>>>> MapLike.scala:194)
>>>> at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> sourcesToClientMeta(Storage.scala:244)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> getDataObject(Storage.scala:315)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> getDataObjectFromRepo(Storage.scala:300)
>>>> at org.apache.predictionio.data.storage.Storage$.
>>>> getMetaDataEngineInstances(Storage.scala:402)
>>>> at org.apache.predictionio.workflow.CreateWorkflow$.main(
>>>> CreateWorkflow.scala:248)
>>>> at org.apache.predictionio.workflow.CreateWorkflow.main(
>>>> CreateWorkflow.scala)
>>>>
>>>> When I run pio status in the terminal I get the following which makes
>>>> me think the services are all running but the error has something to do
>>>> with not loading certain PredictionIO dependencies within IntelliJ. I have
>>>> added the pio-runtime-jars. However I did notice that the
>>>> spark-assembly-2.1.1-hadoop2.4.0.jar was not in the Apache Spark
>>>> installation directory under assembly or lib.
>>>>
>>>> spark-assembly-2.1.1-hadoop2.4.0.jar
>>>>
>>>> [INFO] [Management$] Inspecting PredictionIO...
>>>> [INFO] [Management$] PredictionIO 0.12.1 is installed at
>>>> /Users/shanejohnson/Desktop/Apps/apache-predictionio-0.12.
>>>> 1/PredictionIO-0.12.1
>>>> [INFO] [Management$] Inspecting Apache Spark...
>>>> [INFO] [Management$] Apache Spark is installed at
>>>> /Users/shanejohnson/Desktop/Apps/apache-predictionio-0.12.
>>>> 1/PredictionIO-0.12.1/vendors/spark-2.1.1-bin-hadoop2.6
>>>> [INFO] [Management$] Apache Spark 2.1.1 detected (meets minimum
>>>> requirement of 1.3.0)
>>>> [INFO] [Management$] Inspecting storage backend connections...
>>>> [INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
>>>> [INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
>>>> [INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
>>>> [INFO] [Storage$] Test writing to Event Store (App Id 0)...
>>>> [INFO] [HBLEvents] The table pio_event:events_0 doesn't exist yet.
>>>> Creating now...
>>>> [INFO] [HBLEvents] Removing table pio_event:events_0...
>>>> [INFO] [Management$] Your system is all ready to go.
>>>>
>>>>
>>>> Here are my ENV variables in IntelliJ and in pio-env.sh (where I have a
>>>> working environment):
>>>>
>>>> IntelliJ: (the run configuration screenshots were inline images and are
>>>> not visible in this plain-text archive)
>>>> PIO Local Setup: All services are running correctly. I have also built
>>>> the project before working out of IntelliJ.
>>>>
>>>> # SPARK_HOME: Apache Spark is a hard dependency and must be configured.
>>>> # SPARK_HOME=$PIO_HOME/vendors/spark-2.0.2-bin-hadoop2.7
>>>> SPARK_HOME=$PIO_HOME/vendors/spark-2.1.1-bin-hadoop2.6
>>>>
>>>> POSTGRES_JDBC_DRIVER=$PIO_HOME/lib/postgresql-42.0.0.jar
>>>> MYSQL_JDBC_DRIVER=$PIO_HOME/lib/mysql-connector-java-5.1.41.jar
>>>>
>>>> # ES_CONF_DIR: You must configure this if you have advanced configuration for
>>>> #              your Elasticsearch setup.
>>>> # ES_CONF_DIR=/opt/elasticsearch
>>>>
>>>> # HADOOP_CONF_DIR: You must configure this if you intend to run PredictionIO
>>>> #                  with Hadoop 2.
>>>> # HADOOP_CONF_DIR=/opt/hadoop
>>>>
>>>> # HBASE_CONF_DIR: You must configure this if you intend to run PredictionIO
>>>> #                 with HBase on a remote cluster.
>>>> # HBASE_CONF_DIR=$PIO_HOME/vendors/hbase-1.0.0/conf
>>>>
>>>> # Filesystem paths which PredictionIO uses as block storage.
>>>> PIO_FS_BASEDIR=$HOME/.pio_store
>>>> PIO_FS_ENGINESDIR=$PIO_FS_BASEDIR/engines
>>>> PIO_FS_TMPDIR=$PIO_FS_BASEDIR/tmp
>>>>
>>>> # PredictionIO Storage Configuration
>>>> #
>>>> # This section controls programs that make use of PredictionIO's built-in
>>>> # storage facilities. Default values are shown below.
>>>> #
>>>> # For more information on storage configuration please refer to
>>>> # http://predictionio.apache.org/system/anotherdatastore/
>>>>
>>>> # Storage Repositories
>>>>
>>>> # Default is to use PostgreSQL
>>>> PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta
>>>> PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
>>>>
>>>> PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event
>>>> PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
>>>>
>>>> PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model
>>>> PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS
>>>>
>>>> # Storage Data Sources
>>>>
>>>> # PostgreSQL Default Settings
>>>> # Please change "pio" to your database name in
>>>> PIO_STORAGE_SOURCES_PGSQL_URL
>>>> # Please change PIO_STORAGE_SOURCES_PGSQL_USERNAME and
>>>> # PIO_STORAGE_SOURCES_PGSQL_PASSWORD accordingly
>>>> PIO_STORAGE_SOURCES_PGSQL_TYPE=jdbc
>>>> PIO_STORAGE_SOURCES_PGSQL_URL=jdbc:postgresql://localhost/pio
>>>> PIO_STORAGE_SOURCES_PGSQL_USERNAME=pio
>>>> PIO_STORAGE_SOURCES_PGSQL_PASSWORD=pio
>>>>
>>>> # MySQL Example
>>>> # PIO_STORAGE_SOURCES_MYSQL_TYPE=jdbc
>>>> # PIO_STORAGE_SOURCES_MYSQL_URL=jdbc:mysql://localhost/pio
>>>> # PIO_STORAGE_SOURCES_MYSQL_USERNAME=pio
>>>> # PIO_STORAGE_SOURCES_MYSQL_PASSWORD=pio
>>>>
>>>> # Elasticsearch Example
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9200
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_SCHEMES=http
>>>> PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-5.5.2
>>>> # Optional basic HTTP auth
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_USERNAME=my-name
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_PASSWORD=my-secret
>>>> # Elasticsearch 1.x Example
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=<elasticsearch_cluster_name>
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
>>>> # PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=$PIO_HOME/vendors/elasticsearch-1.7.6
>>>>
>>>> # Local File System Example
>>>> PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
>>>> PIO_STORAGE_SOURCES_LOCALFS_PATH=$PIO_FS_BASEDIR/models
>>>>
>>>> # HBase Example
>>>> PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
>>>> PIO_STORAGE_SOURCES_HBASE_HOME=$PIO_HOME/vendors/hbase-1.2.6
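>>>>
>>>> As a minimal sketch for getting these same values into the IntelliJ run
>>>> configuration (assuming PIO_HOME is exported and pio-env.sh lives at
>>>> $PIO_HOME/conf/pio-env.sh), one way is to source the file and print the
>>>> variables so they can be copied over:
>>>>
>>>> set -a; source "$PIO_HOME/conf/pio-env.sh"; set +a
>>>> env | grep -E '^(PIO_|SPARK_HOME|HBASE_|ES_|HADOOP_)'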
>>>>
>>>> Thanks in advance for your help.
>>>>
>>>> Best,
>>>>
>>>> *Shane Johnson | LIFT IQ*
>>>> *Founder | CEO*
>>>>
>>>> *www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
>>>> <sh...@liftiq.com>*
>>>> mobile: (801) 360-3350
>>>> LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
>>>> <https://twitter.com/SWaldenJ> |  Facebook
>>>> <https://www.facebook.com/shane.johnson.71653>
>>>>
>>>>
>>>>
>>>

Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Donald Szeto <do...@apache.org>.
Hey Shane,

While I'm in the process of brushing up the guide, a quick fix for your
problem is to also add the pio-data-elasticsearch-assembly-0.12.1.jar from
your $PIO_HOME/lib/spark directory to the pio-runtime-jars library. Let me
know if that works for you.

The most recent version of IntelliJ has made a lot of the steps in the guide
obsolete. The guide will be updated shortly.

Regards,
Donald

On Fri, Aug 24, 2018 at 12:05 PM Donald Szeto <do...@apache.org> wrote:

> I’m in the process of reviving that section. Stay tuned.
>
> On Fri, Aug 24, 2018 at 11:38 AM Shane Johnson <sh...@liftiq.com> wrote:
>
>> Thanks Pat.
>>
>> *Shane Johnson | LIFT IQ*
>> *Founder | CEO*
>>
>> *www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
>> <sh...@liftiq.com>*
>> mobile: (801) 360-3350
>> LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
>> <https://twitter.com/SWaldenJ> |  Facebook
>> <https://www.facebook.com/shane.johnson.71653>
>>
>>
>>
>> On Fri, Aug 24, 2018 at 12:07 PM, Pat Ferrel <pa...@occamsmachete.com>
>> wrote:
>>
>>> Those instructions no longer work and should be removed from the docs. I
>>> don’t know anyone who has been able to use the debugger on templates for
>>> the recent Apache releases. Using it on the PIO code is possible, but not
>>> on templates, afaik.
>>>

Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Donald Szeto <do...@apache.org>.
I’m in the process of reviving that section. Stay tuned.

On Fri, Aug 24, 2018 at 11:38 AM Shane Johnson <sh...@liftiq.com> wrote:

> Thanks Pat.
>
> *Shane Johnson | LIFT IQ*
> *Founder | CEO*
>
> *www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
> <sh...@liftiq.com>*
> mobile: (801) 360-3350
> LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
> <https://twitter.com/SWaldenJ> |  Facebook
> <https://www.facebook.com/shane.johnson.71653>
>
>
>
> On Fri, Aug 24, 2018 at 12:07 PM, Pat Ferrel <pa...@occamsmachete.com>
> wrote:
>
>> Those instructions no longer work and should be removed from the docs. I
>> don’t know anyone who has been able to use the debugger on templates for
>> the recent Apache releases. Using it on the PIO code is possible, but not
>> on templates, afaik.
>>

Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Shane Johnson <sh...@liftiq.com>.
Thanks Pat.

*Shane Johnson | LIFT IQ*
*Founder | CEO*

*www.liftiq.com <http://www.liftiq.com/>* or *shane@liftiq.com
<sh...@liftiq.com>*
mobile: (801) 360-3350
LinkedIn <https://www.linkedin.com/in/shanewjohnson/>  |  Twitter
<https://twitter.com/SWaldenJ> |  Facebook
<https://www.facebook.com/shane.johnson.71653>



On Fri, Aug 24, 2018 at 12:07 PM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> Those instructions no longer work and should be removed from the docs. I
> don’t know anyone who has been able to use the debugger on templates for
> the recent Apache releases. Using it on the PIO code is possible, but not
> on templates, afaik.
>

Re: Setting up PredictionIO workflow in IntelliJ?

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Those instructions no longer work and should be removed from the docs. I
don’t know anyone who has been able to use the debugger on templates for
the recent Apache releases. Using it on the PIO code is possible, but not
on templates, afaik.
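
One thing that might be worth trying (I haven't verified it myself, so treat
it as a sketch) is to skip the IntelliJ run configuration and instead attach
a remote debugger to the driver JVM that pio launches. Arguments after --
are passed through to spark-submit, so something like:

pio train -- --driver-java-options "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"

should make the driver wait on port 5005; a "Remote" debug configuration in
IntelliJ pointed at localhost:5005 can then hit breakpoints in the template
source. The exact flags may need adjusting for your Spark version.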

