Posted to user@predictionio.apache.org by Ashutosh Banerjee <as...@juggernaut.in> on 2017/01/17 15:16:38 UTC

ElasticSearch Nodes not Found - Universal Recommender Template

Hi,

We are facing an intermittent issue while using the Universal Recommender
template on our production servers. Every once in a while we get an error
that none of the Elasticsearch nodes are reachable
(org.elasticsearch.client.transport.NoNodeAvailableException: None of the
configured nodes are available: []).

Our Elasticsearch cluster is hosted on an Elasticsearch hosting
provider (Qbox), and at the time of receiving this error all the shards were
green and reachable from the server where the PredictionIO template is hosted.
There was no shortage of RAM or disk space on that server when this issue
occurred.
The issue gets resolved on performing the following steps:
1) pio undeploy
2) pio-stop-all
3) pio-start-all
4) pio deploy

The PIO version used is 0.10.0-incubating and the Universal Recommender
version is 0.5.0.
We faced this issue earlier as well and could not figure out why the ES
nodes became unreachable after a certain period.
The pio-env config for ES is below:

PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_CLUSTERNAME=<cluster-name>
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=<hosted-es-domain>
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=<port> # Transport layer port used here
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=<local-es-path>

We are currently using Elasticsearch version 1.5.2 on our hosted servers, as
version 1.4.0, which PIO uses, is not supported on the hosted service. We
are using the transport layer port to communicate with the ES nodes as
opposed to calling the HTTPS endpoint.

Please help us figure out the underlying cause for this issue.

Thanks,
Ashutosh

Re: ElasticSearch Nodes not Found - Universal Recommender Template

Posted by Pat Ferrel <pa...@occamsmachete.com>.
The UR uses the REST API for most operations. The servers for this should be listed in the sparkConf section of engine.json, like this:

    "sparkConf": {
      "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
      "spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
      "spark.kryo.referenceTracking": "false",
      "spark.kryoserializer.buffer": "300m",
      "es.index.auto.create": "true",
      "es.nodes": "ip-xxx.ec2.internal,ip-xxx.ec2.internal,xxx.ec2.internal",
      "es.username": "some-user",
      "es.password": "some-password"
    },

The ES nodes should be IPs or DNS names that are reachable from your PIO machine. BTW, I may have the username/password parameter names wrong, so check the ES docs.
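
As a rough way to confirm that the hosts in es.nodes are reachable from the PIO machine, something like the Python sketch below could be used. It assumes engine.json is plain JSON with a top-level sparkConf section, that entries in es.nodes are either host or host:port (IPv6 literals are not handled), and that the REST port is 9200 unless es.port is set. The helper names parse_es_nodes and is_reachable are made up for illustration and are not part of PIO or the UR.

```python
import json
import socket


def parse_es_nodes(engine_json_text, default_port=9200):
    """Return (host, port) pairs from sparkConf["es.nodes"] in engine.json text."""
    conf = json.loads(engine_json_text).get("sparkConf", {})
    # Assumption: "es.port", if present, replaces the default REST port.
    port = int(conf.get("es.port", default_port))
    nodes = []
    for entry in conf.get("es.nodes", "").split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty items such as a trailing comma
        host, sep, explicit_port = entry.partition(":")
        nodes.append((host, int(explicit_port) if sep else port))
    return nodes


def is_reachable(host, port, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections and timeouts
        return False


if __name__ == "__main__":
    # Assumption: the script is run from the engine directory.
    with open("engine.json") as f:
        for host, port in parse_es_nodes(f.read()):
            status = "ok" if is_reachable(host, port) else "UNREACHABLE"
            print(f"{host}:{port} {status}")
```

A successful TCP connect only shows that the port is open from this machine; it does not prove that authentication or the cluster itself is healthy.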

This is how the REST client needs to be configured, rather than through pio-env.sh. PIO itself never uses the REST client, only the UR does, and we will shortly move PredictionIO to using the REST API exclusively. At that point the config will move to pio-env.sh. For now, use the above to change anything needed by the Elasticsearch-Hadoop library, which writes the model to ES, and the REST API, which creates indexes and aliases.

We use v1.7.6 and it works fine. 

Also make sure to go through the entire workflow after any engine.json change: build, train, deploy.
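
The build, train, deploy sequence could be scripted roughly as follows. This is only a sketch: it assumes the pio CLI is on the PATH and that the script is run from the engine directory, and the run_workflow helper with its injectable runner is hypothetical. Keep in mind that pio deploy typically stays in the foreground serving queries, so the last step does not return until the server stops.

```python
import subprocess

# The workflow steps named above, in the order they need to run.
WORKFLOW = [["pio", "build"], ["pio", "train"], ["pio", "deploy"]]


def run_workflow(runner=subprocess.run, cwd="."):
    """Run each step in order and stop at the first failing step.

    Returns the list of commands that completed successfully.
    """
    done = []
    for cmd in WORKFLOW:
        try:
            # check=True makes subprocess.run raise on a non-zero exit code.
            runner(cmd, cwd=cwd, check=True)
        except subprocess.CalledProcessError:
            break  # do not train or deploy on top of a failed earlier step
        done.append(cmd)
    return done


if __name__ == "__main__":
    finished = run_workflow()
    print("completed:", [" ".join(c) for c in finished])
```

Stopping at the first failure avoids deploying a stale model when build or train did not pick up the engine.json change.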


