Posted to user@predictionio.apache.org by bala vivek <ba...@gmail.com> on 2018/04/12 12:34:31 UTC

Hbase issue

Hi,

I use PIO 0.10.0 and HBase 1.2.4. The setup was working fine until this
morning, when I found PIO was down because a mount on the server had run
out of space; I cleared the unwanted files.

After doing a pio-stop-all and pio-start-all, the HMaster service is not
working. I have tried restarting PIO multiple times.

Whenever I do a pio-stop-all and check the services using jps, HMaster
still seems to be running. I also tried running the ./start-hbase.sh
script, but pio status still does not report success.
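
For reference, the sequence I am running, plus two extra sanity checks
(the df and nc probes are additions for diagnosis; 2181 is the ZooKeeper
client port shown in the log below, and a healthy ZooKeeper replies "imok"
to the four-letter command ruok):

df -h                           # confirm the mount really has free space again
pio-stop-all
jps                             # HMaster should not be listed after a stop
pio-start-all
pio status                      # fails with the log below
echo ruok | nc localhost 2181   # no reply here means ZooKeeper is not up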

pio error log:

[INFO] [Console$] Inspecting PredictionIO...
[INFO] [Console$] PredictionIO 0.10.0-incubating is installed at
/opt/tools/PredictionIO-0.10.0-incubating
[INFO] [Console$] Inspecting Apache Spark...
[INFO] [Console$] Apache Spark is installed at
/opt/tools/PredictionIO-0.10.0-incubating/vendors/spark-1.6.3-bin-hadoop2.6
[INFO] [Console$] Apache Spark 1.6.3 detected (meets minimum requirement of
1.3.0)
[INFO] [Console$] Inspecting storage backend connections...
[INFO] [Storage$] Verifying Meta Data Backend (Source: ELASTICSEARCH)...
[INFO] [Storage$] Verifying Model Data Backend (Source: LOCALFS)...
[INFO] [Storage$] Verifying Event Data Backend (Source: HBASE)...
[ERROR] [RecoverableZooKeeper] ZooKeeper exists failed after 1 attempts
[ERROR] [ZooKeeperWatcher] hconnection-0x7c891ba7, quorum=localhost:2181,
baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
[WARN] [ZooKeeperRegistry] Can't retrieve clusterId from Zookeeper
[ERROR] [StorageClient] Cannot connect to ZooKeeper (ZooKeeper ensemble:
localhost). Please make sure that the configuration is pointing at the
correct ZooKeeper ensemble. By default, HBase manages its own ZooKeeper, so
if you have not configured HBase to use an external ZooKeeper, that means
your HBase is not started or configured properly.
[ERROR] [Storage$] Error initializing storage client for source HBASE
[ERROR] [Console$] Unable to connect to all storage backends successfully.
The following shows the error message from the storage backend.
[ERROR] [Console$] Data source HBASE was not properly initialized.
(org.apache.predictionio.data.storage.StorageClientException)
[ERROR] [Console$] Dumping configuration of initialized storage backend
sources. Please make sure they are correct.
[ERROR] [Console$] Source Name: ELASTICSEARCH; Type: elasticsearch;
Configuration: TYPE -> elasticsearch, HOME ->
/opt/tools/PredictionIO-0.10.0-incubating/vendors/elasticsearch-1.7.3
[ERROR] [Console$] Source Name: LOCALFS; Type: localfs; Configuration: PATH
-> /root/.pio_store/models, TYPE -> localfs
[ERROR] [Console$] Source Name: HBASE; Type: (error); Configuration: (error)


Regards,
Bala

Re: Hbase issue

Posted by bala vivek <ba...@gmail.com>.
Hi Donald,

Yes, I'm running on a single machine. PIO, HBase, Elasticsearch, and Spark
all run on the same server. Please let me know which files I need to
remove, because I have client data present in PIO.

I have tried adding the entries to hbase-site.xml described in the
following link, after which HMaster appears active, but the error remains
the same.

https://medium.com/@tjosepraveen/cant-get-connection-to-zookeeper-keepererrorcode-connectionloss-for-hbase-63746fbcdbe7
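
The entries in question are along these lines (a sketch: the property names
are standard HBase ZooKeeper settings, the client port follows what appears
in my logs below, and the dataDir path is a placeholder):

<property>
  <name>hbase.zookeeper.quorum</name>
  <value>localhost</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2182</value>
</property>
<property>
  <name>hbase.zookeeper.property.dataDir</name>
  <!-- placeholder path; point this at a writable directory -->
  <value>/opt/tools/hbase-data/zookeeper</value>
</property>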


HBase error logs (I have masked the server name):

2018-04-13 04:31:28,246 INFO  [RS:0;VD500042:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL (unknown error)
2018-04-13 04:31:28,247 WARN  [RS:0;XXXXXX:49584-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x162be5554b90003 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-04-13 04:31:28,553 ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Master not initialized after 200000ms seconds
        at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:225)
        at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:449)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:225)
        at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:137)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2436)

I have tried pio-stop-all and pio-start-all multiple times, but no luck;
the service is not up.
If I have to install HBase on its own within the existing setup, let me
know what I should consider. If anyone has faced this issue, please share
the solution steps.
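
My plan, if it comes to that, would be to unpack HBase separately and point
pio-env.sh at it, roughly like this (a sketch; the variable names are the
standard pio-env.sh settings, and the install path is a placeholder):

# pio-env.sh (sketch): point PIO at a standalone HBase install
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/opt/tools/hbase-1.2.4   # placeholder path
HBASE_CONF_DIR=$PIO_STORAGE_SOURCES_HBASE_HOME/conf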


Re: Hbase issue

Posted by bala vivek <ba...@gmail.com>.
Hi Donald,

The link was good, but here is what I observe.

- I don't see any separate ZooKeeper folder inside the HBase directory, and
a Unix "find" for the terms 'snapshot' and 'log' returns too much output to
be useful.
- hbase hbck -repair and hbase hbck -repairHoles both complete without any
error, but on running a plain 'hbase hbck' I can see the following errors
(the exact commands are recapped in the sketch after the logs).

2018-04-15 12:29:04,158 ERROR [main] client.ConnectionManager$HConnectionImplementation: Can't get connection to ZooKeeper: KeeperErrorCode = ConnectionLoss for /hbase
2018-04-15 12:29:04,158 WARN  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-04-15 12:29:04,158 INFO  [main] client.RpcRetryingCaller: Call exception, tries=18, retries=35, started=518330 ms ago, cancelled=false, msg=
2018-04-15 12:29:05,259 INFO  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2182. Will not attempt to authenticate using SASL (unknown error)
2018-04-15 12:29:05,259 WARN  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2018-04-15 12:29:05,359 INFO  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL (unknown error)
2018-04-15 12:29:05,360 WARN  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
^C2018-04-15 12:29:05,455 INFO  [Thread-3] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x0
2018-04-15 12:29:05,461 INFO  [Thread-3] zookeeper.ZooKeeper: Session: 0x0 closed
2018-04-15 12:29:05,461 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down


And a few more errors like these:

2018-04-15 12:29:02,954 INFO  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Opening socket connection to server localhost/0:0:0:0:0:0:0:1:2182. Will not attempt to authenticate using SASL (unknown error)
2018-04-15 12:29:02,954 WARN  [main-SendThread(localhost:2182)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
   at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)


I can see that the hostname/IP is correctly configured on the server where
I run PIO.
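
For completeness, the exact checks I ran, plus two extra probes on the
ZooKeeper port (the netstat/nc probes are additions, assuming those tools
are installed; 2182 is the port from the logs above):

# Both repairs completed without reporting errors:
hbase hbck -repair
hbase hbck -repairHoles
# The plain consistency check is what fails with the errors above:
hbase hbck
# Is anything listening on the ZooKeeper port at all?
netstat -tlnp | grep 2182
# A healthy ZooKeeper answers "imok" to the four-letter command ruok:
echo ruok | nc localhost 2182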

Regards,

Bala


Re: Hbase issue

Posted by Donald Szeto <do...@apache.org>.
Hi Bala,

Please take a look at
http://predictionio.apache.org/resources/faq/#running-hbase, specifically
"Q: How to fix HBase issues after cleaning up a disk that was full?".

Regards,
Donald


Re: Hbase issue

Posted by Pat Ferrel <pa...@occamsmachete.com>.
This may seem unhelpful now, but for others it might be useful to mention some minimum best practices for running PIO in production:

1) PIO should IMO never be run in production on a single node. When all services share the same memory, CPU, and disk, it is very difficult to find the root cause of a problem.
2) Back up data periodically with pio export (see the sketch after this list).
3) Install monitoring for disk usage, as well as response times and other factors, so you get warnings before you get wedged.
4) PIO will store data forever. It is designed as an input-only system; nothing is ever dropped. This is clearly unworkable in real life, so a feature was added in PIO 0.12.0 to trim the event stream in a safe way. There is a separate template for trimming the DB and doing other things like deduplication and other compression, on a schedule that can and should be different from training. Do not use this template until you upgrade, and make sure it is compatible with your template: https://github.com/actionml/db-cleaner
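
A minimal sketch of the backup in point 2 (the app id, output path, and
schedule are placeholders):

# Export all events for app 1 to a dated directory:
pio export --appid 1 --output /backups/pio/events-$(date +%F)

# Example cron entry to do this nightly at 02:00 (note the escaped % signs,
# which cron would otherwise treat as line separators):
0 2 * * * /opt/tools/PredictionIO-0.10.0-incubating/bin/pio export --appid 1 --output /backups/pio/events-$(date +\%F)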


Re: Hbase issue

Posted by Donald Szeto <do...@apache.org>.
Hi Bala,

Are you running a single-machine HBase setup? The ZooKeeper embedded in
such a setup is pretty fragile to disk space issues, and your ZNodes might
have been corrupted.

If that’s indeed your setup, please take a look at the HBase log files,
specifically at messages from ZooKeeper. In this situation, one way to
recover is to remove the ZooKeeper files and let HBase recreate them,
assuming from your log output that you don’t have other services depending
on the same ZK.
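
A sketch of that recovery path, assuming the default standalone layout
where the embedded ZooKeeper keeps its data under hbase.tmp.dir (check
hbase.zookeeper.property.dataDir in your hbase-site.xml before deleting
anything):

pio-stop-all
# Default standalone location is assumed here; verify yours first:
rm -rf /tmp/hbase-$(whoami)/zookeeper
pio-start-all
pio status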

Regards,
Donald
