Posted to user@cassandra.apache.org by Giampaolo Trapasso <gi...@radicalbit.io> on 2016/01/07 09:09:23 UTC

Using CCM with Opscenter and manual agent installation

Hi to all,

I used CCM to create a local 4-node cluster (127.0.0.1-127.0.0.4):

ccm create -v 2.1.5 -n 4 gp

I'm trying to install the agents manually for use with OpsCenter
(5.2.3.2015121015). I started OpsCenter after removing the JMX port setting,
so the cluster configuration is:

[jmx]
username =
password =

[agents]

[cassandra]
username =
seed_hosts = 127.0.0.1,127.0.0.2,127.0.0.3,127.0.0.4
password =
cql_port = 9042

I've configured all four agents. For example, the *agent3* configuration is:

[Giampaolo]: ~/opscenter/> cat agent3/conf/address.yaml
stomp_interface: "127.0.0.1"

agent_rpc_interface: 127.0.0.3
jmx_host: 127.0.0.3
jmx_port: 7300

I started OpsCenter in the foreground. As soon as I start *agent4*, the
OpsCenter console shows this error:

2015-12-30 17:46:21+0100 [gp] ERROR: The state of the following nodes
could not be determined, most likely due to agents on those nodes not
being properly connected: [<Node 127.0.0.4='4611686018427387904'>,
<Node 127.0.0.3='0'>, <Node 127.0.0.2='-4611686018427387904'>, <Node
127.0.0.1='-9223372036854775808'>]
2015-12-30 17:46:24+0100 [gp]  INFO: Agent for ip 127.0.0.4 is version None
2015-12-30 17:46:24+0100 [gp]  INFO: Agent for ip 127.0.0.4 is version u'5.2.3'
2015-12-30 17:46:37+0100 [gp]  INFO: Nodes without agents: 127.0.0.3,
127.0.0.2, 127.0.0.1
2015-12-30 17:46:51+0100 [gp] ERROR: The state of the following nodes
could not be determined, most likely due to agents on those nodes not
being properly connected: [<Node 127.0.0.4='4611686018427387904'>,
<Node 127.0.0.3='0'>, <Node 127.0.0.2='-4611686018427387904'>, <Node
127.0.0.1='-9223372036854775808'>]

I'm a bit confused: the log first reports that an agent was found for node 4,
but shortly afterwards the same error appears on the console again.

The agent 4 log shows no errors at first (only INFO severity):

[Giampaolo]: ~/opscenter/>agent4/bin/datastax-agent -f
  INFO [main] 2015-12-30 17:46:20,399 Loading conf files: ./conf/address.yaml
  INFO [main] 2015-12-30 17:46:20,461 Java vendor/version: Java
HotSpot(TM) 64-Bit Server VM/1.8.0_40
  INFO [main] 2015-12-30 17:46:20,462 DataStax Agent version: 5.2.3
  INFO [main] 2015-12-30 17:46:20,489 Default config values:
{:cassandra_port 9042, :rollups300_ttl 2419200,
:finished-request-cache-size 100, :settings_cf "settings",
:agent_rpc_interface "127.0.0.4", :restore_req_update_period 60,
:my_channel_prefix "/agent", :poll_period 60,
:monitored_cassandra_pass "*REDACTED*", :thrift_conn_timeout 10000,
:cassandra_pass "*REDACTED*", :rollups60_ttl 604800, :stomp_port
61620, :shorttime_interval 10, :longtime_interval 300,
:max-seconds-to-sleep 25, :private-conf-props {:cassandra.yaml
#{"broadcast_address" "rpc_address" "broadcast_rpc_address"
"listen_address" "initial_token"}, :cassandra-rackdc.properties #{}},
:thrift_port 9160, :agent-conf-group "global-cluster-agent-group",
:jmx_host "127.0.0.4", :ec2_metadata_api_host "169.254.169.254",
:metrics_enabled 1, :async_queue_size 5000, :backup_staging_dir nil,
:remote_verify_max 30000, :disk_usage_update_period 60,
:throttle-bytes-per-second 500000, :rollups7200_ttl 31536000,
:trace_delay 300, :remote_backup_retries 3, :cassandra_user
"*REDACTED*", :ssl_keystore nil, :rollup_snapshot_period 300,
:is_package false, :monitor_command
"/usr/share/datastax-agent/bin/datastax_agent_monitor",
:thrift_socket_timeout 5000, :remote_verify_initial_delay 1000,
:cassandra_log_location "/var/log/cassandra/system.log",
:restore_on_transfer_failure false, :ssl_keystore_password
"*REDACTED*", :tmp_dir "/var/lib/datastax-agent/tmp/",
:monitored_thrift_port 9160, :config_md5 nil, :jmx_port 7400,
:jmx_metrics_threadpool_size 4, :use_ssl 0, :max_pending_repairs 5,
:rollups86400_ttl 0, :monitored_cassandra_user "*REDACTED*",
:nodedetails_threadpool_size 3, :api_port 61621,
:monitored_ssl_keystore nil, :slow_query_fetch_size 2000,
:kerberos_service nil, :backup_file_queue_max 10000,
:jmx_thread_pool_size 5, :production 1, :monitored_cassandra_port
9042, :runs_sudo 1, :max_file_transfer_attempts 30,
:config_encryption_active false, :running-request-cache-size 500,
:monitored_ssl_keystore_password "*REDACTED*", :stomp_interface
"127.0.0.1", :storage_keyspace "OpsCenter", :hosts ["127.0.0.1"],
:rollup_snapshot_threshold 300, :jmx_retry_timeout 30,
:unthrottled-default 10000000000, :multipart-chunk-size 5000000,
:remote_backup_retry_delay 5000, :sstableloader_max_heap_size nil,
:jmx_operations_pool_size 4, :slow_query_refresh 5,
:remote_backup_timeout 1000, :slow_query_ignore ["OpsCenter"
"dse_perf"], :max_reconnect_time 15000, :seconds-to-read-kill-channel
0.005, :slow_query_past 3600000, :realtime_interval 5, :pdps_ttl
259200}
  INFO [main] 2015-12-30 17:46:20,492 Waiting for the config from OpsCenter
  INFO [main] 2015-12-30 17:46:20,493 Attempting to determine
Cassandra's broadcast address through JMX
  INFO [main] 2015-12-30 17:46:20,494 Starting Stomp
  INFO [main] 2015-12-30 17:46:20,495 Starting up agent communcation
with OpsCenter.
  INFO [main] 2015-12-30 17:46:24,595 Reconnecting to a backup
OpsCenter instance
  INFO [main] 2015-12-30 17:46:24,597 SSL communication is disabled
  INFO [main] 2015-12-30 17:46:24,597 Creating stomp connection to
127.0.0.1:61620
  INFO [StompConnection receiver] 2015-12-30 17:46:24,603 Reconnecting in 0s.
  INFO [StompConnection receiver] 2015-12-30 17:46:24,608 Connected to
127.0.0.1:61620
  INFO [main] 2015-12-30 17:46:24,609 Starting Jetty server: {:join?
false, :ssl? false, :host "127.0.0.4", :port 61621}
  INFO [Initialization] 2015-12-30 17:46:24,608 Sleeping for 2s before
trying to determine IP over JMX again
  INFO [StompConnection receiver] 2015-12-30 17:46:24,681 Got new
config from OpsCenter [note values in address.yaml override those from
OpsCenter]: {:cassandra_port 9042, :rollups300_ttl 2419200,
:destinations [], :restore_req_update_period 1,
:monitored_cassandra_pass "*REDACTED*", :cassandra_pass "*REDACTED*",
:cassandra_rpc_interface "127.0.0.4", :rollups60_ttl 604800, :jmx_pass
"*REDACTED*", :thrift_port 9160, :ec2_metadata_api_host
"169.254.169.254", :metrics_enabled 1, :backup_staging_dir "",
:rollups7200_ttl 31536000, :cassandra_user "*REDACTED*", :jmx_user
"*REDACTED*", :metrics_ignored_column_families "",
:cassandra_log_location "/var/log/cassandra/system.log",
:monitored_thrift_port 9160, :config_md5
"e78e9aaea4de0b15ec94b11c6b2788d5", :provisioning 0, :use_ssl 0,
:max_pending_repairs 5, :rollups86400_ttl -1,
:monitored_cassandra_user "*REDACTED*", :api_port "61621",
:monitored_cassandra_port 9042, :storage_keyspace "OpsCenter", :hosts
["127.0.0.4"], :metrics_ignored_solr_cores "",
:metrics_ignored_keyspaces "system, system_traces, system_auth,
dse_auth, OpsCenter", :rollup_subscriptions [],
:jmx_operations_pool_size 4, :cassandra_install_location ""}
  INFO [StompConnection receiver] 2015-12-30 17:46:24,693 Couldn't get
broadcast address, will retry in five seconds.
  INFO [Jetty] 2015-12-30 17:46:24,715 Jetty server started
  INFO [Initialization] 2015-12-30 17:46:26,615 Sleeping for 4s before
trying to determine IP over JMX again
  INFO [StompConnection receiver] 2015-12-30 17:46:29,696 Couldn't get
broadcast address, will retry in five seconds.

However, after a while:

 INFO [qtp153482676-24] 2015-12-30 17:49:07,057 HTTP: :get
/cassandra/conf {:private_props "True"} - 500
 ERROR [qtp153482676-24] 2015-12-30 17:49:09,084 Unhandled route
Exception (:bad-permissions): Unable to locate the cassandra.yaml
configuration file. If your configuration file is not located with the
Cassandra install, please set the 'conf_location' option in the
Cassandra section of the OpsCenter cluster configuration file and
restart opscenterd. Checked the following locations:
/etc/dse/cassandra/cassandra.yaml, /etc/cassandra/conf/cassandra.yaml,
/etc/cassandra/cassandra.yaml
  INFO [qtp153482676-24] 2015-12-30 17:49:09,085 HTTP: :get
/cassandra/conf {:private_props "True"} - 500
  INFO [StompConnection receiver] 2015-12-30 17:49:09,845 Couldn't get
broadcast address, will retry in five seconds.
 ERROR [qtp153482676-19] 2015-12-30 17:49:11,102 Unhandled route
Exception (:bad-permissions): Unable to locate the cassandra.yaml
configuration file. If your configuration file is not located with the
Cassandra install, please set the 'conf_location' option in the
Cassandra section of the OpsCenter cluster configuration file and
restart opscenterd. Checked the following locations:
/etc/dse/cassandra/cassandra.yaml, /etc/cassandra/conf/cassandra.yaml,
/etc/cassandra/cassandra.yaml
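
A quick way to see why this lookup fails is to test the three paths from the
error message alongside the per-node files that CCM writes. This is only a
minimal sketch: the helper name check_yaml_paths is made up for illustration,
and the ~/.ccm/gp/node*/conf/ location in the usage line assumes CCM's default
layout for the cluster named gp created above.

```shell
# Hypothetical helper: report which candidate cassandra.yaml paths exist.
check_yaml_paths() {
  for p in "$@"; do
    if [ -f "$p" ]; then
      echo "found: $p"
    else
      echo "missing: $p"
    fi
  done
}
```

It could be called as: check_yaml_paths /etc/dse/cassandra/cassandra.yaml
/etc/cassandra/conf/cassandra.yaml /etc/cassandra/cassandra.yaml
"$HOME"/.ccm/gp/node*/conf/cassandra.yaml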

After this I tried adding the other agents, but I got strange results: it
looked as if two agents were attaching to the same C* node. I stopped there,
because I think this first error causes the others.

*Questions:*

   1. What does the error in the OpsCenter log mean?
   2. Is it somehow related to the error in the agent log?
   3. Am I missing something in the configuration (are more details needed)?
   4. Why does OpsCenter complain about a missing cassandra.yaml file?
   Shouldn't it be deployable on any host, even one without a local C*
   installation?

Thanks in advance,

giampaolo

PS: apologies for cross-posting the same question from
http://stackoverflow.com/q/34533785/1360888, but it has had no attention on
SO after a week.

Re: Using CCM with Opscenter and manual agent installation

Posted by Nick Bailey <ni...@datastax.com>.
Cassandra switched jmx to only bind to localhost, so I believe you just
need to change jmx_host to localhost for all conf files.
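
As a rough sketch of that change, the per-agent address.yaml files could be
regenerated as below. Only jmx_host differs from the agent3 file in the
original post. Several things here are assumptions: the function name
write_agent_confs is hypothetical, the agent1..agent4 directory names extend
the agent3/agent4 paths seen in the thread, and the JMX ports follow the 7N00
pattern suggested by 7300 and 7400. Note that it overwrites any existing
address.yaml.

```shell
# Hypothetical helper: write address.yaml for four agents under a base directory.
# Assumes directories agent1..agent4 and JMX ports 7100..7400.
write_agent_confs() {
  base="$1"
  for i in 1 2 3 4; do
    mkdir -p "$base/agent$i/conf"
    # Unquoted heredoc so that $i is expanded into the address and port.
    cat > "$base/agent$i/conf/address.yaml" <<EOF
stomp_interface: "127.0.0.1"

agent_rpc_interface: 127.0.0.$i
jmx_host: localhost
jmx_port: 7${i}00
EOF
  done
}
```

With the layout from the original post this would be run as
write_agent_confs ~/opscenter, followed by restarting each agent.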

On Thu, Jan 7, 2016 at 4:48 PM, Giampaolo Trapasso <
giampaolo.trapasso@radicalbit.io> wrote:

> Thanks Michael for the reply. I'm quite new to Cassandra, so it makes sense
> to explain the use case. I just want to try different choices of data
> modelling and compare the number of reads and writes. At the moment I'm not
> interested in a real stress test, I just want to understand the implications
> of my choices and, of course, to see OpsCenter in action. I thought the
> CCM+OpsCenter combo was a good choice. Do you think that there's something
> else that I can try? Thank you in advance.
>
> giampaolo
>
>
>
>
>
> 2016-01-07 19:24 GMT+01:00 Michael Shuler <mi...@pbandjelly.org>:
>
>> On 01/07/2016 12:22 PM, Michael Shuler wrote:
>> >> [<Node 127.0.0.4='4611686018427387904'>, <Node 127.0.0.3='0'>, <Node
>> > 127.0.0.2='-4611686018427387904'>, <Node
>> 127.0.0.1='-9223372036854775808'>]
>> >
>> > A couple of those, .4 and .2 are identical.
>>
>> Sorry, they are signed, so they are unique. (bad me.) Keep digging, I
>> guess.
>>
>> --
>> Michael
>>
>
>

Re: Using CCM with Opscenter and manual agent installation

Posted by Giampaolo Trapasso <gi...@radicalbit.io>.
Thanks Michael for the reply. I'm quite new to Cassandra, so it makes sense
to explain the use case. I just want to try different choices of data
modelling and compare the number of reads and writes. At the moment I'm not
interested in a real stress test, I just want to understand the implications
of my choices and, of course, to see OpsCenter in action. I thought the
CCM+OpsCenter combo was a good choice. Do you think that there's something
else that I can try? Thank you in advance.

giampaolo





2016-01-07 19:24 GMT+01:00 Michael Shuler <mi...@pbandjelly.org>:

> On 01/07/2016 12:22 PM, Michael Shuler wrote:
> >> [<Node 127.0.0.4='4611686018427387904'>, <Node 127.0.0.3='0'>, <Node
> > 127.0.0.2='-4611686018427387904'>, <Node
> 127.0.0.1='-9223372036854775808'>]
> >
> > A couple of those, .4 and .2 are identical.
>
> Sorry, they are signed, so they are unique. (bad me.) Keep digging, I
> guess.
>
> --
> Michael
>

Re: Using CCM with Opscenter and manual agent installation

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 01/07/2016 12:22 PM, Michael Shuler wrote:
>> [<Node 127.0.0.4='4611686018427387904'>, <Node 127.0.0.3='0'>, <Node
> 127.0.0.2='-4611686018427387904'>, <Node 127.0.0.1='-9223372036854775808'>]
> 
> A couple of those, .4 and .2 are identical.

Sorry, they are signed, so they are unique. (bad me.) Keep digging, I guess.
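
For reference, those four values are what an even split of the signed 64-bit
token range across four nodes produces. The sketch below assumes the range
runs from -2^63 to 2^63-1 (the Murmur3 partitioner's range) and that the
tokens were assigned evenly; the function name even_tokens is made up for
illustration.

```shell
# Hypothetical helper: print evenly spaced tokens for four nodes over the
# signed 64-bit range, starting at -2^63 and stepping by 2^62.
even_tokens() {
  step=4611686018427387904                 # 2^62, a quarter of the range
  token=$(( -9223372036854775807 - 1 ))    # -2^63, written so it cannot overflow
  for node in 1 2 3 4; do
    printf '127.0.0.%s %s\n' "$node" "$token"
    # Skip the final addition, which would overflow past 2^63-1.
    if [ "$node" -lt 4 ]; then
      token=$(( token + step ))
    fi
  done
}
```

The positive and negative values differ only in sign, which is why .4 and .2
look identical at a glance.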

-- 
Michael

Re: Using CCM with Opscenter and manual agent installation

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 01/07/2016 02:09 AM, Giampaolo Trapasso wrote:
> I installed with CCM a 4 nodes local cluster (127.0.0.1-127.0.0.4).
> 
> I'm trying to manually install the agents to use OpsCenter
<big snippage>

This is not intended to be discouraging, but the lack of response on SO
is likely due to few to no people using these tools together.  OpsCenter
is intended to be a resource/network/cluster monitoring and utility tool
for production clusters. CCM is intended to be a disposable local dev
scratch cluster tool. I'm not sure the use cases align well enough for it to
make sense (to me) to use them together.

That said, dig around your configurations in
~/.ccm/mycluster/node*/conf/ to see whether something is colliding, such as
overlapping IP addresses or node names. This is what makes me suspect that:

> [<Node 127.0.0.4='4611686018427387904'>, <Node 127.0.0.3='0'>, <Node
127.0.0.2='-4611686018427387904'>, <Node 127.0.0.1='-9223372036854775808'>]

A couple of those, .4 and .2 are identical.

This is CCM, after all, so blow your disposable test cluster away, clear
out any remnants in ~/.ccm/, start another, and try again!

-- 
Kind regards,
Michael

Re: Using CCM with Opscenter and manual agent installation

Posted by Giampaolo Trapasso <gi...@radicalbit.io>.
> I believe the issue is just jmx_host needing to be set to 'localhost'
Yes, that solved it. Thanks!

giampaolo


2016-01-08 5:17 GMT+01:00 Nick Bailey <ni...@datastax.com>:

> stomp_interface is the address to connect back to the central OpsCenter
> daemon with, so 127.0.0.1 should be correct. I believe the issue is just
> jmx_host needing to be set to 'localhost'
>
> On Thu, Jan 7, 2016 at 8:50 PM, Michael Shuler <mi...@pbandjelly.org>
> wrote:
>
>> On 01/07/2016 08:46 PM, Michael Shuler wrote:
>> > I'm not sure exactly what that service is, but if all 4 nodes (which are
>> > all really localhost aliases) are attempting to bind to the same IP:port
>> > for that stomp connection, they could be stepping on one another. Should
>> > those be 127.0.0.1 for node1, 127.0.0.12 for node2, etc.?
>>
>> Since accurate typing is eluding me..
>>
>> Should the stomp connection be 127.0.0.1 for node1, 127.0.0.2 for node2,
>> 127.0.0.3 for node3, 127.0.0.4 for node4?
>>
>> --
>> :)
>> Michael
>>
>
>

Re: Using CCM with Opscenter and manual agent installation

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 01/07/2016 10:17 PM, Nick Bailey wrote:
> stomp_interface is the address to connect back to the central OpsCenter
> daemon with, so 127.0.0.1 should be correct. I believe the issue is just
> jmx_host needing to be set to 'localhost'

This indeed looks promising, thanks Nick!

mshuler@hana:~$ ccm status
Cluster: 'test'
---------------
node1: UP
node3: UP
node2: UP
mshuler@hana:~$ netstat -ltunp|egrep '7.00'|grep java
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:7100          0.0.0.0:*               LISTEN      19006/java
tcp        0      0 127.0.0.1:7200          0.0.0.0:*               LISTEN      18994/java
tcp        0      0 127.0.0.1:7300          0.0.0.0:*               LISTEN      19021/java
tcp        0      0 127.0.0.3:7000          0.0.0.0:*               LISTEN      19021/java
tcp        0      0 127.0.0.2:7000          0.0.0.0:*               LISTEN      18994/java
tcp        0      0 127.0.0.1:7000          0.0.0.0:*               LISTEN      19006/java
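
The same check can be narrowed to just the JMX listeners with a small filter.
This is a sketch under assumptions: the function name jmx_listeners is
hypothetical, it expects netstat-style lines on standard input with the local
address in the fourth column, and it treats any port of the form 7N00 (N from
1 to 9) as a CCM JMX port, based on the 7100, 7200 and 7300 ports above.

```shell
# Hypothetical helper: read netstat-style lines on stdin and print the local
# address of TCP listeners whose port matches 7100, 7200, ... 7900.
jmx_listeners() {
  awk '$1 ~ /^tcp/ && $4 ~ /:7[1-9]00$/ { print $4 }'
}
```

For example, netstat -ltn | jmx_listeners should print only 127.0.0.1
addresses if JMX is bound to localhost for every node, while the storage
port 7000 lines on 127.0.0.2 and 127.0.0.3 are filtered out.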

-- 
Kind regards,
Michael

Re: Using CCM with Opscenter and manual agent installation

Posted by Nick Bailey <ni...@datastax.com>.
stomp_interface is the address to connect back to the central OpsCenter
daemon with, so 127.0.0.1 should be correct. I believe the issue is just
jmx_host needing to be set to 'localhost'

On Thu, Jan 7, 2016 at 8:50 PM, Michael Shuler <mi...@pbandjelly.org>
wrote:

> On 01/07/2016 08:46 PM, Michael Shuler wrote:
> > I'm not sure exactly what that service is, but if all 4 nodes (which are
> > all really localhost aliases) are attempting to bind to the same IP:port
> > for that stomp connection, they could be stepping on one another. Should
> > those be 127.0.0.1 for node1, 127.0.0.12 for node2, etc.?
>
> Since accurate typing is eluding me..
>
> Should the stomp connection be 127.0.0.1 for node1, 127.0.0.2 for node2,
> 127.0.0.3 for node3, 127.0.0.4 for node4?
>
> --
> :)
> Michael
>

Re: Using CCM with Opscenter and manual agent installation

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 01/07/2016 08:46 PM, Michael Shuler wrote:
> I'm not sure exactly what that service is, but if all 4 nodes (which are
> all really localhost aliases) are attempting to bind to the same IP:port
> for that stomp connection, they could be stepping on one another. Should
> those be 127.0.0.1 for node1, 127.0.0.12 for node2, etc.?

Since accurate typing is eluding me..

Should the stomp connection be 127.0.0.1 for node1, 127.0.0.2 for node2,
127.0.0.3 for node3, 127.0.0.4 for node4?

-- 
:)
Michael

Re: Using CCM with Opscenter and manual agent installation

Posted by Michael Shuler <mi...@pbandjelly.org>.
On 01/07/2016 02:09 AM, Giampaolo Trapasso wrote:
> I've configured all the four agents. For example /agent3/ configuration is
> 
> [Giampaolo]: ~/opscenter/> cat agent3/conf/address.yaml
> stomp_interface: "127.0.0.1"
> agent_rpc_interface: 127.0.0.3
> jmx_host: 127.0.0.3
> jmx_port: 7300

This looks suspect. Each agent is configured for
stomp_interface:"127.0.0.1"?

I'm not sure exactly what that service is, but if all 4 nodes (which are
all really localhost aliases) are attempting to bind to the same IP:port
for that stomp connection, they could be stepping on one another. Should
those be 127.0.0.1 for node1, 127.0.0.12 for node2, etc.?

-- 
Michael