Posted to dev@cassandra.apache.org by Kyle Quest <kc...@gmail.com> on 2011/10/31 04:21:04 UTC

cassandra.yaml mysteries :-)

I noticed a couple of things about the yaml configs in Cassandra:

seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "127.0.0.1" <-- question 1 and 2

1. Why use yaml and then resort to manual parsing of the "seeds"
value? Why not let yaml do all of the parsing?
2. If "parameters" is a map (Map<String, String>) then why use the
"list" notation (dash in front of "seeds"), which really makes
"parameters" a list of maps... The actual Cassandra code then tries to
work around this list of maps behavior by explicitly grabbing the
first element in the list.
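
To make the two questions concrete, here is a minimal Java sketch of the data shape being described. It is not Cassandra's code: the class and method names are hypothetical, and the list is built by hand to stand in for what a YAML parser would return for the "- seeds:" form, so that the example has no library dependency.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Cassandra.
class SeedParamsShape {

    // Stand-in for the parsed result of
    //   parameters:
    //       - seeds: "<ip1>,<ip2>"
    // The dash makes "parameters" a one-element list whose element is a map.
    static List<Map<String, String>> parsedParameters(String seedsValue) {
        Map<String, String> entry = new LinkedHashMap<>();
        entry.put("seeds", seedsValue);
        List<Map<String, String>> parameters = new ArrayList<>();
        parameters.add(entry);
        return parameters;
    }

    // Question 2: the workaround of treating the first list element as the map.
    static Map<String, String> firstElement(List<Map<String, String>> parameters) {
        return parameters.get(0);
    }

    // Question 1: the manual parsing of the comma-delimited "seeds" string.
    static List<String> splitSeeds(Map<String, String> parameters) {
        List<String> seeds = new ArrayList<>();
        String raw = parameters.get("seeds");
        if (raw == null) {
            return seeds;
        }
        for (String host : raw.split(",")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                seeds.add(trimmed);
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        List<Map<String, String>> parameters = parsedParameters("127.0.0.1,127.0.0.2");
        System.out.println(splitSeeds(firstElement(parameters)));
    }
}
```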

Re: cassandra.yaml mysteries :-)

Posted by Sylvain Lebresne <sy...@datastax.com>.
On Mon, Oct 31, 2011 at 4:36 PM, Kyle Quest <kc...@gmail.com> wrote:
> Thanks for the explanation, Sylvain! If we are talking about generic, then it
> should be Map<String,Object>. This way you don't restrict the data
> type and you let the yaml lib parse the data. With Map<String,Object>
> my version of SimpleSeedProvider accepts these kinds of configs without
> doing extra text parsing:
>
> version a:
>
> seed_provider:
>    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>      parameters:
>          seeds: 127.0.0.1
>
> version b:
>
> seed_provider:
>    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>      parameters:
>          seeds: [127.0.0.1, 127.0.0.2, 127.0.0.3]
>
>
> Map<String,Object> also allows my custom seed providers to have
> complex configuration parameters parsed by the yaml lib too.

Sure, we could do that. But I'm pretty sure we don't want to break
the compatibility of the config file just for that.

--
Sylvain


Re: cassandra.yaml mysteries :-)

Posted by Kyle Quest <kc...@gmail.com>.
Thanks for the explanation, Sylvain! If we are talking about generic, then it
should be Map<String,Object>. This way you don't restrict the data
type and you let the yaml lib parse the data. With Map<String,Object>
my version of SimpleSeedProvider accepts these kinds of configs without
doing extra text parsing:

version a:

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          seeds: 127.0.0.1

version b:

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          seeds: [127.0.0.1, 127.0.0.2, 127.0.0.3]


Map<String,Object> also allows my custom seed providers to have
complex configuration parameters parsed by the yaml lib too.
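
A minimal sketch of how a provider might read "seeds" from a Map<String,Object>, covering both version a (a scalar) and version b (a list). This is not the modified SimpleSeedProvider mentioned above, which is not shown in this thread; the class and method names are hypothetical. The comma split on the scalar branch is an added assumption, included so that the existing "<ip1>,<ip2>" string form would still be readable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the code discussed in the thread.
class FlexibleSeedParams {

    // Accepts either a scalar ("seeds: 127.0.0.1") or a YAML sequence
    // ("seeds: [127.0.0.1, 127.0.0.2]") as already parsed by the yaml lib.
    static List<String> seedsFrom(Map<String, Object> parameters) {
        Object value = parameters.get("seeds");
        List<String> seeds = new ArrayList<>();
        if (value instanceof List) {
            // Version b: the yaml lib has already produced a list.
            for (Object item : (List<?>) value) {
                addIfPresent(seeds, String.valueOf(item));
            }
        } else if (value != null) {
            // Version a: a single scalar. Assumption: also split on commas
            // so the older comma-delimited string form keeps working.
            for (String host : String.valueOf(value).split(",")) {
                addIfPresent(seeds, host);
            }
        }
        return seeds;
    }

    private static void addIfPresent(List<String> seeds, String host) {
        String trimmed = host.trim();
        if (!trimmed.isEmpty()) {
            seeds.add(trimmed);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> versionA =
                Collections.<String, Object>singletonMap("seeds", "127.0.0.1");
        Map<String, Object> versionB = Collections.<String, Object>singletonMap(
                "seeds", Arrays.asList("127.0.0.1", "127.0.0.2", "127.0.0.3"));
        System.out.println(seedsFrom(versionA));
        System.out.println(seedsFrom(versionB));
    }
}
```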


On Mon, Oct 31, 2011 at 2:01 AM, Sylvain Lebresne <sy...@datastax.com> wrote:
> Because a seed_provider can be custom; you can write your own. The only
> one we ship by default is the SimpleSeedProvider, but you can create your
> own that, say, queries some service over the network to get the list of seeds.
> So the parameters have to be generic for that to work, and having
> the parameters be a Map<String, String> is simple and generic enough.
>
> --
> Sylvain
>
> On Mon, Oct 31, 2011 at 4:21 AM, Kyle Quest <kc...@gmail.com> wrote:
>> I noticed a couple of things about the yaml configs in Cassandra:
>>
>> seed_provider:
>>    # Addresses of hosts that are deemed contact points.
>>    # Cassandra nodes use this list of hosts to find each other and learn
>>    # the topology of the ring. You must change this if you are running
>>    # multiple nodes!
>>    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
>>      parameters:
>>          # seeds is actually a comma-delimited list of addresses.
>>          # Ex: "<ip1>,<ip2>,<ip3>"
>>          - seeds: "127.0.0.1" <-- question 1 and 2
>>
>> 1. Why use yaml and then resort to manual parsing of the "seeds"
>> value? Why not let yaml do all of the parsing?
>> 2. If "parameters" is a map (Map<String, String>) then why use the
>> "list" notation (dash in front of "seeds"), which really makes
>> "parameters" a list of maps... The actual Cassandra code then tries to
>> work around this list of maps behavior by explicitly grabbing the
>> first element in the list.
>>
>

Re: cassandra.yaml mysteries :-)

Posted by Sylvain Lebresne <sy...@datastax.com>.
Because a seed_provider can be custom; you can write your own. The only
one we ship by default is the SimpleSeedProvider, but you can create your
own that, say, queries some service over the network to get the list of seeds.
So the parameters have to be generic for that to work, and having
the parameters be a Map<String, String> is simple and generic enough.
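
As a rough illustration of that point, the sketch below shows two providers that take the same generic Map<String, String> but interpret it differently. Everything in it is hypothetical: SeedSource is a stand-in interface rather than Cassandra's own, the "endpoint" key is invented, and the network call is replaced by an injected function so the example runs without a real service.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a seed provider contract; not Cassandra's interface.
interface SeedSource {
    List<String> getSeeds();
}

// Shared helper: turn "a, b, c" into [a, b, c]; null yields an empty list.
class SeedLists {
    static List<String> split(String raw) {
        List<String> seeds = new ArrayList<>();
        if (raw == null) {
            return seeds;
        }
        for (String host : raw.split(",")) {
            String trimmed = host.trim();
            if (!trimmed.isEmpty()) {
                seeds.add(trimmed);
            }
        }
        return seeds;
    }
}

// Reads a fixed, comma-delimited "seeds" parameter.
class StaticSeedSource implements SeedSource {
    private final List<String> seeds;

    StaticSeedSource(Map<String, String> parameters) {
        this.seeds = SeedLists.split(parameters.get("seeds"));
    }

    public List<String> getSeeds() {
        return new ArrayList<>(seeds);
    }
}

// Asks some service for the list. The "endpoint" key and the fetch function
// are assumptions; fetch stands in for whatever network client would be used.
class ServiceSeedSource implements SeedSource {
    private final String endpoint;
    private final Function<String, String> fetch;

    ServiceSeedSource(Map<String, String> parameters, Function<String, String> fetch) {
        this.endpoint = parameters.get("endpoint");
        this.fetch = fetch;
    }

    public List<String> getSeeds() {
        return SeedLists.split(fetch.apply(endpoint));
    }

    public static void main(String[] args) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put("endpoint", "http://seeds.example.invalid/list");
        // Fake lookup in place of a real network call.
        SeedSource source = new ServiceSeedSource(parameters, url -> "10.0.0.1,10.0.0.2");
        System.out.println(source.getSeeds());
    }
}
```

Both constructors receive only strings, so any richer structure has to be encoded and decoded by the provider itself, which is the trade-off being discussed.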

--
Sylvain
