Posted to dev@cassandra.apache.org by wateray <wa...@163.com> on 2015/11/20 18:07:05 UTC

Does the rebuild tool rebuild everything each time it starts, or only the remaining ranges?

We want to deploy one more data center for data safety.
While we were rebuilding one node's data from the old DC, the rebuild failed after some hours because of a network fault.
I can certainly restart the rebuild, but I am worried about what a restart does:
does it rebuild all of the token ranges that belong to the node, or only the ranges left over from the last rebuild? (We already received some data during the last rebuild.)

Looking at the source, I found this code.

Class RangeStreamer, method getRangeFetchMap:

private static Multimap<InetAddress, Range<Token>> getRangeFetchMap(Multimap<Range<Token>, InetAddress> rangesWithSources, Collection<ISourceFilter> sourceFilters, String keyspace)
    {
        Multimap<InetAddress, Range<Token>> rangeFetchMapMap = HashMultimap.create();
        for (Range<Token> range : rangesWithSources.keySet())
        {
            boolean foundSource = false;

            outer:
            for (InetAddress address : rangesWithSources.get(range))
            {
                if (address.equals(FBUtilities.getBroadcastAddress()))
                {
                    // If localhost is a source, we have found one, but we don't add it to the map to avoid streaming locally
                    foundSource = true;
                    continue;
                }

                for (ISourceFilter filter : sourceFilters)
                {
                    if (!filter.shouldInclude(address))
                        continue outer;
                }

                rangeFetchMapMap.put(address, range);
                foundSource = true;
                break; // ensure we only stream from one other node for each range
            }

            if (!foundSource)
                throw new IllegalStateException("unable to find sufficient sources for streaming range " + range + " in keyspace " + keyspace);
        }

        return rangeFetchMapMap;
    }

In the bold lines (the localhost check), when the address is localhost the loop uses continue, so it keeps looking for other sources and then puts one into rangeFetchMapMap.
I think the continue keyword should be break if the intent is to rebuild only the data the node does not already have. Is that right?
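
To make the question concrete, here is a minimal, self-contained sketch of the same loop shape. It is not Cassandra code: plain strings stand in for InetAddress and Range<Token>, the source filters are left out, and the class name FetchMapSketch, the method pickSources, and the addresses are all hypothetical. It assumes only what the snippet above shows, namely that the local address is skipped with continue and the first other candidate is taken.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FetchMapSketch
{
    // Hypothetical stand-in for FBUtilities.getBroadcastAddress().
    public static final String LOCAL = "10.0.1.1";

    // Maps each chosen source address to the ranges that would be fetched from it.
    public static Map<String, List<String>> pickSources(Map<String, List<String>> rangesWithSources)
    {
        Map<String, List<String>> fetchMap = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : rangesWithSources.entrySet())
        {
            boolean foundSource = false;
            for (String address : entry.getValue())
            {
                if (address.equals(LOCAL))
                {
                    // Same shape as the snippet: remember that a source exists, keep scanning.
                    foundSource = true;
                    continue;
                }
                fetchMap.computeIfAbsent(address, k -> new ArrayList<>()).add(entry.getKey());
                foundSource = true;
                break; // only one remote source per range
            }
            if (!foundSource)
                throw new IllegalStateException("no source for range " + entry.getKey());
        }
        return fetchMap;
    }

    public static void main(String[] args)
    {
        Map<String, List<String>> candidates = new LinkedHashMap<>();
        candidates.put("(0,100]", Arrays.asList(LOCAL, "10.0.0.7")); // local node listed first
        candidates.put("(100,200]", Arrays.asList("10.0.0.8"));
        System.out.println(pickSources(candidates));
    }
}
```

In this sketch the range (0,100] is still assigned to 10.0.0.7 even though the local address is one of its candidates. With break in place of continue, that range would be skipped whenever the local address comes before a remote one in the candidate list. Nothing in the loop itself looks at which ranges already arrived during an earlier run.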


Best regards!





 

Re: Streamsession's timeout is not reasonable

Posted by Paulo Motta <pa...@gmail.com>.
Please use the user@cassandra.apache.org list for Cassandra-related
questions. This list (dev@cassandra.apache.org) is exclusively for Cassandra
development purposes.

Thanks,

Paulo

2015-12-01 0:50 GMT-08:00 wateray <wa...@163.com>:

>
> Following up on my previous message: after the rebuild failed some hours
> in, we found that the cause was a timeout.
> The incoming socket on the transfer side hit its read timeout
> (streaming_socket_timeout_in_ms, one hour by default), and then the whole
> stream session failed.
>
> As the rebuild goes on, the transfer rate slows down, and the file being
> transferred cannot finish within the timeout. The transfer side did not
> receive a single byte in that time (it was waiting for the RECEIVED
> message), so the incoming socket raised a timeout.
>
> The incoming and outgoing sides both belong to the same stream session. To
> decide whether the session has timed out, we cannot check the incoming side
> alone, because the outgoing side may still be streaming (a file transfer
> can keep going for a long time, especially for a large file at low speed).
> In other words, while a file is being transferred, we should not raise a
> timeout.
>
> Question again:
>   Will a second rebuild stream all of the token ranges that belong to the
> node, or only the ranges left over from the last rebuild? (We already
> received some data during the last rebuild.)
>
> Please excuse me for my poor English.
>

Streamsession's timeout is not reasonable

Posted by wateray <wa...@163.com>.

Following up on my previous message: after the rebuild failed some hours in, we found that the cause was a timeout.
The incoming socket on the transfer side hit its read timeout (streaming_socket_timeout_in_ms, one hour by default), and then the whole stream session failed.


As the rebuild goes on, the transfer rate slows down, and the file being transferred cannot finish within the timeout. The transfer side did not receive a single byte in that time (it was waiting for the RECEIVED message), so the incoming socket raised a timeout.
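
The mechanism described above can be reproduced with the JDK alone. The following is a small sketch, not Cassandra's streaming code: the class name ReadTimeoutSketch and the method readTimesOut are hypothetical, and a 200 ms timeout on a loopback connection stands in for the one-hour streaming timeout. It shows that a blocking read with a read timeout set fails once no byte arrives within that time, regardless of anything else the connection's owner is doing.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutSketch
{
    // Returns true if a read on a silent connection hits the read timeout.
    public static boolean readTimesOut(int timeoutMillis) throws IOException
    {
        // The peer accepts the connection but never writes anything, like a
        // receiver that has not yet sent its acknowledgement.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept())
        {
            client.setSoTimeout(timeoutMillis); // read timeout on the incoming side
            InputStream in = client.getInputStream();
            try
            {
                in.read(); // blocks until a byte arrives or the timeout expires
                return false;
            }
            catch (SocketTimeoutException e)
            {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException
    {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```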


The incoming and outgoing sides both belong to the same stream session. To decide whether the session has timed out, we cannot check the incoming side alone, because the outgoing side may still be streaming (a file transfer can keep going for a long time, especially for a large file at low speed). In other words, while a file is being transferred, we should not raise a timeout.
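
One way to express this idea is sketched below. It is a hypothetical illustration, not Cassandra's StreamSession implementation: the class SessionActivitySketch and its methods are invented for this example, and timestamps are passed in explicitly so the behaviour is easy to follow. The session keeps a single "last activity" time that is updated by traffic in either direction, and it counts as timed out only when neither side has moved any data within the timeout.

```java
import java.util.concurrent.atomic.AtomicLong;

public class SessionActivitySketch
{
    private final long timeoutMillis;
    private final AtomicLong lastActivityMillis;

    public SessionActivitySketch(long timeoutMillis, long nowMillis)
    {
        this.timeoutMillis = timeoutMillis;
        this.lastActivityMillis = new AtomicLong(nowMillis);
    }

    // Called whenever bytes are read on the incoming side.
    public void onBytesReceived(long nowMillis) { touch(nowMillis); }

    // Called whenever bytes are written on the outgoing side.
    public void onBytesSent(long nowMillis) { touch(nowMillis); }

    private void touch(long nowMillis)
    {
        // Keep the latest timestamp even if the two sides update concurrently.
        lastActivityMillis.accumulateAndGet(nowMillis, Math::max);
    }

    // The session is idle only if neither direction moved data within the timeout.
    public boolean isTimedOut(long nowMillis)
    {
        return nowMillis - lastActivityMillis.get() > timeoutMillis;
    }

    public static void main(String[] args)
    {
        long oneHour = 3_600_000L; // the one-hour timeout mentioned above
        long minute = 60_000L;
        SessionActivitySketch session = new SessionActivitySketch(oneHour, 0L);

        // A large file is still being sent 59 minutes in; nothing has been received yet.
        session.onBytesSent(59 * minute);

        // At 61 minutes the incoming side alone has been silent for over an hour,
        // but the session as a whole was active two minutes ago.
        System.out.println(session.isTimedOut(61 * minute));

        // With no traffic in either direction for over an hour, the session times out.
        System.out.println(session.isTimedOut(121 * minute));
    }
}
```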


Question again:
  Will a second rebuild stream all of the token ranges that belong to the node, or only the ranges left over from the last rebuild? (We already received some data during the last rebuild.)

Please excuse me for my poor English.


