Posted to user@kudu.apache.org by helifu <hz...@corp.netease.com> on 2017/10/26 03:29:21 UTC

Re: How kudu synchronize real-time records?

Hi,

Currently, read/write operations are limited to the master replica (record1 on node1), and the copy replicas (record1 on node2/node3) can't be read or written by clients directly.


何李夫
2017-04-10 11:24:24

-----Original Message-----
From: user-return-1100-hzhelifu=corp.netease.com@kudu.apache.org [mailto:user-return-1100-hzhelifu=corp.netease.com@kudu.apache.org] On Behalf Of ??
Sent: 2017-10-26 10:43
To: user@kudu.apache.org
Subject: How kudu synchronize real-time records?

Hi!

I read in the documentation that 'once Kudu receives records from a client, it writes those records into the WAL (and also replicates them)'.

I wonder whether each node might load those records from the WAL at a different time.
Say node1 loads record1 from the WAL at t1, node2 at t2, and node3 at t3 (t1 < t2 < t3). A reading client attached to node1 can then see the record, but reading clients attached to the other nodes (node2, node3) might miss record1.

I assume that does not happen in Kudu, so I wonder how Kudu synchronizes real-time data.

Thanks!


Re: Re: Re: How kudu synchronize real-time records?

Posted by 기준 <0c...@gmail.com>.
@Todd Lipcon

I really appreciate your kind explanation!

Now I understand it clearly.

Have a nice day!



2017-10-27 4:09 GMT+09:00 Todd Lipcon <to...@cloudera.com>:
> What Helifu said is correct that writes are funneled through the leader.
>
> Reads can either be through the leader (which can perform immediately with
> full consistency) or at a follower. On a follower, the client can choose
> between the following:
>
> a) low consistency: read whatever the follower happens to have. Currently
> this mode is called READ_LATEST in the source but it should probably be
> called READ_ANYTHING or READ_INCONSISTENT. It reads "the latest thing that
> this replica has".
> b) snapshot consistency at current time: this may cause the follower to wait
> until it has heard from the leader and knows that it is up-to-date as of the
> time that the scan started. This gives the same guarantee as reading from
> the leader but can add some latency.
> c) snapshot consistency in the past: given a timestamp, the follower can
> know whether it is up-to-date as of that timestamp. If so, it can do a
> consistent read immediately. Otherwise, it will have to wait, as above.
>
> You can learn more about this in the recent blog post authored by David
> Alves at: https://kudu.apache.org/2017/09/18/kudu-consistency-pt1.html
> Also please check out the docs at:
> https://kudu.apache.org/docs/transaction_semantics.html
>
>
> Hope that helps
> -Todd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

Re: Re: Re: How kudu synchronize real-time records?

Posted by Todd Lipcon <to...@cloudera.com>.
What Helifu said is correct that writes are funneled through the leader.

Reads can either be through the leader (which can perform immediately with
full consistency) or at a follower. On a follower, the client can choose
between the following:

a) low consistency: read whatever the follower happens to have. Currently
this mode is called READ_LATEST in the source but it should probably be
called READ_ANYTHING or READ_INCONSISTENT. It reads "the latest thing that
this replica has".
b) snapshot consistency at current time: this may cause the follower to
wait until it has heard from the leader and knows that it is up-to-date as
of the time that the scan started. This gives the same guarantee as reading
from the leader but can add some latency.
c) snapshot consistency in the past: given a timestamp, the follower can
know whether it is up-to-date as of that timestamp. If so, it can do a
consistent read immediately. Otherwise, it will have to wait, as above.
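
The three follower-side options above can be sketched with a toy C++ model of a follower replica. This is not Kudu's implementation or its client API: FollowerReplica, ReadMode, Apply, AdvanceSafeTime and the integer timestamps are hypothetical names invented for illustration. The sketch assumes the follower tracks a "safe time" up to which it knows it has received every write from the leader.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model only: none of these names come from the Kudu client API.
enum class ReadMode {
  kLatest,    // option (a): return whatever this replica happens to have
  kSnapshot   // options (b) and (c): consistent read as of a timestamp
};

class FollowerReplica {
 public:
  // Apply a replicated write. In this model, writes arrive in timestamp order.
  void Apply(uint64_t ts, const std::string& row) {
    assert(ts > safe_time_);
    rows_[ts] = row;
    safe_time_ = ts;
  }

  // The leader reports that no write at or before `ts` is still outstanding.
  void AdvanceSafeTime(uint64_t ts) {
    if (ts > safe_time_) safe_time_ = ts;
  }

  // Fills `out` and returns true if the read can be served right now.
  // Returns false if a snapshot read at `snapshot_ts` would have to wait
  // until the follower has heard more from the leader.
  // `snapshot_ts` is ignored for ReadMode::kLatest.
  bool Read(ReadMode mode, uint64_t snapshot_ts,
            std::vector<std::string>* out) const {
    out->clear();
    if (mode == ReadMode::kSnapshot && snapshot_ts > safe_time_) {
      return false;  // not yet known to be up to date as of snapshot_ts
    }
    for (const auto& entry : rows_) {
      if (mode == ReadMode::kLatest || entry.first <= snapshot_ts) {
        out->push_back(entry.second);
      }
    }
    return true;
  }

 private:
  std::map<uint64_t, std::string> rows_;  // timestamp -> row contents
  uint64_t safe_time_ = 0;
};
```

In this model, a follower that has only applied a write at timestamp 10 serves a snapshot read at 10 immediately, but reports that a snapshot read at 20 must wait until it has caught up. A real follower would block and retry rather than return false.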

You can learn more about this in the recent blog post authored by David
Alves at: https://kudu.apache.org/2017/09/18/kudu-consistency-pt1.html
Also please check out the docs at:
https://kudu.apache.org/docs/transaction_semantics.html


Hope that helps
-Todd

On Thu, Oct 26, 2017 at 3:18 AM, helifu <hz...@corp.netease.com> wrote:

> Sorry for my mistake.
> The copy replicas can be read by clients using the following API in client.h:
>
>   Status SetSelection(KuduClient::ReplicaSelection selection)
>       WARN_UNUSED_RESULT;
>
>   enum ReplicaSelection {
>     LEADER_ONLY,      ///< Select the LEADER replica.
>
>     CLOSEST_REPLICA,  ///< Select the closest replica to the client,
>                       ///< or a random one if all replicas are equidistant.
>
>     FIRST_REPLICA     ///< Select the first replica in the list.
>   };
>
>
> 何李夫
> 2017-04-10 16:06:24
>


-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Re: How kudu synchronize real-time records?

Posted by helifu <hz...@corp.netease.com>.
Sorry for my mistake.
The copy replicas can be read by clients using the following API in client.h:

  Status SetSelection(KuduClient::ReplicaSelection selection)
      WARN_UNUSED_RESULT;

  enum ReplicaSelection {
    LEADER_ONLY,      ///< Select the LEADER replica.

    CLOSEST_REPLICA,  ///< Select the closest replica to the client,
                      ///< or a random one if all replicas are equidistant.

    FIRST_REPLICA     ///< Select the first replica in the list.
  };
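
As a rough illustration of what the three policies in that enum mean, here is a toy C++ sketch. It is not the Kudu client's selection code: Selection, ReplicaInfo, ChooseReplica and the integer distance metric are hypothetical names made up for this example. Unlike the behaviour described in the comments above, ties under the "closest" policy are broken by list order here instead of randomly, to keep the sketch deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real enum is KuduClient::ReplicaSelection.
enum class Selection { kLeaderOnly, kClosestReplica, kFirstReplica };

struct ReplicaInfo {
  std::string host;
  bool is_leader;
  int distance;  // made-up metric: smaller means closer to the client
};

// Returns the index of the chosen replica in `replicas`.
// Precondition: the list is non-empty and contains a leader.
std::size_t ChooseReplica(const std::vector<ReplicaInfo>& replicas,
                          Selection policy) {
  assert(!replicas.empty());
  std::size_t chosen = 0;  // kFirstReplica simply keeps the first entry
  for (std::size_t i = 0; i < replicas.size(); ++i) {
    if (policy == Selection::kLeaderOnly && replicas[i].is_leader) {
      return i;
    }
    if (policy == Selection::kClosestReplica &&
        replicas[i].distance < replicas[chosen].distance) {
      chosen = i;  // strictly closer; ties keep the earlier entry
    }
  }
  return chosen;
}
```

With a list of node2 (follower, distance 5), node1 (leader, distance 9) and node3 (follower, distance 2), the leader-only policy picks node1, the closest policy picks node3, and the first-replica policy picks node2.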


何李夫
2017-04-10 16:06:24

-----Original Message-----
From: user-return-1102-hzhelifu=corp.netease.com@kudu.apache.org [mailto:user-return-1102-hzhelifu=corp.netease.com@kudu.apache.org] On Behalf Of ??
Sent: 2017-10-26 13:50
To: user@kudu.apache.org
Subject: Re: Re: How kudu synchronize real-time records?

Thanks for replying.

It helps a lot.



Re: Re: How kudu synchronize real-time records?

Posted by 기준 <0c...@gmail.com>.
Thanks for replying.

It helps a lot.

2017-10-26 12:29 GMT+09:00 helifu <hz...@corp.netease.com>:
> Hi,
>
> Currently, read/write operations are limited to the master replica (record1 on node1), and the copy replicas (record1 on node2/node3) can't be read or written by clients directly.
>
>
> 何李夫
> 2017-04-10 11:24:24
>