Posted to dev@polygene.apache.org by zhuangmz08 <zh...@qq.com> on 2016/04/20 12:04:02 UTC

Re: event streaming data feed?

OK.
"We can query an application's state to find out the current state of the world, and this answers many questions. However there are times when we don't just want to see where we are, we also want to know how we got there.

Event Sourcing ensures that all changes to application state are stored as a sequence of events. Not just can we query these events, we can also use the event log to reconstruct past states, and as a foundation to automatically adjust the state to cope with retroactive changes."
So, event sourcing stores all the state changes of an Entity as an ordered sequence.
My use case is stock price event streaming. If I use the concept of event sourcing, can I regard StockPrice as an Entity, with <price, datetime> as its state?
public interface StockPrice extends EntityComposite {
    Property<Double> price();
    Property<LocalDateTime> datetime();
}
1. Every time the state changes, will it store the changes and notify the listeners simultaneously? Notification should have the highest priority, while storage could have a lower priority.
2. StockPrice data might be huge, many gigabytes every day. Can the [eventsourcing-jdbm] library handle this kind of data?
3. I'm using the test code [org.qi4j.library.eventsourcing.domain.DomainEventTest]. Are only change events stored? Is the creation event not stored? Each time, it stores the changed part <DomainEventValue> instead of the whole new state. DomainEventValue stores the parameters in JSON format. How can I use the replayer? The tutorial is too short to follow...

Thanks a lot.

------------------ Original Message ------------------
From: "Niclas Hedhman";<he...@gmail.com>;
Sent: Wednesday, April 20, 2016, 4:02 PM
To: "dev"<de...@zest.apache.org>;

Subject: Re: event streaming data feed?

I don't know. library-eventsourcing was contributed from a downstream
project, and I haven't worked with it. And perhaps it is not in scope of
what you want to do... See Martin Fowler's and Greg Young's definition of
Event Sourcing.

For general event streaming, we are planning to have more explicit support
in 3.x, similar to the persistence support. But it is currently unclear
what core features and SPI are needed for this, and use cases are most
welcome.

There are many ways you could integrate a Kafka consumer or producer into
Zest. My guess would be that a service listens to Kafka system events and
the service creates/destroys Zest resources when needed.

Note that entities will not be able to be Kafka listeners, as you will not
be able to maintain a valid UnitOfWork while waiting.

Hope that helps
On Apr 20, 2016 13:06, "zhuangmz08" <zh...@qq.com> wrote:

> Hi,
> Is there any mechanism to use event streaming / data feeding with Zest?
> I'd like something publish-subscribe like Apache Kafka. I have a data
> feed server and a set of data feed subscribers.
> Each time a feed client subscribes to a new topic (a new Value/Entity
> Composite type), the data feed server will add this subscriber to its
> listener list. As soon as the topic event takes place, it will notify all
> the related listeners.
> 1. Is event sourcing able to subscribe/unsubscribe listeners on demand?
> 2. Does event sourcing have to store every event? I don't need to persist
> every event. In other words, one of the listeners will handle event
> storage, while the other listeners will focus on real-time event stream
> handling.
>
>
> Thanks a lot.
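
The subscribe/unsubscribe-on-demand behaviour asked about above can be sketched in plain Java, without any Zest or Kafka API. This is only a minimal in-process illustration: FeedServer, its String topics and the "ACME" symbol are hypothetical names that do not come from the thread or from any library. One listener stores every event, the other only reacts in real time and is removed again later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process feed server: topics are plain strings, events are generic payloads.
final class FeedServer<E> {
    private final Map<String, List<Consumer<E>>> listeners = new ConcurrentHashMap<>();

    // Registers a listener for a topic and returns a handle that unsubscribes it again.
    Runnable subscribe(String topic, Consumer<E> listener) {
        List<Consumer<E>> list =
            listeners.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>());
        list.add(listener);
        return () -> list.remove(listener);
    }

    // Delivers an event to every listener currently subscribed to the topic.
    void publish(String topic, E event) {
        for (Consumer<E> listener : listeners.getOrDefault(topic, Collections.emptyList())) {
            listener.accept(event);
        }
    }
}

final class FeedServerDemo {
    public static void main(String[] args) {
        FeedServer<Double> server = new FeedServer<>();

        // The one listener that takes care of storage.
        List<Double> stored = new ArrayList<>();
        server.subscribe("ACME", stored::add);

        // A real-time listener that does not persist anything.
        Runnable cancel = server.subscribe("ACME", price -> System.out.println("tick " + price));

        server.publish("ACME", 10.5);
        cancel.run();                 // unsubscribe on demand
        server.publish("ACME", 10.6); // only the storing listener sees this one
        System.out.println("stored: " + stored);
    }
}
```

CopyOnWriteArrayList is chosen here so that listeners can be added or removed while an event is being delivered; whether that trade-off suits a high-volume feed would need measuring.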

Re: event streaming data feed?

Posted by Niclas Hedhman <ni...@hedhman.org>.

Sounds about right...

On Wed, Apr 20, 2016 at 10:27 PM, zhuangmz08 <zh...@qq.com> wrote:

> Thank you very much for your long comment.
> I suppose the difference between event sourcing and event streaming might
> be:
> Event sourcing traces how the state changed, so the new state is
> inferred as a consequence. The key phrase here is "tracing the path".
> Event streaming cares only about the new state and pays less attention to
> the chain of changes. The key phrase here is "resulting state".
> Back to the investment use case: we can use event sourcing to store all
> the decision making,
> why we made a decision, how our position changed since yesterday...
> Time series streaming should be used as triggers: we care more about how
> to deal with the trigger
> event than about how the event came about.

-- 
Niclas Hedhman, Software Developer
http://zest.apache.org - New Energy for Java

Re: event streaming data feed?

Posted by zhuangmz08 <zh...@qq.com>.

Thank you very much for your long comment.
I suppose the difference between event sourcing and event streaming might be:
Event sourcing traces how the state changed, so the new state is inferred as a consequence. The key phrase here is "tracing the path".
Event streaming cares only about the new state and pays less attention to the chain of changes. The key phrase here is "resulting state".
Back to the investment use case: we can use event sourcing to store all the decision making,
why we made a decision, how our position changed since yesterday...
Time series streaming should be used as triggers: we care more about how to deal with the trigger
event than about how the event came about.
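
The "tracing the path" idea for positions can be sketched as an append-only log that is replayed to rebuild the state at any point. This is a minimal plain-Java illustration and not the library-eventsourcing API; PositionChanged, PositionEventLog and the reason strings are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event: a signed change to a position, plus the reason for the decision.
final class PositionChanged {
    final long sequence;
    final int delta;
    final String reason; // the "why" that an in-place UPDATE would throw away

    PositionChanged(long sequence, int delta, String reason) {
        this.sequence = sequence;
        this.delta = delta;
        this.reason = reason;
    }
}

// Append-only log: events are only ever added, never updated or deleted.
final class PositionEventLog {
    private final List<PositionChanged> events = new ArrayList<>();

    void append(int delta, String reason) {
        events.add(new PositionChanged(events.size() + 1, delta, reason));
    }

    // Rebuilds the position as it was after the event with the given sequence number.
    int replayUpTo(long sequence) {
        int position = 0;
        for (PositionChanged event : events) {
            if (event.sequence > sequence) {
                break;
            }
            position += event.delta;
        }
        return position;
    }

    // The current position is simply the replay of the whole log.
    int current() {
        return replayUpTo(events.size());
    }
}

final class PositionReplayDemo {
    public static void main(String[] args) {
        PositionEventLog log = new PositionEventLog();
        log.append(100, "initial buy");
        log.append(-40, "took profit");
        System.out.println(log.replayUpTo(1)); // position after the first event
        System.out.println(log.current());     // position after all events
    }
}
```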

Re: event streaming data feed?

Posted by Niclas Hedhman <ni...@hedhman.org>.

EventSourcing is not meant for timeseries data, but to replace "state
snapshotting", so that one can trace all the transitions from one entity
state to another. Classic examples involve capturing the "Delete Book from
Shopping Cart". Why was the book deleted? Was it the wrong title? Was the total
price too high and something had to be taken out? Was it because a book
review was just read? And so on. Or, the address changed. Why? What the
event sourcing proponents (myself included) stress is that any "UPDATE" or
"DELETE" in a database is removing information. And instead of snapping the
"current new state" we should capture all events regarding an entity,
because we don't know what we could use that for later.

One reason that event sourcing is working reasonably well in the real
world, is because most entities either have a relatively short lifespan, or
very slow update cycles. And in the few cases where that does not hold,
injecting the "current state" into the event stream works as a caching
solution.
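
A minimal sketch of that caching idea, assuming a state that is a single running total: every few events a snapshot of the full state is appended to the same log, so a rebuild only has to read back to the latest snapshot. LogEntry, SnapshottingLog and the snapshot interval are hypothetical and not part of any Zest library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical log entry: either a delta event or an injected snapshot of the full state.
final class LogEntry {
    final boolean isSnapshot;
    final long value; // the delta for an event, the full state for a snapshot

    private LogEntry(boolean isSnapshot, long value) {
        this.isSnapshot = isSnapshot;
        this.value = value;
    }

    static LogEntry delta(long delta) {
        return new LogEntry(false, delta);
    }

    static LogEntry snapshot(long state) {
        return new LogEntry(true, state);
    }
}

final class SnapshottingLog {
    private final List<LogEntry> entries = new ArrayList<>();
    private final int snapshotEvery;
    private int deltasSinceSnapshot = 0;
    private long state = 0;

    SnapshottingLog(int snapshotEvery) {
        this.snapshotEvery = snapshotEvery;
    }

    // Appends a delta event and, every snapshotEvery deltas, injects the current state.
    void append(long delta) {
        entries.add(LogEntry.delta(delta));
        state += delta;
        if (++deltasSinceSnapshot == snapshotEvery) {
            entries.add(LogEntry.snapshot(state));
            deltasSinceSnapshot = 0;
        }
    }

    // Rebuilds the state from the latest snapshot (if any) plus the deltas after it.
    long rebuild() {
        int start = entries.size() - 1;
        while (start >= 0 && !entries.get(start).isSnapshot) {
            start--;
        }
        long rebuilt = start >= 0 ? entries.get(start).value : 0;
        for (int i = start + 1; i < entries.size(); i++) {
            rebuilt += entries.get(i).value;
        }
        return rebuilt;
    }

    int size() {
        return entries.size();
    }
}

final class SnapshottingLogDemo {
    public static void main(String[] args) {
        SnapshottingLog log = new SnapshottingLog(3);
        for (long delta = 1; delta <= 7; delta++) {
            log.append(delta);
        }
        System.out.println(log.rebuild());
    }
}
```

The full history stays in the log; the snapshots only shorten the replay.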

It may be that library-eventsourcing is heavily related to
library-cqrs. I know a little bit more about that library, enough to know
that when one calls a Command method, the arguments are validated, then an
internal Event is emitted and routed to the Event Method and the event can
be captured, and replayed back later. I am not sure whether either depends
on the other, or whether they are completely independent.
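
The command/event flow described above can be sketched in plain Java. This is not the library-cqrs API, which is not shown in this thread; ShoppingCart, ItemRemoved and the method names are hypothetical. The command method validates and emits, the event method is the only place that changes state, and replaying captured events goes through that same event method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event emitted when a removal command is accepted.
final class ItemRemoved {
    final String title;
    final String reason;

    ItemRemoved(String title, String reason) {
        this.title = title;
        this.reason = reason;
    }
}

final class ShoppingCart {
    private final List<String> titles = new ArrayList<>();
    private final List<ItemRemoved> captured; // sink where emitted events are recorded

    ShoppingCart(List<String> initialTitles, List<ItemRemoved> eventSink) {
        this.titles.addAll(initialTitles);
        this.captured = eventSink;
    }

    // Command method: validates the arguments, then emits an event.
    void removeItem(String title, String reason) {
        if (!titles.contains(title)) {
            throw new IllegalArgumentException("Not in cart: " + title);
        }
        ItemRemoved event = new ItemRemoved(title, reason);
        captured.add(event);   // capture the event for later replay
        onItemRemoved(event);  // route it to the event method
    }

    // Event method: the only place where the state actually changes.
    void onItemRemoved(ItemRemoved event) {
        titles.remove(event.title);
    }

    List<String> titles() {
        return new ArrayList<>(titles);
    }
}

final class ShoppingCartDemo {
    public static void main(String[] args) {
        List<ItemRemoved> captured = new ArrayList<>();
        List<String> initial = new ArrayList<>();
        initial.add("Book A");
        initial.add("Book B");

        ShoppingCart live = new ShoppingCart(initial, captured);
        live.removeItem("Book B", "total price too high");

        // Replay: apply the captured events to a fresh cart with the same starting state.
        ShoppingCart replayed = new ShoppingCart(initial, new ArrayList<>());
        for (ItemRemoved event : captured) {
            replayed.onItemRemoved(event);
        }
        System.out.println(replayed.titles().equals(live.titles()));
    }
}
```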

For timeseries data, the situation is different. Each message is effectively
a new snapshot. The snapshots are almost always complete, the stream is
open-ended, and it is not prone to data corruption if individual events are
lost every now and then.
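
A minimal sketch of why lost messages do little harm in this model: since every tick carries the complete latest value, a consumer can simply keep the newest tick per symbol. Tick, LatestPriceCache and the millisecond timestamps are hypothetical names for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tick: a complete snapshot of the latest price for one symbol.
final class Tick {
    final String symbol;
    final double price;
    final long timestampMillis;

    Tick(String symbol, double price, long timestampMillis) {
        this.symbol = symbol;
        this.price = price;
        this.timestampMillis = timestampMillis;
    }
}

final class LatestPriceCache {
    private final Map<String, Tick> bySymbol = new ConcurrentHashMap<>();

    // Keeps the newest tick per symbol; a late, older tick is ignored.
    void onTick(Tick tick) {
        bySymbol.merge(tick.symbol, tick, (old, incoming) ->
            incoming.timestampMillis >= old.timestampMillis ? incoming : old);
    }

    // Returns the newest tick seen for the symbol, or null if none has arrived yet.
    Tick latest(String symbol) {
        return bySymbol.get(symbol);
    }
}

final class LatestPriceDemo {
    public static void main(String[] args) {
        LatestPriceCache cache = new LatestPriceCache();
        cache.onTick(new Tick("ACME", 10.0, 1L));
        // The tick with timestamp 2 is lost in transit; nothing needs repairing.
        cache.onTick(new Tick("ACME", 12.0, 3L));
        System.out.println(cache.latest("ACME").price);
    }
}
```

An event-sourced log could not skip an event like this, because each event there is a change that later state depends on.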

Timeseries support is therefore orthogonal to Event Sourcing support, and
both are orthogonal to Event/Messaging support (think "event processing",
"JMS", and such). All three are candidates for 3.x, but still very much not
designed yet.

As for your case, I happen to work for an investment bank, so I know
exactly how big the quotes and trades streams become, especially for
heavily traded stocks and currencies. I think we peak at 7 million messages
per second, across all exchange feeds in the world.
Zest has a relatively high overhead, especially for creating composites
(compared to "new"), so I would recommend some perf testing before
embarking on serious dev work.

1. I am not sure I understand the question. But EntityStores have a built-in
"state change" event, which is used by the Indexing subsystems at the
moment. That event is triggered after a successful UnitOfWork.complete().
It might be what you are asking for.

2. I don't think JDBM is suitable for very large data stores. I would
recommend reusing the SPI that it implements and connecting it to Cassandra,
HBase or a similar big data storage solution.

3. I don't know. I think I told you earlier that I have no knowledge
of library-eventsourcing. Don't be afraid to contribute docs, tests or
code changes.

Cheers
Niclas

-- 
Niclas Hedhman, Software Developer
http://zest.apache.org - New Energy for Java