Posted to user@storm.apache.org by Sa Li <sa...@gmail.com> on 2015/02/03 00:43:51 UTC

data lost by trident topology

Hi, All

I am testing my Trident code, using OpaqueTridentKafkaSpout to receive data
from Kafka and then write everything we receive into a Postgres DB. None of the
tests use live streaming data; I am reading from existing topics in Kafka. Here
is what I see: if I send a small number of messages, say 1000, no data is lost
and I can see all 1000 rows in the DB. However, if I increase the number of
messages in that topic to, for example, 200000, I see about a 3.7% data drop.

Which part of the topology might lead to data loss: latency, batch size, etc.?
I am using a Trident state to populate the DB.
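
For reference, the topology is shaped roughly like this (a minimal sketch, not
my actual code; PostgresStateFactory and PostgresUpdater are placeholder names
for my state factory and updater, and the Zookeeper host and topic name are
made up):

import backtype.storm.generated.StormTopology;
import backtype.storm.tuple.Fields;
import storm.kafka.ZkHosts;
import storm.kafka.trident.OpaqueTridentKafkaSpout;
import storm.kafka.trident.TridentKafkaConfig;
import storm.trident.TridentTopology;

public class KafkaToPostgresTopology {
    public static StormTopology build() {
        // Point the spout at the existing topic; the default raw scheme
        // emits a single field called "bytes".
        TridentKafkaConfig spoutConf =
                new TridentKafkaConfig(new ZkHosts("zkhost:2181"), "my-topic");
        OpaqueTridentKafkaSpout spout = new OpaqueTridentKafkaSpout(spoutConf);

        TridentTopology topology = new TridentTopology();
        topology.newStream("kafka-spout", spout)
                // Each successful batch is handed to the state updater once;
                // a replayed batch comes back under the same transaction id.
                .partitionPersist(new PostgresStateFactory(),  // placeholder factory
                                  new Fields("bytes"),
                                  new PostgresUpdater());      // placeholder updater
        return topology.build();
    }
}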

thanks


AL

Re: data lost by trident topology

Posted by Harsha <st...@harsha.io>.

How are you calling commits on your DB? Did you test it multiple times, and is
the data drop always 3.7%? Is there any chance that your bolt wrote to the DB
successfully but never called commit, so you just aren't seeing the data?
-Harsha
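
For example, with a hand-rolled JDBC State the updater can insert every row and
the batch can still look empty from outside, because nothing commits the
transaction. A rough sketch of what I mean (class and method names here are
illustrative, not taken from your code):

import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

import backtype.storm.topology.FailedException;
import storm.trident.state.State;
import storm.trident.tuple.TridentTuple;

public class PostgresState implements State {
    private final Connection conn;   // assumes setAutoCommit(false) was done elsewhere

    public PostgresState(Connection conn) { this.conn = conn; }

    @Override
    public void beginCommit(Long txid) {
        // Trident calls this at the start of every batch.
    }

    @Override
    public void commit(Long txid) {
        try {
            // Without this call the inserts from write() never become
            // visible, even though the batch was reported as successful.
            conn.commit();
        } catch (SQLException e) {
            // Failing the batch makes Trident replay it instead of dropping it.
            throw new FailedException(e);
        }
    }

    public void write(List<TridentTuple> tuples) {
        // INSERT the tuples here with a PreparedStatement / addBatch(),
        // on the same connection so commit(txid) covers them.
    }
}

Also check whether the connection has autoCommit left at its default (true); in
that case every INSERT commits on its own and a missing commit() would not
explain the drop.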
