Posted to user@storm.apache.org by Sa Li <sa...@gmail.com> on 2014/10/28 20:10:13 UTC
replace insert by copy command in trident postgresql state
Hi all,
I have developed a Trident KafkaSpout to consume JSON data from Kafka,
and persistentAggregate does the writing. See:
topology.newStream("topictestspout", kafkaSpout)
    .each(new Fields("str"),
          new JsonObjectParse(),
          new Fields("userid", "event"))
    .groupBy(new Fields("userid"))
    .persistentAggregate(PostgresqlState.newFactory(config),
                         new Fields("userid", "event"),
                         new EventUpdater(),
                         new Fields("eventword"))
    .parallelismHint(16);
Basically, we want to write the data from Kafka into PostgreSQL. The current
writer does this job, but I suspect it is slow, since it performs a
multi-row insert within a transaction. In the multiPut of PostgresqlState,
the query looks something like this:
query =
    WITH new_values (userid, event) AS (VALUES (?,?), (?,?), (?,?), .....)
    INSERT INTO test.state (userid, event)
    SELECT userid, event FROM new_values
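For reference, here is a rough sketch of how a statement of that shape could be assembled for a batch of N rows. The class and method names (MultiRowInsertSql, build) are made up for illustration and are not the actual PostgresqlState code; only the table and column names come from the query above.

```java
import java.util.StringJoiner;

// Hypothetical helper, not part of PostgresqlState: builds the multi-row
// statement text with one "(?,?)" placeholder pair per row in the batch.
final class MultiRowInsertSql {

    static String build(int rowCount) {
        if (rowCount < 1) {
            throw new IllegalArgumentException("rowCount must be >= 1");
        }
        // One placeholder group per (userid, event) row.
        StringJoiner values = new StringJoiner(", ");
        for (int i = 0; i < rowCount; i++) {
            values.add("(?,?)");
        }
        return "WITH new_values (userid, event) AS (VALUES " + values + ") "
             + "INSERT INTO test.state (userid, event) "
             + "SELECT userid, event FROM new_values";
    }
}
```

Every row in the batch then needs two parameters bound on the prepared statement, so the statement text and the number of bind calls both grow with the batch size.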
I think these inserts are too slow, and the COPY command should be much
faster. Does anyone have experience turning the stream data into an
in-memory STDIN stream and then bulk-copying it directly into the
PostgreSQL table?
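To make the question concrete, here is a minimal sketch of what I have in mind, assuming each row is a (userid, event) pair of strings. The class CopyBuffer and its methods are hypothetical names of mine. It only builds the in-memory payload in COPY's default text format (tab-separated columns, one row per line, \N for NULL); the driver call is left in a comment because it needs the PostgreSQL JDBC driver on the classpath.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.List;

// Hypothetical helper: turns a batch of rows into COPY text-format input.
final class CopyBuffer {

    // Escapes one value for COPY's text format; null becomes the \N marker.
    static String escape(String value) {
        if (value == null) {
            return "\\N";
        }
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break; // backslash
                case '\t': sb.append("\\t"); break;  // column delimiter
                case '\n': sb.append("\\n"); break;  // row terminator
                case '\r': sb.append("\\r"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Serializes rows of {userid, event} as tab-separated lines.
    static String toCopyText(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            sb.append(escape(row[0])).append('\t')
              .append(escape(row[1])).append('\n');
        }
        return sb.toString();
    }

    // In-memory stand-in for a STDIN file.
    static Reader asReader(List<String[]> rows) {
        return new StringReader(toCopyText(rows));
    }

    // Assumed usage with the PostgreSQL JDBC driver (not compiled here):
    //   CopyManager cm = conn.unwrap(org.postgresql.PGConnection.class).getCopyAPI();
    //   cm.copyIn("COPY test.state (userid, event) FROM STDIN", CopyBuffer.asReader(rows));
}
```

Note that COPY only appends rows, so this would replace the plain insert shown above but would not cover any update-on-conflict logic.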
Thanks
Alec