You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@rocketmq.apache.org by "Yu Kaiyuan (JIRA)" <ji...@apache.org> on 2017/08/23 07:55:00 UTC
[jira] [Updated] (ROCKETMQ-272) The config `syncFlushTimeout`
doesn't work for SYNC_MASTER
[ https://issues.apache.org/jira/browse/ROCKETMQ-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yu Kaiyuan updated ROCKETMQ-272:
--------------------------------
Description:
It's quite frequent to get result as `sendStatus=FLUSH_SLAVE_TIMEOUT` when sending big messages(>500k) in SYNC_MASTER/SLAVE scenario.
The timeout value used by the sync process currently as I found, is the config `syncFlushTimeout`. And its default value is 5000 milliseconds.
But it shows that producer get the result as `FLUSH_SLAVE_TIMEOUT` less than 1 second.
So why does the config not work as expected?
Relevant code:
{code:java}
// CommitLog.java
public void handleHA(AppendMessageResult result, PutMessageResult putMessageResult, MessageExt messageExt) {
if (BrokerRole.SYNC_MASTER == this.defaultMessageStore.getMessageStoreConfig().getBrokerRole()) {
HAService service = this.defaultMessageStore.getHaService();
if (messageExt.isWaitStoreMsgOK()) {
// Determine whether to wait
if (service.isSlaveOK(result.getWroteOffset() + result.getWroteBytes())) {
GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
service.putRequest(request);
service.getWaitNotifyObject().wakeupAll();
boolean flushOK =
request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
if (!flushOK) {
log.error("do sync transfer other node, wait return, but failed, topic: " + messageExt.getTopic() + " tags: "
+ messageExt.getTags() + " client address: " + messageExt.getBornHostNameString());
putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_SLAVE_TIMEOUT);
}
}
// Slave problem
else {
// Tell the producer, slave not available
putMessageResult.setPutMessageStatus(PutMessageStatus.SLAVE_NOT_AVAILABLE);
}
}
}
}
{code}
was:
It's quite frequent to get result as `sendStatus=FLUSH_SLAVE_TIMEOUT` when sending big messages(>500k) in SYNC_MASTER/SLAVE scenario.
The timeout value used by the sync process currently as I found, is the config `syncFlushTimeout`. And its default value is 5000 milliseconds.
But it shows that producer get the result as `FLUSH_SLAVE_TIMEOUT` less than 1 second.
So why does the config not work as expected?
Relevant code:
{code:java}
// CommitLog.java
public PutMessageResult putMessage(final MessageExtBrokerInner msg) {
{
// Synchronous write double
if (BrokerRole.SYNC_MASTER == this.defaultMessageStore.getMessageStoreConfig().getBrokerRole()) {
HAService service = this.defaultMessageStore.getHaService();
if (msg.isWaitStoreMsgOK()) {
// Determine whether to wait
if (service.isSlaveOK(result.getWroteOffset() + result.getWroteBytes())) {
if (null == request) {
request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
}
service.putRequest(request);
service.getWaitNotifyObject().wakeupAll();
boolean flushOK =
// TODO
request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
if (!flushOK) {
log.error("do sync transfer other node, wait return, but failed, topic: " + msg.getTopic() + " tags: "
+ msg.getTags() + " client address: " + msg.getBornHostString());
putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_SLAVE_TIMEOUT);
}
}
// Slave problem
else {
// Tell the producer, slave not available
putMessageResult.setPutMessageStatus(PutMessageStatus.SLAVE_NOT_AVAILABLE);
}
}
}
return putMessageResult;
}
{code}
> The config `syncFlushTimeout` doesn't work for SYNC_MASTER
> ----------------------------------------------------------
>
> Key: ROCKETMQ-272
> URL: https://issues.apache.org/jira/browse/ROCKETMQ-272
> Project: Apache RocketMQ
> Issue Type: Bug
> Components: rocketmq-broker
> Affects Versions: 4.1.0-incubating
> Reporter: Yu Kaiyuan
> Assignee: yukon
>
> It's quite frequent to get result as `sendStatus=FLUSH_SLAVE_TIMEOUT` when sending big messages(>500k) in SYNC_MASTER/SLAVE scenario.
> The timeout value used by the sync process currently as I found, is the config `syncFlushTimeout`. And its default value is 5000 milliseconds.
> But it shows that producer get the result as `FLUSH_SLAVE_TIMEOUT` less than 1 second.
> So why does the config not work as expected?
> Relevant code:
> {code:java}
> // CommitLog.java
> public void handleHA(AppendMessageResult result, PutMessageResult putMessageResult, MessageExt messageExt) {
> if (BrokerRole.SYNC_MASTER == this.defaultMessageStore.getMessageStoreConfig().getBrokerRole()) {
> HAService service = this.defaultMessageStore.getHaService();
> if (messageExt.isWaitStoreMsgOK()) {
> // Determine whether to wait
> if (service.isSlaveOK(result.getWroteOffset() + result.getWroteBytes())) {
> GroupCommitRequest request = new GroupCommitRequest(result.getWroteOffset() + result.getWroteBytes());
> service.putRequest(request);
> service.getWaitNotifyObject().wakeupAll();
> boolean flushOK =
> request.waitForFlush(this.defaultMessageStore.getMessageStoreConfig().getSyncFlushTimeout());
> if (!flushOK) {
> log.error("do sync transfer other node, wait return, but failed, topic: " + messageExt.getTopic() + " tags: "
> + messageExt.getTags() + " client address: " + messageExt.getBornHostNameString());
> putMessageResult.setPutMessageStatus(PutMessageStatus.FLUSH_SLAVE_TIMEOUT);
> }
> }
> // Slave problem
> else {
> // Tell the producer, slave not available
> putMessageResult.setPutMessageStatus(PutMessageStatus.SLAVE_NOT_AVAILABLE);
> }
> }
> }
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)