From dcb3650d6305483c002477804c57162c466ce397 Mon Sep 17 00:00:00 2001 From: unknown Date: Wed, 6 Nov 2013 14:51:06 +0100 Subject: [PATCH] MDEV-4506: Parallel replication MDEV-5217: Incorrect MyISAM event execution order causing incorrect parallel replication In parallel replication, if transactions A,B group-commit together on the master, we can execute them in parallel on a replication slave. But then, if transaction C follows on the master, on the slave, we need to be sure that both A and B have completed before starting on C to be sure to avoid conflicts. The necessary wait is implemented such that B waits for A to commit before it commits itself (thus preserving commit order). And C waits for B to commit before it itself can start executing. This way C does not start until both A and B have completed. The wait for B's commit on A happens inside the commit processing. However, in the case of MyISAM with no binlog enabled on the slave, it appears that no commit processing takes place (since MyISAM is non-transactional), and thus the wait of B for A was not done. This allowed C to start before A, which can lead to conflicts and incorrect replication. Fixed by doing an extra wait for A at the end of B before signalling C. --- sql/rpl_parallel.cc | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc index 4725834eae3..8a5ebf983b9 100644 --- a/sql/rpl_parallel.cc +++ b/sql/rpl_parallel.cc @@ -127,10 +127,28 @@ finish_event_group(THD *thd, int err, uint64 sub_id, /* Remove any left-over registration to wait for a prior commit to complete. Normally, such wait would already have been removed at - this point by wait_for_prior_commit(), but eg. in error case we - might have skipped waiting, so we would need to remove it explicitly. + this point by wait_for_prior_commit() called from within COMMIT + processing. However, in case of MyISAM and no binlog, we might not + have any commit processing, and so we need to do the wait here, + before waking up any subsequent commits, to preserve correct + order of event execution. Also, in the error case we might have + skipped waiting and thus need to remove it explicitly. + + It is important in the non-error case to do a wait, not just an + unregister. Because we might be last in a group-commit that is + replicated in parallel, and the following event will then wait + for us to complete and rely on this also ensuring that any other + event in the group has completed. + + But in the error case, we have to abort anyway, and it seems best + to just complete as quickly as possible with unregister. Anyone + waiting for us will in any case receive the error back from their + wait_for_prior_commit() call. */ - wfc->unregister_wait_for_prior_commit(); + if (err) + wfc->unregister_wait_for_prior_commit(); + else + wfc->wait_for_prior_commit(); thd->wait_for_commit_ptr= NULL; /*