[Commits] a408291: MDEV-7882: Excessive transaction retry in parallel replication

Kristian Nielsen knielsen at knielsen-hq.org
Mon Mar 30 15:17:11 EEST 2015


revision-id: a4082918c8942b4c72b8f2a65cb237aeaaf10b3e
parent(s): fb71449b10100e9a0f887b1585000fbfab294f3c
committer: Kristian Nielsen
branch nick: mariadb
timestamp: 2015-03-30 14:16:57 +0200
message:

MDEV-7882: Excessive transaction retry in parallel replication

When a transaction in parallel replication needs to retry (e.g. because of a
deadlock kill), first wait for all prior transactions to commit before doing
the retry. This way, we avoid the retry once again conflicting with a prior
transaction, which would require yet another retry.

Without this patch, we saw "in the wild" that transactions had to be retried
more than 10 times to succeed, which exceeds the default
--slave_transaction_retries value and is in any case undesirable.

(We already do this in 10.1 in "optimistic" parallel replication mode; this
patch just makes the code use the same logic for "conservative" mode, which
is the only mode in 10.0.)

---
 sql/rpl_parallel.cc | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/sql/rpl_parallel.cc b/sql/rpl_parallel.cc
index 46c3e4a..3fbc529 100644
--- a/sql/rpl_parallel.cc
+++ b/sql/rpl_parallel.cc
@@ -372,9 +372,34 @@ do_retry:
   statistic_increment(slave_retried_transactions, LOCK_status);
   mysql_mutex_unlock(&rli->data_lock);
 
-  mysql_mutex_lock(&entry->LOCK_parallel_entry);
-  register_wait_for_prior_event_group_commit(rgi, entry);
-  mysql_mutex_unlock(&entry->LOCK_parallel_entry);
+  for (;;)
+  {
+    mysql_mutex_lock(&entry->LOCK_parallel_entry);
+    register_wait_for_prior_event_group_commit(rgi, entry);
+    mysql_mutex_unlock(&entry->LOCK_parallel_entry);
+
+    /*
+      Let us wait for all prior transactions to complete before trying again.
+      This way, we avoid repeatedly conflicting with and getting deadlock
+      killed by the same earlier transaction.
+    */
+    if (!(err= thd->wait_for_prior_commit()))
+      break;
+
+    convert_kill_to_deadlock_error(rgi);
+    if (!has_temporary_error(thd))
+      goto err;
+    /*
+      If we get a temporary error such as a deadlock kill, we can safely
+      ignore it, as we already rolled back.
+
+      But we still want to retry the wait for the prior transaction to
+      complete its commit.
+    */
+    thd->clear_error();
+    if(thd->wait_for_commit_ptr)
+      thd->wait_for_commit_ptr->unregister_wait_for_prior_commit();
+  }
 
   strmake_buf(log_name, ir->name);
   if ((fd= open_binlog(&rlog, log_name, &errmsg)) <0)
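
For readers who want to see the control flow of the new loop in isolation,
here is a minimal standalone sketch. It is not the server code: WaitResult,
RetryWaitStats, wait_for_prior_commits_before_retry() and scripted_wait() are
hypothetical names invented for illustration, and the callback stands in for
thd->wait_for_prior_commit() combined with the has_temporary_error(thd) check.
Locking, kill-to-deadlock conversion and the real THD state are left out, and
the unregister step is counted unconditionally rather than guarded by
wait_for_commit_ptr.

```cpp
#include <cassert>
#include <functional>

// Hypothetical outcome of one wait; stands in for the err value plus the
// has_temporary_error(thd) classification used in the patch.
enum class WaitResult { Ok, TemporaryError, PermanentError };

// Hypothetical bookkeeping so the loop's behaviour can be observed.
struct RetryWaitStats
{
  int registrations= 0;    // models register_wait_for_prior_event_group_commit()
  int unregistrations= 0;  // models unregister_wait_for_prior_commit()
  bool ok= false;          // true if all prior transactions committed
};

// Models the for (;;) loop: keep waiting for prior commits, ignoring
// temporary errors (we have already rolled back), until the wait succeeds
// or a permanent error is seen.
inline RetryWaitStats wait_for_prior_commits_before_retry(
    const std::function<WaitResult()> &wait_for_prior_commit)
{
  assert(static_cast<bool>(wait_for_prior_commit));
  RetryWaitStats stats;
  for (;;)
  {
    ++stats.registrations;           // (re-)register as a waiter
    WaitResult res= wait_for_prior_commit();
    if (res == WaitResult::Ok)
    {
      stats.ok= true;                // safe to start the retry now
      break;
    }
    if (res == WaitResult::PermanentError)
      break;                         // corresponds to "goto err"
    // Temporary error such as a deadlock kill: drop the stale
    // registration and go around again to wait for the prior commit.
    ++stats.unregistrations;
  }
  return stats;
}

// Hypothetical test helper: fails temporarily a given number of times,
// then keeps returning final_result.
inline std::function<WaitResult()> scripted_wait(int temporary_failures,
                                                 WaitResult final_result)
{
  return [temporary_failures, final_result]() mutable -> WaitResult {
    if (temporary_failures > 0)
    {
      --temporary_failures;
      return WaitResult::TemporaryError;
    }
    return final_result;
  };
}
```

For example, wait_for_prior_commits_before_retry(scripted_wait(2,
WaitResult::Ok)) registers three times and unregisters twice before reporting
success, while a scripted permanent error ends the loop with ok left false.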

