[Commits] 53d0e921794: MDEV-11094: Blackhole table updates on slave fail when row annotation is enabled

sujatha sujatha.sivakumar at mariadb.com
Thu May 23 11:56:58 EEST 2019


Hello Andrei,

Good morning.

Thank you for the review comments. Please find my replies inline.

On 22/05/19 10:22 PM, andrei.elkin at pp.inet.fi wrote:
> Sujatha, salute!
>
> Thanks for a detailed description! I would add up to it a minor piece
>
>> revision-id: 53d0e9217943719b806ef9fa1cac0a699df4839c (mariadb-10.1.39-31-g53d0e921794)
>> parent(s): 47637a3dd13d19e897a7cbfd1499f1bf3b2fdb2a
>> author: Sujatha
>> committer: Sujatha
>> timestamp: 2019-05-17 13:11:49 +0530
>> message:
>>
>> MDEV-11094: Blackhole table updates on slave fail when row annotation is enabled
>>
>> Problem:
>> =======
>> rpl_blackhole.test fails when executed with following options
>> mysqld=--binlog_annotate_row_events=1, mysqld=--replicate_annotate_row_events=1
>>
>> Test output:
>> ------------
>> worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
>> rpl.rpl_blackhole_bug 'mix'              [ pass ]    791
>> rpl.rpl_blackhole_bug 'row'              [ fail ]
>> Replicate_Wild_Ignore_Table
>> Last_Errno	1032
>> Last_Error	Could not execute Update_rows_v1 event on table test.t1; Can't find
>> record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's
>> master log master-bin.000001, end_log_pos 1510
>>
>> Analysis:
>> =========
>> Enabling "replicate_annotate_row_events" on slave, Tells the slave to write
>> annotate rows events received from the master to its own binary log. The
>> received annotate events are applied after the Gtid event as shown below.
>> thd->query() will be set to the actual query received from the master, through
>> annotate event. Annotate_rows event should not be deleted after the event is
>> applied as the thd->query will be used to generate new Annotate_rows event
>> during applying the subsequent Rows events.
> [ here ]
>    The Table_map and Rows_log_event that follow do not touch thd->query().

The Table_map and Rows_log_event that follow "Annotate_rows" event do 
make use of "thd->query()".

Please check the following path.

#0  write_locked_table_maps (thd=0x7fff98001910) at 
/home/sujatha/bug_repo/MDEV-11094-10.1/sql/handler.cc:5735
#1  0x0000555555c7cdb1 in binlog_log_row (table=0x7fff9801c7e0, 
before_record=0x0,
     after_record=0x7fff9801c028 "\376aUnsafe statement written to the 
binary log using statement format since BINLOG_FORMAT = STATEMENT",
     log_func=0x555555c7f8f7 
<Write_rows_log_event::binlog_row_logging_function(THD*, TABLE*, bool, 
unsigned char const*, unsigned char const*)>)
     at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/handler.cc:5842
#2  0x0000555555c7d53d in handler::ha_write_row (this=0x7fff9801d428,
     buf=0x7fff9801c028 "\376aUnsafe statement written to the binary log 
using statement format since BINLOG_FORMAT = STATEMENT")
     at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/handler.cc:5985
#3  0x0000555555d79397 in Rows_log_event::write_row 
(this=0x7fff98018c40, rgi=0x7fff98000cf0, overwrite=false)
     at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.cc:11854
#4  0x0000555555d79d85 in Write_rows_log_event::do_exec_row 
(this=0x7fff98018c40, rgi=0x7fff98000cf0) at 
/home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.cc:12038
#5  0x0000555555d759aa in Rows_log_event::do_apply_event 
(this=0x7fff98018c40, rgi=0x7fff98000cf0) at 
/home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.cc:10191
#6  0x0000555555996a89 in Log_event::apply_event (this=0x7fff98018c40, 
rgi=0x7fff98000cf0) at 
/home/sujatha/bug_repo/MDEV-11094-10.1/sql/log_event.h:1343
#7  0x000055555598c21d in apply_event_and_update_pos_apply 
(ev=0x7fff98018c40, thd=0x7fff98001910, rgi=0x7fff98000cf0, reason=0)
     at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:3479
#8  0x000055555598c6a6 in apply_event_and_update_pos (ev=0x7fff98018c40, 
thd=0x7fff98001910, rgi=0x7fff98000cf0)
     at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:3600
#9  0x000055555598ceaa in exec_relay_log_event (thd=0x7fff98001910, 
rli=0x555557c18f20, serial_rgi=0x7fff98000cf0)
     at /home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:3885
#10 0x00005555559900bd in handle_slave_sql (arg=0x555557c17250) at 
/home/sujatha/bug_repo/MDEV-11094-10.1/sql/slave.cc:4981
#11 0x0000555555f97ed1 in pfs_spawn_thread (arg=0x7fffac02e680) at 
/home/sujatha/bug_repo/MDEV-11094-10.1/storage/perfschema/pfs.cc:1861
#12 0x00007ffff75646db in start_thread (arg=0x7fffe8681700) at 
pthread_create.c:463
#13 0x00007ffff6b6688f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

static int write_locked_table_maps(THD *thd)
{
  ...

   if (thd->get_binlog_table_maps() == 0)
   {
     MYSQL_LOCK *locks[2];
     locks[0]= thd->extra_lock;
     locks[1]= thd->lock;
     my_bool with_annotate= thd->variables.binlog_annotate_row_events &&
                            thd->query() && thd->query_length();

On slave upon receipt of annotate event 
"thd->variables.binlog_annotate_row_events" is set within

"set_annotate_event(Annotate_rows_log_event *event)" call.  This flag 
will be used by the following rows events

like 'Write_rows", "Update_rows" and "Delete_rows" events to know that 
there is a replicated annotate event

available.  All rows events will have a "Table_map_log_event" and it is 
written by invoking "write_locked_table_maps()".

Within "write_locked_table_maps" first step is to check for replicated 
annotate event flag, and if is set "thd->query()"

is used to generate a new annotate event and it is written to the 
slave's binary log.

IMHO the recommended suggestion of resetting the query to NULL, within 
"Rows_log_event::do_apply_event" will not work.

Since the query is cleared within "Rows_log_event::do_apply_event" prior 
to the "do_exec_row" call, at the time of writing

the "Table_map_log_event" the annotation check will fail. So no annotate 
events are written to binlog on the slave.

my_bool with_annotate= thd->variables.binlog_annotate_row_events &&
                            thd->query() && thd->query_length();

The "thd->variables.binlog_annotate_row_events" is cleared within 
"free_annotate_event"  after the last row event

has been applied. Hence "thd->variables.binlog_annotate_row_events" flag 
is a key to identify

Row based replication + replicated annotate row events.

Please let me know your thoughts.

> Notice that the being suggested sentence is sufficient to explain the
> not-NULL without the sentence below (fill free to remove then):
>
>> After the last Rows event has been
>> applied, the saved Annotate_rows event (if any) will be deleted.
>>
>> master-bin.000001 | 457 | Gtid          | BEGIN GTID 0-1-2
>> master-bin.000001 | 495 | Annotate_rows | insert into t values (10)
>> master-bin.000001 | 539 | Table_map     | table_id: 19 (test.t)
>> master-bin.000001 | 579 | Write_rows_v1 | table_id: 19 flags: STMT_END_F
>> master-bin.000001 | 613 | Xid           | COMMIT /* xid=7 */
>>
>> In balckhole engine all the DML operations are noops as they donot store any
>> data. They simply return success without doing any operation. But the existing
>> strictly expects thd->query() to be 'NULL' to identify that row based
>> replication is in use. This assumption will fail when row annotations are
>> enabled as the query is not 'NULL'. Hence various row based operations like
>> 'update', 'delete', 'index lookup' will fail when row annotations are enabled.
>>
>> Fix:
>> ===
>> Extend the row based replication check to include row annotations as well.
>> i.e Either the thd->query() is NULL or thd->query() points to query and row
>> annotations are in use.
> This works, but the check does not really refers to the Annotate. So in
> principle thd->query() may turn in not-NULL of different reasons.
> Besides the check is multiplied over few places (again hopefully
> covering all).
>
> We can do it simpler actually:
>
> Rows_log_event::do_apply_event()
> {
>      ...
>      do
>      {
>        /* in_use can have been set to NULL in close_tables_for_reopen */
>        THD* old_thd= table->in_use;
>        if (!table->in_use)
>          table->in_use= thd;
>
>   --> add a check whether Annotate has been done and when so store/restore
>   --> thd->query() like `old_thd` above/below, and pass it as NULL (thd->reset_query()) to the following:
>    
>        error= do_exec_row(rgi);
>
>        if (error)
>          DBUG_PRINT("info", ("error: %s", HA_ERR(error)));
>        DBUG_ASSERT(error != HA_ERR_RECORD_DELETED);
>
>        table->in_use = old_thd;
>
>
> I also suggest to check out mysqlbinlog run. There seems to be no test
> for that though, so one needs to be written. The goal would be to prove
> a created binlog output is as expected (e.g to include Annotate).


Sure, I will add a test case.

Thank you

S.Sujatha


>
> Cheers,
>
> Andrei
>
>> ---
>>   mysql-test/suite/rpl/t/rpl_blackhole-master.opt |  1 +
>>   mysql-test/suite/rpl/t/rpl_blackhole-slave.opt  |  1 +
>>   storage/blackhole/ha_blackhole.cc               | 28 +++++++++++++++++++------
>>   3 files changed, 24 insertions(+), 6 deletions(-)
>>
>> diff --git a/mysql-test/suite/rpl/t/rpl_blackhole-master.opt b/mysql-test/suite/rpl/t/rpl_blackhole-master.opt
>> new file mode 100644
>> index 00000000000..1ad0b884c60
>> --- /dev/null
>> +++ b/mysql-test/suite/rpl/t/rpl_blackhole-master.opt
>> @@ -0,0 +1 @@
>> +--binlog_annotate_row_events
>> diff --git a/mysql-test/suite/rpl/t/rpl_blackhole-slave.opt b/mysql-test/suite/rpl/t/rpl_blackhole-slave.opt
>> new file mode 100644
>> index 00000000000..7ac6a84faa7
>> --- /dev/null
>> +++ b/mysql-test/suite/rpl/t/rpl_blackhole-slave.opt
>> @@ -0,0 +1 @@
>> +--binlog_annotate_row_events --replicate_annotate_row_events
>> diff --git a/storage/blackhole/ha_blackhole.cc b/storage/blackhole/ha_blackhole.cc
>> index 01aaa9ea15f..43bcdc541a1 100644
>> --- a/storage/blackhole/ha_blackhole.cc
>> +++ b/storage/blackhole/ha_blackhole.cc
>> @@ -25,6 +25,16 @@
>>   #include "ha_blackhole.h"
>>   #include "sql_class.h"                          // THD, SYSTEM_THREAD_SLAVE_SQL
>>   
>> +static bool is_row_based_replication(THD *thd)
>> +{
>> +  /*
>> +    A row event which has its thd->query() == NULL or a row event which has
>> +    replicate_annotate_row_events enabled. In the later case the thd->query()
>> +    will be pointing to the query, received through replicated annotate event
>> +    from master.
>> +  */
>> +  return ((thd->query() == NULL) || thd->variables.binlog_annotate_row_events);
>> +}
>>   /* Static declarations for handlerton */
>>   
>>   static handler *blackhole_create_handler(handlerton *hton,
>> @@ -109,7 +119,8 @@ int ha_blackhole::update_row(const uchar *old_data, uchar *new_data)
>>   {
>>     DBUG_ENTER("ha_blackhole::update_row");
>>     THD *thd= ha_thd();
>> -  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL)
>> +  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL &&
>> +      is_row_based_replication(thd))
>>       DBUG_RETURN(0);
>>     DBUG_RETURN(HA_ERR_WRONG_COMMAND);
>>   }
>> @@ -118,7 +129,8 @@ int ha_blackhole::delete_row(const uchar *buf)
>>   {
>>     DBUG_ENTER("ha_blackhole::delete_row");
>>     THD *thd= ha_thd();
>> -  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL)
>> +  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL &&
>> +      is_row_based_replication(thd))
>>       DBUG_RETURN(0);
>>     DBUG_RETURN(HA_ERR_WRONG_COMMAND);
>>   }
>> @@ -135,7 +147,8 @@ int ha_blackhole::rnd_next(uchar *buf)
>>     int rc;
>>     DBUG_ENTER("ha_blackhole::rnd_next");
>>     THD *thd= ha_thd();
>> -  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL)
>> +  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL &&
>> +      is_row_based_replication(thd))
>>       rc= 0;
>>     else
>>       rc= HA_ERR_END_OF_FILE;
>> @@ -220,7 +233,8 @@ int ha_blackhole::index_read_map(uchar * buf, const uchar * key,
>>     int rc;
>>     DBUG_ENTER("ha_blackhole::index_read");
>>     THD *thd= ha_thd();
>> -  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL)
>> +  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL &&
>> +      is_row_based_replication(thd))
>>       rc= 0;
>>     else
>>       rc= HA_ERR_END_OF_FILE;
>> @@ -235,7 +249,8 @@ int ha_blackhole::index_read_idx_map(uchar * buf, uint idx, const uchar * key,
>>     int rc;
>>     DBUG_ENTER("ha_blackhole::index_read_idx");
>>     THD *thd= ha_thd();
>> -  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL)
>> +  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL &&
>> +      is_row_based_replication(thd))
>>       rc= 0;
>>     else
>>       rc= HA_ERR_END_OF_FILE;
>> @@ -249,7 +264,8 @@ int ha_blackhole::index_read_last_map(uchar * buf, const uchar * key,
>>     int rc;
>>     DBUG_ENTER("ha_blackhole::index_read_last");
>>     THD *thd= ha_thd();
>> -  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL && thd->query() == NULL)
>> +  if (thd->system_thread == SYSTEM_THREAD_SLAVE_SQL &&
>> +      is_row_based_replication(thd))
>>       rc= 0;
>>     else
>>       rc= HA_ERR_END_OF_FILE;
>> _______________________________________________
>> commits mailing list
>> commits at mariadb.org
>> https://lists.askmonty.org/cgi-bin/mailman/listinfo/commits


More information about the commits mailing list