[Commits] Rev 3586: Backport of: olav.sandstaa at oracle.com-20120516074923-vd0dhp183vqcp2ql in file:///data0/psergey/dev2/5.3/

Sergey Petrunya psergey at askmonty.org
Sun Nov 4 21:37:59 EET 2012


At file:///data0/psergey/dev2/5.3/

------------------------------------------------------------
revno: 3586
revision-id: psergey at askmonty.org-20121010052122-1uua5ogs8pyoanbf
parent: sanja at montyprogram.com-20121005092655-1xfcw36gsdyhawu5
committer: Sergey Petrunya <psergey at askmonty.org>
branch nick: 5.3
timestamp: Wed 2012-10-10 09:21:22 +0400
message:
  Backport of: olav.sandstaa at oracle.com-20120516074923-vd0dhp183vqcp2ql
  .. into MariaDB 5.3
  
  Fix for Bug#12667154 SAME QUERY EXEC AS WHERE SUBQ GIVES DIFFERENT
                       RESULTS ON IN() & NOT IN() COMP #3
  
  This bug causes a wrong result in mysql-trunk when ICP is used
  and bad performance in mysql-5.5 and mysql-trunk.
  
  Using the query from bug report to explain what happens and causes
  the wrong result from the query when ICP is enabled:
  
  1. The t3 table contains four records. The outer query will read
     these and for each of these it will execute the subquery.
  
  2. Before the first execution of the subquery it will be optimized. In
     this case the important is what happens to the first table t1:
     -make_join_select() will call the range optimizer which decides
      that t1 should be accessed using a range scan on the k1 index
      It creates a QUICK_RANGE_SELECT object for this.
     -As the last part of optimization the ICP code pushes the
      condition down to the storage engine for table t1 on the k1 index.
  
     This produces the following information in the explain for this table:
  
       2 DEPENDENT SUBQUERY t1 range k1 k1 5 NULL 3 Using index condition; Using filesort
  
     Note the use of filesort.
  
  3. The first execution of the subquery does (among other things) due
     to the need for sorting:
     a. Call create_sort_index() which again will call find_all_keys():
     b. find_all_keys() will read the required keys for all qualifying
        rows from the storage engine. To do this it checks if it has a
        quick-select for the table. It will use the quick-select for
        reading records. In this case it will read four records from the
        storage engine (based on the range criteria). The storage engine
        will evaluate the pushed index condition for each record.
     c. At the end of create_sort_index() there is code that cleans up a
        lot of stuff on the join tab. One of the things that is cleaned
        is the select object. The result of this is that the
        quick-select object created in make_join_select is deleted.
  
  4. The second execution of the subquery does the same as the first but
     the result is different:
     a. Call create_sort_index() which again will call find_all_keys()
        (same as for the first execution)
     b. find_all_keys() will read the keys from the storage engine. To
        do this it checks if it has a quick-select for the table. Now
        there is NO quick-select object(!) (since it was deleted in
        step 3c). So find_all_keys defaults to read the table using a
        table scan instead. So instead of reading the four relevant records
        in the range it reads the entire table (6 records). It then
        evaluates the table's condition (and here it goes wrong). Since
        the entire condition has been pushed down to the storage engine
        using ICP all 6 records qualify. (Note that the storage engine
        will not evaluate the pushed index condition in this case since
        it was pushed for the k1 index and now we do a table scan
        without any index being used).
        The result is that here we return six qualifying key values
        instead of four due to not evaluating the table's condition.
     c. As above.
  
  5. The two last execution of the subquery will also produce wrong results
     for the same reason.
  
  Summary: The problem occurs due to all but the first executions of the
  subquery is done as a table scan without evaluating the table's
  condition (which is pushed to the storage engine on a different
  index). This is caused by the create_sort_index() function deleting
  the quick-select object that should have been used for executing the
  subquery as a range scan.
  
  Note that this bug in addition to causing wrong results also can
  result in bad performance due to executing the subquery using a table
  scan instead of a range scan. This is an issue in MySQL 5.5.
  
  The fix for this problem is to avoid that the Quick-select-object that
  the optimizer created is deleted when create_sort_index() is doing
  clean-up of the join-tab. This will ensure that the quick-select
  object and the corresponding pushed index condition will be available
  and used by all following executions of the subquery.
=== modified file 'mysql-test/r/subselect2.result'
--- a/mysql-test/r/subselect2.result	2011-12-24 16:55:10 +0000
+++ b/mysql-test/r/subselect2.result	2012-10-10 05:21:22 +0000
@@ -179,4 +179,23 @@
 SET optimizer_switch=@tmp_optimizer_switch;
 DROP VIEW v1;
 DROP TABLE t1,t2,t3;
+#
+# MDEV-567: Wrong result from a query with correlated subquery if ICP is allowed
+#
+CREATE TABLE t1 (a int, b int, INDEX idx(a));
+INSERT INTO t1 VALUES (9,0), (7,1), (1,9), (7,3), (2,1);
+CREATE TABLE t2 (a int, b int, INDEX idx(a));
+INSERT INTO t2 VALUES (2,1), (6,4), (7,6), (9,4);
+CREATE TABLE t3 (a int, b int);
+INSERT INTO t3 VALUES (1,0), (1,1), (1,3);
+SELECT * FROM t3
+WHERE a = (SELECT COUNT(DISTINCT t2.b) FROM t1, t2
+WHERE t1.a = t2.a AND t2.a BETWEEN 7 AND 9
+AND t3.b = t1.b
+GROUP BY t1.b);
+a	b
+1	0
+1	1
+1	3
+DROP TABLE t1, t2, t3;
 set optimizer_switch=@subselect2_test_tmp;

=== modified file 'mysql-test/t/subselect2.test'
--- a/mysql-test/t/subselect2.test	2011-12-13 22:20:47 +0000
+++ b/mysql-test/t/subselect2.test	2012-10-10 05:21:22 +0000
@@ -203,5 +203,24 @@
 DROP VIEW v1;
 DROP TABLE t1,t2,t3;
 
+--echo #
+--echo # MDEV-567: Wrong result from a query with correlated subquery if ICP is allowed
+--echo #
+CREATE TABLE t1 (a int, b int, INDEX idx(a));
+INSERT INTO t1 VALUES (9,0), (7,1), (1,9), (7,3), (2,1);
+
+CREATE TABLE t2 (a int, b int, INDEX idx(a));
+INSERT INTO t2 VALUES (2,1), (6,4), (7,6), (9,4);
+
+CREATE TABLE t3 (a int, b int);
+INSERT INTO t3 VALUES (1,0), (1,1), (1,3);
+
+SELECT * FROM t3
+  WHERE a = (SELECT COUNT(DISTINCT t2.b) FROM t1, t2
+                WHERE t1.a = t2.a AND t2.a BETWEEN 7 AND 9
+                      AND t3.b = t1.b
+              GROUP BY t1.b);
+DROP TABLE t1, t2, t3;
+
 set optimizer_switch=@subselect2_test_tmp;
 

=== modified file 'sql/sql_select.cc'
--- a/sql/sql_select.cc	2012-08-25 06:15:57 +0000
+++ b/sql/sql_select.cc	2012-10-10 05:21:22 +0000
@@ -18687,6 +18687,14 @@
   /* Currently ORDER BY ... LIMIT is not supported in subqueries. */
   DBUG_ASSERT(join->group_list || !join->is_in_subquery());
 
+  /* 
+    If we have a select->quick object that is created outside of
+    create_sort_index() and this is part of a subquery that
+    potentially can be executed multiple times then we should not
+    delete the quick object on exit from this function.
+  */
+  bool keep_quick= select && select->quick && join->join_tab_save;
+
   /*
     When there is SQL_BIG_RESULT do not sort using index for GROUP BY,
     and thus force sorting on disk unless a group min-max optimization
@@ -18738,6 +18746,7 @@
 			    get_quick_select_for_ref(thd, table, &tab->ref, 
                                                      tab->found_records))))
 	goto err;
+      DBUG_ASSERT(!keep_quick);
     }
   }
 
@@ -18769,10 +18778,25 @@
 
     tablesort_result_cache= table->sort.io_cache;
     table->sort.io_cache= NULL;
-
-    select->cleanup();				// filesort did select
-    table->quick_keys.clear_all();  // as far as we cleanup select->quick
-    table->intersect_keys.clear_all();
+    /*
+      If a quick object was created outside of create_sort_index()
+      that might be reused, then do not call select->cleanup() since
+      it will delete the quick object.
+    */
+    if (!keep_quick)
+    {
+      select->cleanup();
+      /*
+        The select object should now be ready for the next use. If it
+        is re-used then there exists a backup copy of this join tab
+        which has the pointer to it. The join tab will be restored in
+        JOIN::reset(). So here we just delete the pointer to it.
+      */
+      tab->select= NULL;
+      // If we deleted the quick select object we need to clear quick_keys
+      table->quick_keys.clear_all();
+    }
+    // Restore the output resultset
     table->sort.io_cache= tablesort_result_cache;
   }
   tab->set_select_cond(NULL, __LINE__);



More information about the commits mailing list