MariaDB-server

Author	SHA1	Message	Date
Sergei Golubchik	07de0ac69e	MDEV-20299 SET SESSION AUTHORIZATION a.k.a. "sudo"	2025-05-03 12:06:36 +02:00
Sergei Golubchik	78d23a3e60	fix error messages: when a definer for SP/view is wrong, it should be ER_MALFORMED_DEFINER, not ER_NO_SUCH_USER; when one uses current_role as a definer or grantee but there's no current role, it should be ER_INVALID_ROLE, not ER_MALFORMED_DEFINER; when a non-existent user is specified, it should be ER_NO_SUCH_USER, which should say "The user does not exist", not "Definer does not exist". Also clarify ER_CANT_CHANGE_TX_CHARACTERISTICS to say what cannot be changed	2025-05-02 13:56:25 +02:00
Sergei Golubchik	02b81afff8	cleanup: THD::change_user	2025-05-02 13:56:25 +02:00
Vasilii Lakhin	40c5b62531	Fix remaining typos	2025-04-29 11:18:00 +10:00
Monty	d9c3b775b8	Comment and indentation improvements	2025-04-28 12:59:39 +03:00
Monty	f8ba5ced55	MDEV-36099 Ensure that creation and usage of temporary tables in replication is predictable MDEV-36563 Assertion `!mysql_bin_log.is_open()' failed in THD::mark_tmp_table_as_free_for_reuse The purpose of this commit is to ensure that creation and changes of temporary tables are properly and predictably logged to the binary log. It also fixes some bugs where ROW logging was used in MIXED mode, when STATEMENT would be a better (and expected) choice. In this comment STATEMENT stands for logging to binary log in STATEMENT format, MIXED stands for MIXED binlog format and ROW for ROW binlog format. New rules for logging of temporary tables - CREATE of temporary tables is now by default binlogged only if STATEMENT binlog format is used. If it is binlogged, 1 is stored in TABLE_SHARE->table_creation_was_logged. The user can change this behavior by setting create_temporary_table_binlog_formats to MIXED,STATEMENT in which case the create is logged in statement format also in MIXED mode (as before). - Changes to temporary tables are binlogged if and only if the CREATE was logged. The logging happens under STATEMENT or MIXED. If binlog_format=ROW, temporary table changes are not binlogged. A temporary table that is changed under ROW is marked as 'not up to date in binlog' and no future row changes are logged. Any usage of this temporary table will force row logging of other tables in any future statements that use it. - DROP TEMPORARY is binlogged only if the CREATE was binlogged. Changes done: - Row logging is forced for any statement using temporary tables that are not up to date in the binary log. (Before, row logging was forced if the user had any temporary table) - If there are any changes to the temporary table that are not binlogged, the table is marked as not up to date. 
- TABLE_SHARE->table_creation_was_logged has a new definition for temporary tables: 0 Table creation was not logged to binary log 1 Table creation was logged to binary log and table is up to date. 2 Table creation was logged to binary log but some changes were not logged to binary log. 'Not up to date in binary log' is defined as value 0 or 2. - If a multi-table-update or multi-table-delete fails then all updated temporary tables are marked as not up to date. - Enforce row logging if the query is using temporary tables that are not up to date. Before, row logging was enforced if the user had any temporary tables. - When dropping temporary tables use IF EXISTS. This ensures that slave will not stop if it had crashed and lost the temporary tables. - Remove comment and version from DROP /*!4000 TEMPORARY.. generated when a connection closes that has open temporary tables. Added 'generated by server' at the end of the DROP. Bugs fixed: - When using temporary tables with commands that forced row based, like INSERT INTO temporary_table VALUES (UUID()), this was never logged, which caused the temporary table to be inconsistent on master and slave. - Used binlog format is now clearly defined. It now depends only on the current binlog_format and the tables used. Before, it depended on whether the user had ANY temporary tables and on the state of 'current_stmt_binlog_format' set by previous queries. This also caused temporary tables to be logged to binary log in some cases. - CREATE TABLE t1 LIKE not_logged_temporary_table caused replication to stop. - Renames of not binlogged temporary tables were binlogged to binary log, which caused replication to stop. Changes in behavior: - By default create_temporary_table_binlog_formats=STATEMENT, which means that CREATE TEMPORARY is not logged to binary log under MIXED binary logging. This can be changed by setting create_temporary_table_binlog_formats to MIXED,STATEMENT. 
- Using temporary tables that were not logged to the binary log will cause any query using them for updating other tables to be logged in ROW format. Before, all queries were logged in ROW format if the user had any temporary tables, even if they were not used by the query. - Generated DROP TEMPORARY TABLE is now always using IF EXISTS and has a "generated by server" comment in the binary log. The consequence of the above is that manipulations of a lot of rows through temporary tables will by default be slower in mixed mode. For example: BEGIN; CREATE TEMPORARY TABLE tmp AS SELECT a, b, c FROM large_table1 JOIN large_table2 ON ...; INSERT INTO other_table SELECT b, c FROM tmp WHERE a <100; DROP TEMPORARY TABLE tmp; COMMIT; By default this will create a huge entry in the binary log, compared to just a few hundred bytes in statement mode. However the change in this commit will make usage of temporary tables more reliable and predictable and is thus worth it. Using statement mode or create_temporary_table_binlog_formats can be used to avoid this issue.	2025-04-28 12:59:38 +03:00
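The three-valued table_creation_was_logged state described in the commit above can be sketched as a small state machine. This is a simplified illustration, not the server code: the names `CreationLogged`, `create_state`, `after_change` and `must_force_row` are hypothetical, and only the ROW case of "a change was not binlogged" is modelled.

```python
from enum import IntEnum


class CreationLogged(IntEnum):
    """Mirrors the 0/1/2 values listed in the commit message."""
    NOT_LOGGED = 0         # CREATE was not written to the binary log
    LOGGED_UP_TO_DATE = 1  # CREATE was logged and the table is up to date
    LOGGED_STALE = 2       # CREATE was logged, but some change was not


def create_state(binlog_format, create_formats=("STATEMENT",)):
    """State of a new temporary table for the given binlog_format.

    create_formats stands in for create_temporary_table_binlog_formats.
    """
    if binlog_format in create_formats:
        return CreationLogged.LOGGED_UP_TO_DATE
    return CreationLogged.NOT_LOGGED


def after_change(state, binlog_format):
    """State after the temporary table itself is modified (assumed rule)."""
    if state == CreationLogged.LOGGED_UP_TO_DATE and binlog_format == "ROW":
        # Changes under ROW are not binlogged, so the logged copy goes stale.
        return CreationLogged.LOGGED_STALE
    return state


def must_force_row(temp_tables_used):
    """Row logging is forced if any used temporary table is not up to date."""
    return any(s != CreationLogged.LOGGED_UP_TO_DATE for s in temp_tables_used)
```

In this sketch a statement that uses no temporary tables never forces row logging, which matches the commit's stated change from "the user has any temporary table" to "the statement uses a table that is not up to date".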
Dmitry Shulga	ecb7c9b692	MDEV-10164: Add support for TRIGGERS that fire on multiple events Added capability to create a trigger associated with several trigger events. For this goal, the syntax of the CREATE TRIGGER statement was extended to support the syntax structure { event [ OR ... ] } for the `trigger_event` clause. Since one trigger will be able to handle several events, a way should be provided to determine what kind of event is handled on execution of a trigger. For this goal support of the clauses INSERTING, UPDATING, DELETING was added by this patch. These clauses can be used inside a trigger body to detect what kind of trigger action is currently processed using the following boilerplate: IF INSERTING THEN ... ELSIF UPDATING THEN ... ELSIF DELETING THEN ... In case one of the clauses INSERTING, UPDATING, DELETING specified in a trigger's body does not match a trigger event type, the error ER_INCOMPATIBLE_EVENT_FLAG is emitted. After this patch is pushed, one Trigger object will be associated with several trigger events. It means that the array Table_triggers_list::triggers can contain several pointers to the same Trigger object in array members corresponding to different events. Moreover, support of several trigger events for the same trigger requires that the data members `next` and `action_order` of the Trigger class be converted to arrays to store the related information on a per-event basis. The ability to specify the same trigger for different event types makes it necessary to handle invalid cases on execution of the multi-event trigger, when the OLD or NEW qualifier doesn't match the event type the trigger is run for. The clause OLD should produce the NULL value for an INSERT event, whereas the clause NEW should produce the NULL value for a DELETE event.	2025-04-19 18:36:03 +07:00
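A rough model of the multi-event trigger behaviour described above: one trigger object is registered under several events, the body receives INSERTING/UPDATING/DELETING style flags, and OLD or NEW is NULL where the event has no such row. All names here (`Trigger`, `register`, `fire`, `audit`) are hypothetical and only illustrate the idea.

```python
EVENTS = ("INSERT", "UPDATE", "DELETE")


class Trigger:
    def __init__(self, name, events, body):
        self.name = name
        self.events = frozenset(events)
        self.body = body


def register(triggers_by_event, trigger):
    # The same Trigger object is stored once per event it handles.
    for event in trigger.events:
        triggers_by_event[event].append(trigger)


def fire(triggers_by_event, event, old_row, new_row):
    # OLD is NULL (None) for INSERT, NEW is NULL (None) for DELETE.
    old = None if event == "INSERT" else old_row
    new = None if event == "DELETE" else new_row
    flags = {
        "inserting": event == "INSERT",
        "updating": event == "UPDATE",
        "deleting": event == "DELETE",
    }
    return [t.body(old, new, **flags) for t in triggers_by_event[event]]


def audit(old, new, inserting, updating, deleting):
    # Counterpart of: IF INSERTING THEN ... ELSIF UPDATING THEN ... ELSIF DELETING THEN ...
    if inserting:
        return ("ins", old, new)
    elif updating:
        return ("upd", old, new)
    return ("del", old, new)


table_triggers = {event: [] for event in EVENTS}
register(table_triggers, Trigger("trg_audit", ["INSERT", "DELETE"], audit))
```

Calling `fire(table_triggers, "INSERT", {"a": 0}, {"a": 1})` passes `None` as OLD to the body, while an UPDATE fires nothing because `trg_audit` was not created for that event.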
Alexander Barkov	f11504af51	MDEV-20034 Add support for the pre-defined weak SYS_REFCURSOR This patch adds support for SYS_REFCURSOR (a weakly typed cursor) for both sql_mode=ORACLE and sql_mode=DEFAULT. Works as a regular stored routine variable, parameter and return value: - can be passed as an IN parameter to stored functions and procedures - can be passed as an INOUT and OUT parameter to stored procedures - can be returned from a stored function Note, strongly typed REF CURSOR will be added separately. Note, to make dependencies easier to maintain, some parts of sql_class.h and item.h were moved to new header files: - select_results.h: class select_result_sink class select_result class select_result_interceptor - sp_cursor.h: class sp_cursor_statistics class sp_cursor - sp_rcontext_handler.h class Sp_rcontext_handler and its descendants The implementation consists of the following parts: - A new class sp_cursor_array deriving from Dynamic_array - A new class Statement_rcontext which contains data shared between sub-statements of a compound statement. It has a member m_statement_cursors of the sp_cursor_array data type, as well as open cursor counter. THD inherits from Statement_rcontext. - A new data type handler Type_handler_sys_refcursor in plugins/type_cursor/ It is designed to store uint16 references - positions of the cursor in THD::m_statement_cursors. - Type_handler_sys_refcursor suppresses some derived numeric features. When a SYS_REFCURSOR variable is used as an integer an error is raised. - A new abstract class sp_instr_fetch_cursor. It's needed to share the common code between "OPEN cur" (for static cursors) and "OPEN cur FOR stmt" (for SYS_REFCURSORs). 
- New sp_instr classes: * sp_instr_copen_by_ref - OPEN sys_ref_cursor FOR stmt; * sp_instr_cfetch_by_ref - FETCH sys_ref_cursor INTO targets; * sp_instr_cclose_by_ref - CLOSE sys_ref_cursor; * sp_instr_destruct_variable - to destruct SYS_REFCURSOR variables when the execution goes out of the BEGIN..END block where SYS_REFCURSOR variables are declared. - New methods in LEX: * sp_open_cursor_for_stmt - handles "OPEN sys_ref_cursor FOR stmt". * sp_add_instr_fetch_cursor - "FETCH cur INTO targets" for both static cursors and SYS_REFCURSORs. * sp_close - handles "CLOSE cur" both for static cursors and SYS_REFCURSORs. - Changes in cursor functions to handle both static cursors and SYS_REFCURSORs: * Item_func_cursor_isopen * Item_func_cursor_found * Item_func_cursor_notfound * Item_func_cursor_rowcount - A new system variable @@max_open_cursors - to limit the number of cursors (static and SYS_REFCURSORs) opened at the same time. Its allowed range is [0-65536], with 50 by default. - A new virtual method Type_handler::can_return_bool() telling if calling item->val_bool() is allowed for Items of this data type, or if otherwise the "Illegal parameter for operation" error should be raised at fix_fields() time. - New methods in Sp_rcontext_handler: * get_cursor() * get_cursor_by_ref() - A new class Sp_rcontext_handler_statement to handle top level statement wide cursors which are shared by all substatements. - A new virtual method expr_event_handler() in classes Item and Field. It's needed to close (and make available for a new OPEN) unused THD::m_statement_cursors elements which do not have any references any more. It can happen in various moments in time, e.g. * after evaluation parameters of an SQL routine * after assigning a cursor expression into a SYS_REFCURSOR variable * when leaving a BEGIN..END block with SYS_REFCURSOR variables * after setting OUT/INOUT routine actual parameters from formal parameters.	2025-04-19 10:59:58 +04:00
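The idea that a SYS_REFCURSOR value is a small integer reference into a statement-wide cursor array, with slots freed once nothing references them, can be sketched as follows. This is an illustration under assumptions: `StatementCursors` and its methods are hypothetical names, and reference counting here is explicit, whereas the commit describes it as driven by expr_event_handler().

```python
class StatementCursors:
    """Toy stand-in for a statement-wide array of open cursors."""

    def __init__(self, max_open_cursors=50):
        self.max_open_cursors = max_open_cursors
        self.slots = []  # each slot is None (free) or a cursor dict

    def open_count(self):
        return sum(slot is not None for slot in self.slots)

    def open(self, rows):
        """Open a cursor over rows and return its reference (slot index)."""
        if self.open_count() >= self.max_open_cursors:
            raise RuntimeError("too many open cursors")
        cursor = {"rows": list(rows), "pos": 0, "refs": 1}
        for ref, slot in enumerate(self.slots):
            if slot is None:  # reuse a slot released earlier
                self.slots[ref] = cursor
                return ref
        self.slots.append(cursor)
        return len(self.slots) - 1

    def fetch(self, ref):
        """Return the next row, or None when the cursor is exhausted."""
        cursor = self.slots[ref]
        if cursor is None:
            raise RuntimeError("cursor is not open")
        if cursor["pos"] >= len(cursor["rows"]):
            return None
        row = cursor["rows"][cursor["pos"]]
        cursor["pos"] += 1
        return row

    def add_ref(self, ref):
        self.slots[ref]["refs"] += 1

    def release(self, ref):
        """Drop one reference; close the cursor when none remain."""
        cursor = self.slots[ref]
        cursor["refs"] -= 1
        if cursor["refs"] == 0:
            self.slots[ref] = None
```

Copying a cursor variable would call `add_ref`, and leaving a block would call `release`; once the count reaches zero the slot becomes available for a new OPEN.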
Sergei Golubchik	9b824e62d4	Merge branch '11.8' into main	2025-04-18 17:11:01 +02:00
Marko Mäkelä	bb1d88b6dc	Merge 11.4 into 11.8	2025-04-02 14:07:01 +03:00
Marko Mäkelä	f5bd250f5b	Merge 10.11 into 11.4	2025-03-28 13:55:21 +02:00
Marko Mäkelä	ab0f2a00b6	Merge 10.6 into 10.11	2025-03-27 08:01:47 +02:00
Sergey Vojtovich	feb1cf9086	Corrections to parent "fix typos" commit	2025-03-14 12:08:56 +04:00
Vasilii Lakhin	717c12de0e	Fix typos in C comments inside sql/	2025-03-14 12:08:56 +04:00
Kristian Nielsen	2641409731	Fix redundant ER_PRIOR_COMMIT_FAILED in parallel replication wait_for_prior_commit() can be called multiple times per event group, only do my_error() the first time the call fails. Remove redundant set_overwrite_status() calls. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org> Reviewed-by: Monty <monty@mariadb.org>	2025-03-11 12:45:59 +01:00
Marko Mäkelä	bb9f010432	Merge 11.4 into 11.8	2025-03-05 20:39:47 +02:00
Alexander Barkov	b7d67ceb5f	MDEV-36047 Package body variables are not allowed as FETCH targets It was not possible to use a package body variable as a fetch target: CREATE PACKAGE BODY pkg AS vc INT := 0; FUNCTION f1 RETURN INT AS CURSOR cur IS SELECT 1 AS c FROM DUAL; BEGIN OPEN cur; FETCH cur INTO vc; -- this returned "Undeclared variable: vc" error. CLOSE cur; RETURN vc; END; END; FETCH assumed that all fetch targets reside in the same sp_rcontext instance as the cursor. This patch fixes the problem. Now a cursor and its fetch target can reside in different sp_rcontext instances. Details: - Adding a helper class sp_rcontext_addr (a combination of Sp_rcontext_handler pointer and an offset in the rcontext) - Adding a new class sp_fetch_target deriving from sp_rcontext_addr. Fetch targets in "FETCH cur INTO target1, target2 ..." are now collected into this structure instead of sp_variable. sp_variable cannot be used any more to store fetch targets, because it does not have a pointer to Sp_rcontext_handler (it only has the current rcontext offset). - Removing sp_instr_set members m_rcontext_handler and m_offset. Deriving sp_instr_set from sp_rcontext_addr instead. - Renaming sp_instr_cfetch member "List<sp_variable> m_varlist" to "List<sp_fetch_target> m_fetch_target_list". - Fixing LEX::sp_add_cfetch() to return the pointer to the created sp_fetch_target instance (instead of returning bool). This helps to make the grammar in sql_yacc.c simpler - Renaming LEX::sp_add_cfetch() to LEX::sp_add_instr_cfetch(), as `if(sp_add_cfetch())` changed its meaning to the opposite, to avoid automatic wrong merge from earlier versions. - Changing the "List<sp_variable> vars" parameter to sp_cursor::fetch to have the data type "List<sp_fetch_target> ". - Changing the data type of "List<sp_variable> &vars" in sp_cursor::Select_fetch_into_spvars::send_data_to_variable_list() to "List<sp_fetch_target> &". - Adding THD helper methods get_rcontext() and get_variable(). 
- Moving the code from sql_yacc.yy into a new LEX method LEX::make_fetch_target(). - Simplifying the grammar in sql_yacc.yy using the new LEX method. Changing the data type of the bison rule sp_fetch_list from "void" to "List<sp_fetch_target> *".	2025-02-09 13:56:19 +04:00
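The core of the fix above is that a fetch target is addressed by a pair (context handler, offset) rather than by an offset alone. A minimal sketch of that addressing, with hypothetical names (`RContext`, `RContextAddr`, `fetch_into`) and string keys standing in for Sp_rcontext_handler pointers:

```python
from dataclasses import dataclass


@dataclass
class RContext:
    """A runtime context holding variable values by offset."""
    variables: list


@dataclass(frozen=True)
class RContextAddr:
    """Which context owns the variable, plus its offset in that context."""
    handler: str
    offset: int


def fetch_into(contexts, targets, row):
    """Store each column of row into the variable its target addresses."""
    if len(targets) != len(row):
        raise ValueError("number of fetch targets does not match the row")
    for target, value in zip(targets, row):
        contexts[target.handler].variables[target.offset] = value


contexts = {
    "local": RContext(variables=[None]),       # routine-local variables
    "package_body": RContext(variables=[0]),   # e.g. vc INT := 0
}
# FETCH cur INTO vc, where vc lives in the package body context.
fetch_into(contexts, [RContextAddr("package_body", 0)], (1,))
```

With an offset alone, the write would have landed in the cursor's own context; carrying the handler with the offset lets the target live elsewhere.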
Sergei Golubchik	ba01c2aaf0	Merge branch '11.4' into 11.7 * rpl.rpl_system_versioning_partitions updated for MDEV-32188 * innodb.row_size_error_log_warnings_3 changed error for MDEV-33658 (checks are done in a different order)	2025-02-06 16:46:36 +01:00
Sergei Golubchik	7d657fda64	Merge branch '10.11' into 11.4	2025-01-30 12:01:11 +01:00
Sergei Golubchik	e69f8cae1a	Merge branch '10.6' into 10.11	2025-01-30 11:55:13 +01:00
Sergei Golubchik	066e8d6aea	Merge branch '10.5' into 10.6	2025-01-29 11:17:38 +01:00
Nikita Malyavin	e7cc1a3047	MDEV-33658 2/2 Cannot add a foreign key on a table with a matching long UNIQUE Cannot add a foreign key on a table with a long UNIQUE multi-column index, that contains a foreign key as a prefix. Check that index algorithms match during the "generated" keys duplicate removal.	2025-01-26 16:15:46 +01:00
Nikita Malyavin	ecaedbe299	MDEV-33658 1/2 Refactoring: extract Key length initialization mysql_prepare_create_table: Extract a Key initialization part that relates to length calculation and long unique index designation. append_system_key_parts call also moves there. Move this initialization before the duplicate elimination. Extract WITHOUT OVERLAPS check into a separate function. It had to be moved earlier in the code to preserve the order of the error checks, as in the tests.	2025-01-26 16:15:46 +01:00
Marko Mäkelä	98dbe3bfaf	Merge 10.5 into 10.6	2025-01-20 09:57:37 +02:00
Denis Protivensky	901c6c7ab6	MDEV-33064: Sync trx->wsrep state from THD on trx start InnoDB transactions may be reused after committed: - when taken from the transaction pool - during a DDL operation execution In this case wsrep flag on trx object is cleared, which may cause wrong execution logic afterwards (wsrep-related hooks are not run). Make trx->wsrep flag initialize from THD object only once on InnoDB transaction start and don't change it throughout the transaction's lifetime. The flag is reset at commit time as before. Unconditionally set wsrep=OFF for THD objects that represent InnoDB background threads. Make Wsrep_schema::store_view() operate in its own transaction. Fix streaming replication transactions' fragments rollback to not switch THD->wsrep value during transaction's execution (use THD->wsrep_ignore_table as a workaround). Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2025-01-14 02:17:22 +01:00
Oleksandr Byelkin	0d35fe6e57	MDEV-35326: Memory Leak in init_io_cache_ext upon SHUTDOWN The problems were that: 1) resources were freed "asymmetrically": on normal execution in send_eof, in case of error in the destructor. 2) the destructor was not called in case of SP for result objects. (so if the last SP execution ended with an error, resources were not freed on reinit before execution (cleanup() called before next execution) and the destructor also was not called due to lack of a delete call for the object) Result cleanup() renamed to reset_for_next_ps_execution() to better reflect function(). All result methods revised and freeing resources made "symmetric". Destructor of result object called for SP. Added skipped invalidation in case of error in insert. Removed misleading naming of reset(thd) (could be mixed up with reset()).	2025-01-13 10:04:27 +01:00
Marko Mäkelä	15700f54c2	Merge 11.4 into 11.7	2025-01-09 09:41:38 +02:00
Marko Mäkelä	17f01186f5	Merge 10.11 into 11.4	2025-01-09 07:58:08 +02:00
Marko Mäkelä	420d9eb27f	Merge 10.6 into 10.11	2025-01-08 12:51:26 +02:00
Monty	a2d37705ca	Only print "InnoDB: Transaction was aborted..." if log_warnings >= 4 This is a minor fixup for MDEV-24035 Failing assertion UT_LIST_GET_LEN(lock.trx_locks) == 0 causing disruption and replication failure	2025-01-05 16:40:12 +02:00
Monty	e600f9aebb	MDEV-35750 Change MEM_ROOT allocation sizes to reduce calls to malloc() and avoid memory fragmentation This commit updates the default memory allocation sizes used with MEM_ROOT objects to minimize the number of calls to malloc(). Changes: - Updated MEM_ROOT block sizes in sql_const.h - Updated MALLOC_OVERHEAD to also take into account the extra memory allocated by my_malloc() - Updated init_alloc_root() to only take MALLOC_OVERHEAD into account as buffer size, not MALLOC_OVERHEAD + sizeof(USED_MEM). - Reset mem_root->first_block_usage if and only if first block was used. - Increased MEM_ROOT buffer sizes used by my_load_defaults, plugin_init, Create_tmp_table, allocate_table_share, TABLE and TABLE_SHARE. This decreases number of malloc calls during queries. - Use a small buffer for THD->main_mem_root in THD::THD. This avoids multiple malloc() calls for new connections. I tried the above changes on a complex select query with 12 tables. The following shows the number of extra allocations that were used to increase the size of the MEM_ROOT buffers. Original code: - Connection to MariaDB: 9 allocations - First query run: 146 allocations - Second query run: 24 allocations Max memory allocated for thd when using heap table: 61,262,408 Max memory allocated for thd when using Aria tmp table: 419,464 After changes: - Connection to MariaDB: 0 allocations - First run: 25 allocations - Second run: 7 allocations Max memory allocated for thd when using heap table: 61,347,424 Max memory allocated for thd when using Aria table: 529,168 The new code uses slightly more memory, but avoids memory fragmentation and is slightly faster thanks to much fewer calls to malloc(). Reviewed-by: Sergei Golubchik <serg@mariadb.org>	2025-01-05 16:40:11 +02:00
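Why larger block sizes and a preallocated first buffer reduce the number of malloc() calls can be shown with a toy arena allocator. This is a sketch, not MEM_ROOT itself: `MemRoot`, `count_extra_blocks` and the request sizes are made up for illustration, and overheads such as MALLOC_OVERHEAD are ignored.

```python
class MemRoot:
    """Toy arena: hands out bytes from the current block, grows by blocks."""

    def __init__(self, block_size, prealloc=0):
        self.block_size = block_size
        self.free = prealloc      # bytes left in the current block
        self.extra_blocks = 0     # stand-in for extra malloc() calls

    def alloc(self, size):
        if size > self.free:
            # Current block cannot satisfy the request: get a new block.
            self.extra_blocks += 1
            self.free = max(self.block_size, size)
        self.free -= size


def count_extra_blocks(block_size, requests, prealloc=0):
    """Number of extra blocks needed to serve all requests."""
    root = MemRoot(block_size, prealloc)
    for size in requests:
        root.alloc(size)
    return root.extra_blocks


requests = [100] * 100  # hypothetical workload: 100 allocations of 100 bytes
small = count_extra_blocks(1024, requests)
large = count_extra_blocks(8192, requests)
preallocated = count_extra_blocks(1024, requests, prealloc=16384)
```

In this toy model the larger block size and the preallocated buffer both lower the extra-block count, at the cost of some unused memory at the end of each block, which is the trade-off the commit message reports.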
Monty	95975b921e	MDEV-35720 Add query_time to statistics Added Query_time (total time spent running queries) to status_variables. Other things: - Added SHOW_MICROSECOND_STATUS type that shows an ulonglong variable in microseconds converted to a double (in seconds). - Changed Busy_time and Cpu_time to use SHOW_MICROSECOND_STATUS, which simplified the code and avoids some double divisions for each query. Reviewed-by: Sergei Golubchik <serg@mariadb.org>	2024-12-30 16:13:20 +02:00
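The SHOW_MICROSECOND_STATUS idea above, keeping an integer microsecond counter and converting to seconds only when the value is displayed, might look like this in outline. `StatusCounters` and the six-decimal output format are assumptions for illustration, not the server's actual formatting.

```python
def microseconds_to_seconds(value_us: int) -> float:
    """Convert an integer microsecond counter to seconds."""
    return value_us / 1_000_000


class StatusCounters:
    def __init__(self):
        self.query_time_us = 0  # accumulated as an integer, no per-query division

    def add_query(self, elapsed_us: int):
        self.query_time_us += elapsed_us

    def show(self):
        # The conversion happens once, at display time.
        seconds = microseconds_to_seconds(self.query_time_us)
        return {"Query_time": f"{seconds:.6f}"}


counters = StatusCounters()
counters.add_query(1_500_000)
counters.add_query(250)
```

Accumulating integers and dividing only in `show()` is what lets the per-query path avoid floating point work.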
Marko Mäkelä	33907f9ec6	Merge 11.4 into 11.7	2024-12-02 17:51:17 +02:00
Marko Mäkelä	2719cc4925	Merge 10.11 into 11.4	2024-12-02 11:35:34 +02:00
Marko Mäkelä	3d23adb766	Merge 10.6 into 10.11	2024-11-29 13:43:17 +02:00
Marko Mäkelä	7d4077cc11	Merge 10.5 into 10.6	2024-11-29 12:37:46 +02:00
Brandon Nesterenko	840fe316d4	MDEV-34348: my_hash_get_key fixes Partial commit of the greater MDEV-34348 scope. MDEV-34348: MariaDB is violating clang-16 -Wcast-function-type-strict Change the type of my_hash_get_key to: 1) Return const 2) Change the context parameter to be const void* Also fix casting in hash adjacent areas. Reviewed By: Marko Mäkelä <marko.makela@mariadb.com>	2024-11-23 08:14:22 -07:00
Oleksandr Byelkin	b12ff287ec	Merge branch '11.6' into 11.7	2024-11-10 19:22:21 +01:00
Oleksandr Byelkin	9e1fb104a3	Merge tag '11.4' into 11.6: MariaDB 11.4.4 release	2024-11-08 07:17:00 +01:00
Sergei Golubchik	aed5928207	cleanup: extract transaction-related part of handlerton into a separate transaction_participant structure handlerton inherits it, so handlerton itself doesn't change. but entities that only need to participate in a transaction, like binlog or online alter log, use a transaction_participant and no longer need to pretend to be a full-blown but invisible storage engine which doesn't support create table.	2024-11-05 14:00:50 -08:00
Sergei Golubchik	44c6328cbb	cleanup: thd->alloc<>() and thd->calloc<>() create templates thd->alloc<X>(n) to use instead of (X)thd->alloc(sizeof(X)n) and the same for thd->calloc(). By the default the type is char, so old usage of thd->alloc(size) works too.	2024-11-05 14:00:48 -08:00
Sergei Golubchik	32e6f8ff2e	cleanup: remove unconditional #ifdef's	2024-11-05 14:00:47 -08:00
Oleg Smirnov	a914087fab	MDEV-35307 Unexpected error WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area #2 When strict mode is enabled, all warnings during `INSERT` are converted to errors regardless of their actual severity. `WARN_SORTING_ON_TRUNCATED_LENGTH` is not considered severe enough to be elevated to the ERROR level, and this commit fixes that	2024-11-05 14:52:20 +07:00
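The rule described above, that strict mode turns warnings raised during INSERT into errors except for ones not considered severe enough, can be sketched as a small decision function. The set name `NEVER_ELEVATED`, the function `effective_level` and the code `SOME_OTHER_WARNING` are hypothetical.

```python
# Warnings that stay warnings even in strict mode (per the commit above).
NEVER_ELEVATED = {"WARN_SORTING_ON_TRUNCATED_LENGTH"}


def effective_level(code, level, strict_mode, in_insert):
    """Return the level a condition is reported at."""
    if (level == "WARNING" and strict_mode and in_insert
            and code not in NEVER_ELEVATED):
        return "ERROR"
    return level
```

For example, `effective_level("SOME_OTHER_WARNING", "WARNING", True, True)` yields `"ERROR"`, while the sorting warning keeps its original level under the same conditions.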
Oleksandr Byelkin	c770bce898	Merge branch '11.2' into 11.4	2024-10-30 15:11:17 +01:00
Oleksandr Byelkin	69d033d165	Merge branch '10.11' into 11.2	2024-10-29 16:42:46 +01:00
Oleksandr Byelkin	3d0fb15028	Merge branch '10.6' into 10.11	2024-10-29 15:24:38 +01:00
Oleksandr Byelkin	f00711bba2	Merge branch '10.5' into 10.6	2024-10-29 14:20:03 +01:00
Vlad Lesin	8c7786e7d5	MDEV-34690 lock_rec_unlock_unmodified() causes deadlock lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock() or under a combination of lock_sys.rd_lock() + record locks hash table cell latch. It also requests page latch to check if locked records were changed by the current transaction or not. Usually InnoDB requests page latch to find the certain record on the page, and then requests lock_sys and/or record lock hash cell latch to request record lock. lock_rec_unlock_unmodified() requests the latches in the opposite order, what causes deadlocks. One of the possible scenario for the deadlock is the following: thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table cell latch, the latch is acquired; thread 2 - purge thread acquires page latch and tries to remove delete-marked record, it invokes lock_update_delete(), which requests locks hash table cell latch, held by thread 1; thread 1 - requests page latch, held by thread 2. To fix it we need to release lock_sys.latch and/or lock hash cell latch, acquire page latch and re-acquire lock_sys related latches. When lock_sys.latch and/or lock hash cell latch are released in lock_release_on_prepare() and lock_release_on_prepare_try(), the page on which the current lock is held, can be merged. In this case the bitmap of the current lock must be cleared, and the new lock must be added to the end of trx->lock.trx_locks list, or bitmap of already existing lock must be changed. The new field trx_lock_t::set_nth_bit_calls indicates if new locks (bits in existing lock bitmaps or new lock objects) were created during the period when lock_sys was released in trx->lock.trx_locks list iteration loop in lock_release_on_prepare() or lock_release_on_prepare_try(). And, if so, we traverse the list again. The block can be freed during pages merging, what causes assertion failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page get mode to it. 
That's why page_get_mode parameter was added to btr_block_get() to pass BUF_GET_POSSIBLY_FREED from lock_release_on_prepare() and lock_release_on_prepare_try() to buf_page_get_gen(). As searching for id of trx, which modified secondary index record, is quite expensive operation, restrict its usage for master. System variable was added to remove the restriction for testing simplifying. The variable exists only either for debug build or for build with -DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option to increase the probability of catching bugs for release build with RQG. Note that the code, which does primary index lookup to find out what transaction modified secondary index record, is necessary only when there is no primary key and no unique secondary key on replica with row based replication, because only in this case extra X locks on unmodified records can be set during scan phase. Reviewed by Marko Mäkelä.	2024-10-23 12:36:17 +03:00
Oleg Smirnov	fd87e01f38	MDEV-27277 Add a warning when max_sort_length is reached During a query execution some sorting and grouping operations on strings may be involved. System variable max_sort_length defines the maximum number of bytes to use when comparing strings during sorting/grouping. Thus, the comparable parts of strings may be shorter than their actual size, so the results of the query may not be sorted/grouped properly. To indicate that some comparisons were done on truncated lengths, a new warning has been introduced with this commit.	2024-10-22 22:39:36 +07:00
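The effect of max_sort_length described above can be reproduced in a few lines: only the first max_sort_length bytes take part in the comparison, so values that differ later keep their input order, and a flag records that truncation happened. `sort_with_max_length` is a hypothetical helper, and returning a flag stands in for raising the server warning.

```python
def sort_with_max_length(values, max_sort_length):
    """Sort byte strings comparing only their first max_sort_length bytes.

    Returns (sorted_values, truncated), where truncated tells whether any
    value was longer than the compared prefix.
    """
    truncated = False

    def key(value):
        nonlocal truncated
        if len(value) > max_sort_length:
            truncated = True
        return value[:max_sort_length]

    return sorted(values, key=key), truncated


rows = [b"abcd2", b"abcd1"]
short_result, short_flag = sort_with_max_length(rows, 4)
full_result, full_flag = sort_with_max_length(rows, 8)
```

With a prefix of 4 bytes both values compare equal, so the stable sort leaves them in input order and the flag is set; with 8 bytes they are ordered fully and no truncation is reported.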
Brandon Nesterenko	1ed30e08af	MDEV-34122: Assertion `entry' failed in Active_tranx::assert_thd_is_waiter If semi-sync is switched off then on while a transaction is in-between binlogging and waiting for an ACK, the semi-sync state of the transaction is removed, leading to a debug assertion that indicates the transaction tried to wait, but cannot receive an ACK signal. More specifically, when semi-sync is switched off, the Active_tranx list is cleared (where a transaction adds an entry to this list during binlogging), and each entry in this list saves the thread which will wait for an ACK, and the thread has the COND variable to signal to wake itself. So if the entry is lost, the Ack_receiver thread won't be able to find the thread to wake up when an ACK comes in. The fix is to ensure that the entry exists before awaiting the ACK, and if there is no entry, skip the wait. In debug builds, an informative message is written explaining that the transaction is skipping its wait. Additional debug-build only logic is added to ensure that the cause of the missing entry is due to semi-sync being turned off and on. Reviewed By: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-10-21 15:35:54 -06:00
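The fix described above amounts to checking that the transaction's entry still exists before waiting for an ACK. A single-threaded sketch of that check, with hypothetical names (`ActiveTranx`, `wait_for_ack`) and a callback standing in for the condition-variable wait:

```python
class ActiveTranx:
    """Toy list of transactions that expect an ACK, keyed by transaction id."""

    def __init__(self):
        self.entries = {}

    def add(self, trx_id, waiter):
        # Called during binlogging: remember which thread will wait.
        self.entries[trx_id] = waiter

    def clear(self):
        # Called when semi-sync is switched off.
        self.entries.clear()


def wait_for_ack(active, trx_id, wait):
    """Wait only if the entry still exists; otherwise skip the wait."""
    if trx_id not in active.entries:
        return "skipped"
    wait(active.entries[trx_id])
    return "waited"


active = ActiveTranx()
active.add(1, "thd1")
active.clear()  # semi-sync switched off (and later on) before the wait
outcome = wait_for_ack(active, 1, lambda waiter: None)
```

Without the membership check, the transaction would wait for a wake-up that the ACK receiver can no longer deliver, because the entry pointing at the waiting thread is gone.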


4066 Commits