lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
or under a combination of lock_sys.rd_lock() + a record locks hash table
cell latch. It also requests a page latch to check whether locked records
were changed by the current transaction or not.
Usually InnoDB requests a page latch to find a certain record on the
page, and then requests lock_sys and/or a record lock hash cell latch to
request a record lock. lock_rec_unlock_unmodified() requests the latches
in the opposite order, which causes deadlocks. One possible deadlock
scenario is the following:
thread 1 - lock_rec_unlock_unmodified() is invoked under the locks hash table
cell latch, the latch is acquired;
thread 2 - the purge thread acquires the page latch and tries to remove a
delete-marked record; it invokes lock_update_delete(), which
requests the locks hash table cell latch, held by thread 1;
thread 1 - requests the page latch, held by thread 2.
To fix this we need to release lock_sys.latch and/or the lock hash cell
latch, acquire the page latch, and re-acquire the lock_sys-related latches.
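A rough sketch of the re-ordered latch acquisition is below; the latch
objects and helper names are illustrative stand-ins (std::shared_mutex
etc.), not the actual InnoDB types or API.

  #include <mutex>
  #include <shared_mutex>

  // Illustrative stand-ins for the InnoDB latches involved.
  std::shared_mutex lock_sys_latch;   // plays the role of lock_sys.latch
  std::mutex        hash_cell_latch;  // record-lock hash table cell latch
  std::shared_mutex page_latch;       // index page latch

  // Usual InnoDB order: page latch first, then lock_sys-side latches.
  void usual_order()
  {
    std::shared_lock page(page_latch);
    std::shared_lock sys(lock_sys_latch);
    std::scoped_lock cell(hash_cell_latch);
    // ... find or create the record lock ...
  }

  // The fix: instead of taking the page latch while the lock_sys-side
  // latches are already held (the reverse order, which can deadlock against
  // e.g. the purge thread), release them, take the page latch, and only then
  // re-acquire the lock_sys-side latches and re-validate the state.
  void unlock_unmodified_fixed_order()
  {
    // (lock_sys_latch / hash_cell_latch were released by the caller here)
    std::shared_lock page(page_latch);
    std::shared_lock sys(lock_sys_latch);
    std::scoped_lock cell(hash_cell_latch);
    // ... re-check which records were modified by this transaction ...
  }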
When lock_sys.latch and/or the lock hash cell latch are released in
lock_release_on_prepare() and lock_release_on_prepare_try(), the page on
which the current lock is held can be merged. In this case the bitmap
of the current lock must be cleared, and a new lock must be added to
the end of the trx->lock.trx_locks list, or the bitmap of an already
existing lock must be changed.
The new field trx_lock_t::set_nth_bit_calls indicates whether new locks
(bits in existing lock bitmaps or new lock objects) were created while
lock_sys was released during the trx->lock.trx_locks list iteration loop
in lock_release_on_prepare() or lock_release_on_prepare_try(). If so, we
traverse the list again.
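As a sketch under simplified types (the real structures are trx_lock_t and
lock_t; here set_nth_bit_calls is assumed to be a counter bumped whenever a
lock bit or lock object is added), the retry loop looks roughly like this:

  #include <cstdint>
  #include <list>

  struct lock_stub { /* page id, lock bitmap, ... */ };

  struct trx_lock_stub
  {
    std::list<lock_stub> trx_locks;
    uint64_t set_nth_bit_calls= 0;   // bumped whenever a new lock bit is set
  };

  void release_on_prepare_sketch(trx_lock_stub &trx_lock)
  {
    uint64_t before;
    do
    {
      before= trx_lock.set_nth_bit_calls;
      for (lock_stub &lock : trx_lock.trx_locks)
      {
        // May temporarily release lock_sys latches to take the page latch;
        // while they are released, other code can set new bits or append
        // new lock objects to trx_locks.
        (void) lock;
      }
      // If anything was added while the latches were released, the list or
      // some bitmaps changed under us: traverse the list again.
    } while (trx_lock.set_nth_bit_calls != before);
  }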
A block can be freed during page merging, which causes an assertion
failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as the
page get mode to it. That is why a page_get_mode parameter was added to
btr_block_get(), to pass BUF_GET_POSSIBLY_FREED from
lock_release_on_prepare() and lock_release_on_prepare_try() to
buf_page_get_gen().
As searching for the id of the transaction which modified a secondary
index record is quite an expensive operation, its usage is restricted for
the master. A system variable was added to remove the restriction, to
simplify testing. The variable exists only either in debug builds or in
builds with the -DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option,
to increase the probability of catching bugs in release builds with RQG.
Note that the code which does a primary index lookup to find out what
transaction modified a secondary index record is necessary only when
there is no primary key and no unique secondary key on a replica with
row-based replication, because only in this case can extra X locks on
unmodified records be set during the scan phase.
Reviewed by Marko Mäkelä.
If semi-sync is switched off then on while a transaction is
in-between binlogging and waiting for an ACK, the semi-sync state of
the transaction is removed, leading to a debug assertion that
indicates the transaction tried to wait but cannot receive an ACK
signal. More specifically, when semi-sync is switched off, the
Active_tranx list is cleared (a transaction adds an entry to
this list during binlogging), and each entry in this list saves the
thread which will wait for an ACK; that thread has the COND
variable used to signal it awake. So if the entry is lost, the
Ack_receiver thread won’t be able to find the thread to wake up when
an ACK comes in.
The fix is to ensure that the entry exists before awaiting the ACK,
and if there is no entry, skip the wait. In debug builds, an
informative message is written explaining that the transaction is
skipping its wait. Additional debug-build-only logic is added to
ensure that the cause of the missing entry is semi-sync being
turned off and on.
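A minimal sketch of the fix, with the Active_tranx lookup and the wait
reduced to stand-in stubs (the real code lives in the semi-sync master
plugin; all names here are illustrative):

  #include <cstdio>

  struct Tranx_node { /* waiting THD, binlog file/pos, ... */ };

  // Stand-in lookup: returns nullptr when the entry was lost, e.g. because
  // semi-sync was switched off (clearing Active_tranx) and back on.
  static Tranx_node *find_active_tranx_node(const char *, unsigned long long)
  { return nullptr; }

  static int wait_for_ack(Tranx_node *) { return 0; }

  static int commit_trx_sketch(const char *log_file, unsigned long long pos)
  {
    Tranx_node *node= find_active_tranx_node(log_file, pos);
    if (!node)
    {
      // Nothing can ever signal this waiter, so skip the wait instead of
      // hitting the debug assertion.
      fprintf(stderr, "Skipping semi-sync ACK wait for %s:%llu: no "
              "Active_tranx entry (semi-sync switched off and on?)\n",
              log_file, pos);
      return 0;
    }
    return wait_for_ack(node);
  }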
Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
The problem was that when using clang + asan, we do not get a correct value
for the thread stack, as some local variables are not allocated on the
normal stack.
It looks like, for example, clang 18.1.3, when compiling with
-O2 -fsanitize=address, puts local variables and things allocated by
alloca() in areas other than the stack.
The following gdb session shows the issue:
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer:
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably this is some kind of
"local stack" used by the sanitizer.
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects were stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code uses asm instructions for Intel 32/64-bit, PowerPC,
ARM 32/64-bit and SPARC 32/64-bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64-bit we are using _AddressOfReturnAddress().
As a fallback for other compilers/architectures we use the address of a
local variable.
my_get_stack_bounds() returns the base address and size of the stack,
using pthread_attr_getstack() or NtCurrentTeb(), with a fallback to
using the address of a local variable and a user-provided
stack size.
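A rough sketch of the pthread-based path of such a function; the real
my_get_stack_bounds() also covers Windows (NtCurrentTeb()) and non-pthread
fallbacks, and its exact semantics in MariaDB may differ:

  #include <pthread.h>
  #include <cstddef>

  static void get_stack_bounds_sketch(void **stack_start, size_t *stack_size,
                                      size_t guessed_size)
  {
    void  *addr= nullptr;
    size_t size= 0;
  #ifdef __linux__
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) == 0)
    {
      pthread_attr_getstack(&attr, &addr, &size);   // addr = lowest address
      pthread_attr_destroy(&attr);
    }
  #endif
    if (addr)
    {
      // The stack grows downwards on the supported platforms, so the usable
      // starting point is the top of the region reported by pthreads.
      *stack_start= static_cast<char*>(addr) + size;
      *stack_size= size;
    }
    else
    {
      // Fallback: the address of a local variable approximates the current
      // stack position; the caller supplies an estimated total size.
      char local;
      *stack_start= &local;
      *stack_size= guessed_size;
    }
  }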
Server changes are:
- Moving the setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing the setting of thd->thread_stack, except in functions that
allocate a lot on the stack before calling store_globals(). When
using estimates for the stack start, we reduce stack_size by
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
Implement the variable legacy_xa_rollback_at_disconnect to support
backwards compatibility for applications that rely on the pre-10.5
behavior for connection disconnect, which is to roll back the
transaction (in violation of the XA specification).
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Fixed by checking whether handler_stats is active instead of checking
thd->variables.log_slow_verbosity & LOG_SLOW_VERBOSITY_ENGINE.
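As a minimal illustration of the changed condition (struct and function
names here are invented stand-ins for the server's internals):

  struct ha_handler_stats_stub { bool active; };

  // Gate engine statistics collection on whether the handler_stats object
  // is actually active, rather than on the LOG_SLOW_VERBOSITY_ENGINE bit in
  // thd->variables.log_slow_verbosity.
  static bool should_collect_engine_stats(const ha_handler_stats_stub *stats)
  {
    return stats && stats->active;
  }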
Reviewed-by: Sergei Petrunia <sergey@mariadb.com>
int wsrep_thd_append_key(THD*, const wsrep_key*, int, Wsrep_service_key_type)
CREATE TABLE [SELECT|REPLACE SELECT] is CTAS, and the idea was to
force ROW format. However, this was not correctly enforced,
and keys were appended before the wsrep transaction was started.
In THD::decide_logging_format we should force the statement binlog
format to ROW in the CTAS case and produce a warning if the binlog
format in use was not ROW.
In ha_innodb::update_row we should not append keys, similarly to
ha_innodb::write_row, if sql_command is SQLCOM_CREATE_TABLE.
Improved error logging in ::write_row, ::update_row and ::delete_row
if the wsrep key append fails.
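A simplified sketch of the intended decide_logging_format behaviour for
CTAS (enum values and helpers here are stand-ins, not the server's real
symbols):

  enum binlog_format_stub { BINLOG_FORMAT_STMT, BINLOG_FORMAT_ROW };
  enum sql_command_stub   { SQLCOM_OTHER, SQLCOM_CREATE_TABLE };

  struct thd_stub
  {
    sql_command_stub   sql_command;
    bool               has_select_part;     // CREATE TABLE ... SELECT
    binlog_format_stub stmt_binlog_format;
  };

  static void push_warning_stub(const char *) {}

  static void decide_logging_format_sketch(thd_stub *thd)
  {
    if (thd->sql_command == SQLCOM_CREATE_TABLE && thd->has_select_part)
    {
      // CTAS must replicate as ROW under Galera; warn if the session had
      // requested something else.
      if (thd->stmt_binlog_format != BINLOG_FORMAT_ROW)
        push_warning_stub("CREATE TABLE ... SELECT forced to ROW binlog format");
      thd->stmt_binlog_format= BINLOG_FORMAT_ROW;
    }
  }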
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
(With trivial fixes by sergey@mariadb.com)
Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs.
Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int
in InnoDB, which in effect doubles the Cardinality for secondary keys.
This has the biggest effect for indexes where a few rows have the same key
value. Using this may also cause table scans for very small tables (which
in some cases may be better than an index scan).
The user-visible effect is that 'SHOW INDEX FROM table_name' will for
InnoDB show the true Cardinality (and not 2x the real value). It will
also allow the optimizer to choose a better index in some cases, as the
division by 2 could have a bad effect for tables with 2-5 identical values
per key.
A few notes about using fix_innodb_cardinality:
- It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX
will also update the statistics in the table share.
- The effect of fix_innodb_cardinality on query plans or EXPLAIN
is only visible after the first open of the table. This is why one must
do a FLUSH TABLES or use SHOW INDEX for the option to take effect.
- Using fix_innodb_cardinality can thus affect all users in their query
plans if they are using the same tables.
Because of this, it is strongly recommended to use
optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly
in configuration files, to not cause issues for other users.
Improve performance of queries like
SELECT * FROM t1 WHERE field = NAME_CONST('a', 4);
by, in this example, replacing the WHERE clause with field = 4
in the case of ref access.
The rewrite is done during fix_fields and we disambiguate this
case from other cases of NAME_CONST by inspecting where we are
in parsing. We rely on THD::where to accomplish this. To
improve performance there, we change the type of THD::where to
be an enumeration, so we can avoid string comparisons during
Item_name_const::fix_fields. Consequently, this patch also
changes all usages of THD::where to conform likewise.
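The idea of the THD::where change, as a sketch with invented enumerator
names (the real enumeration and its values live in the server sources):

  // Before: THD::where was a const char*, and call sites such as
  // Item_name_const::fix_fields() had to strcmp() against string literals.
  // After: an enumeration, so the check is a cheap integer comparison.
  enum class THD_WHERE
  {
    DEFAULT_WHERE,
    IN_WHERE_CLAUSE,        // enumerator names here are illustrative
    IN_ON_CLAUSE,
    // ...
  };

  struct thd_stub { THD_WHERE where= THD_WHERE::DEFAULT_WHERE; };

  static bool parsing_where_clause(const thd_stub &thd)
  {
    // old: return strcmp(thd.where, "WHERE clause") == 0;
    return thd.where == THD_WHERE::IN_WHERE_CLAUSE;
  }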
InnoDB transactions may be reused after being committed:
- when taken from the transaction pool
- during a DDL operation execution
In this case the wsrep flag on the trx object is cleared, which may cause
wrong execution logic afterwards (wsrep-related hooks are not run).
Make the trx->wsrep flag be initialized from the THD object only once, at
InnoDB transaction start, and do not change it throughout the transaction's
lifetime. The flag is reset at commit time as before.
Unconditionally set wsrep=OFF for THD objects that represent InnoDB
background threads.
Make Wsrep_schema::store_view() operate in its own transaction.
Fix the rollback of streaming replication transactions' fragments so that
the THD->wsrep value is not switched during the transaction's execution
(use THD->wsrep_ignore_table as a workaround).
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
nullptr+0 is UB (undefined behavior).
- Fixing my_string_metadata_get_mb() to handle {nullptr,0} without UB.
- Fixing THD::copy_with_error() to disallow {nullptr,0} with a DBUG_ASSERT().
- Fixing parse_client_handshake_packet() to call THD::copy_with_error()
with an empty string {"",0} instead of a NULL string {nullptr,0}.
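A small sketch of the kind of guard this implies when copying
possibly-null, zero-length strings (the function name is invented):

  #include <cstddef>
  #include <cstring>

  // Normalise {nullptr, 0} to {"", 0} before doing pointer arithmetic or
  // calling functions such as memcpy(): forming nullptr + 0 or passing a
  // null source pointer is undefined behaviour even when the length is 0.
  static void copy_str_sketch(char *dst, const char *src, size_t len)
  {
    if (!src)
    {
      src= "";
      len= 0;
    }
    if (len)
      memcpy(dst, src, len);
    dst[len]= '\0';
  }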
The patch for MDEV-31340 fixed the following bugs:
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33108 TABLE_STATISTICS and INDEX_STATISTICS are case insensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-case-table-names=0
Backporting the fixes from 11.5 to 10.5
Don't deadlock kill event groups in other domains if they are not
SPECULATE_OPTIMISTIC. Such event groups may not be able to safely roll back
and retry (eg. DDL).
But do deadlock kill a transaction T2 from a blocked transaction U in another
domain, even if T2 has a lower sub_id than U. Otherwise, in case of a cycle
T2->T1->U->T2, we might not break the cycle if U is not SPECULATE_OPTIMISTIC.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
One case is conflicting transactions T1 and T2 with different domain id, in
optimistic parallel replication in non-GTID mode. Then T2 will
wait_for_prior_commit on T1; and if T1 got a row lock wait on T2 it would
hang, as different domains caused the deadlock kill to be skipped in
thd_rpl_deadlock_check().
More generally, if we have transactions T1 and T2 in one domain/master
connection, and an independent transaction U in another, then we can
still deadlock like this:
T1 row lock wait on U
U row lock wait on T2
T2 wait_for_prior_commit on T1
This commit enforces the deadlock kill in these cases. If the waited-for
transaction is speculatively applied, then it will be deadlock killed in
case of a conflict, even if the two transactions are in different domains
or master connections.
Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Fixed that internal temporary tables were not waiting for freed disk space.
Other things:
- 'kill id' will now kill a query waiting for free disk space instantly.
Before, it could take up to 60 seconds before the kill was noticed.
- Fixed that sorting one index was not using MY_WAIT_IF_FULL for temp files.
- Fixed a bug where share->write_flag set MY_WAIT_IF_FULL for temp files.
It is quite hard to do a test case for this. Instead I tested all
combinations interactively.
Some fixes related to commit f838b2d7998f18ac2a1bb9d56081aac6e563de1e and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
When using semi-sync replication with
rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the
primary can be significantly reduced compared to AFTER_SYNC's
performance for workloads with many concurrent users executing
transactions. This is because all connections on the primary share
the same cond_wait variable/mutex pair, so any time an ACK is
received from a replica, all waiting connections are awoken, each
checking whether the ACK was for itself, which is done in mutual exclusion.
This patch changes this such that the waiting THD will use its own
local condition variable, and the ACK receiver thread only signals
connections which have been ACKed for wakeup. That is, the
THD::LOCK_wakeup_ready condition variable is re-used for this
purpose, and the Active_tranx queue nodes are extended to hold the
waiting thread, so it can be signalled once ACKed.
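The signalling change, sketched with standard C++ primitives (the real
code reuses THD::LOCK_wakeup_ready and stores the waiting thread in the
Active_tranx node; the types below are illustrative):

  #include <condition_variable>
  #include <mutex>

  struct waiter
  {
    std::mutex              mtx;     // per-connection, not shared globally
    std::condition_variable cond;
    bool                    acked= false;
  };

  // Connection thread: wait only on its own condition variable.
  void wait_for_own_ack(waiter &w)
  {
    std::unique_lock lk(w.mtx);
    w.cond.wait(lk, [&w] { return w.acked; });
  }

  // Ack_receiver thread: wake exactly the connection whose transaction was
  // ACKed, instead of broadcasting on one shared condition variable and
  // letting every waiter re-check under a single global mutex.
  void signal_acked(waiter &w)
  {
    { std::lock_guard lk(w.mtx); w.acked= true; }
    w.cond.notify_one();
  }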
Additionally:
1) Removed part of the MDEV-11853 additions, which allowed suspended
connection threads awaiting their semi-sync ACKs to live until their
ACKs had been received. This part, however, wasn't needed; all that was
needed was for the Ack_thread to survive. So now the
connection threads are killed during phase 1. Thereby
THD::is_awaiting_semisync_ack and all its related code were removed.
2) COND_binlog_send is repurposed to signal on the condition when
Active_tranx is emptied during clear_active_tranx_nodes.
3) At master shutdown (when waiting for slaves), instead of the
main loop individually waiting for each ACK, await_slave_reply()
(renamed await_all_slave_replies()) just waits once for the
repurposed COND_binlog_send to signal it is empty.
4) Test rpl_semi_sync_shutdown_await_ack is updated as follows:
4.1) Added test case (adapted from Kristian Nielsen) to ensure
that if a thread awaiting its ACK is killed while SHUTDOWN WAIT FOR
ALL SLAVES is issued, the primary will still wait for the ACK from
the killed thread.
4.2) As connections which bypassed phase 1 of thread killing are no
longer delayed for kill until phase 2, we can no longer query
yes/no tx after receiving an ACK/timeout. The check for these
variables is removed.
4.3) Comment descriptions which mention that the connection is alive
are updated and adjusted to refer to the Ack_thread.
Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Remove work-around that disables bulk insert optimization in replication
The root cause of the original problem is now fixed (MDEV-33475). The bulk
insert optimization will still be disabled in replication, though, as it is
only enabled in special circumstances meant for loading a mysqldump.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>