Checking for kill with thd_kill_level() or check_killed() runs apc
requests, which takes the LOCK_thd_kill mutex. But this is dangerous,
as checking for kill needs to be called while holding many different
mutexes, and can lead to cyclic mutex dependency and deadlock.
Running apc requests is only "best effort", however, so skip them if
LOCK_thd_kill is not available. They will then be run on the next
check for a kill signal.
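A minimal sketch of the try-lock idea, written in Python rather than the
server's C++. The Session class, kill_lock and pending_apc are hypothetical
names used only for illustration; they are not the server's data structures.

```python
import threading


class Session:
    """Hypothetical stand-in for a THD; not the server's real class."""

    def __init__(self):
        self.kill_lock = threading.Lock()  # plays the role of LOCK_thd_kill
        self.killed = False
        self.pending_apc = []              # queued callables (assumed shape)
        self.apc_runs = 0

    def check_killed(self):
        # Try the lock without blocking: a caller that already holds other
        # mutexes can never wait here, so no lock cycle can be formed.
        if self.kill_lock.acquire(blocking=False):
            try:
                while self.pending_apc:
                    self.pending_apc.pop(0)()
                    self.apc_runs += 1
            finally:
                self.kill_lock.release()
        # If the lock was busy, the requests stay queued for the next check.
        return self.killed
```

A request queued while the lock is busy is not lost; it is picked up by the
first later check that finds the lock free.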
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
In some cases "SHOW PROCESSLIST" could show "Reset for next command"
as State, even if the previous query had finished properly.
Fixed by clearing State after end of command and also setting the State
for the "Connect" command.
Other things:
- Changed usage of 'thd->set_command(COM_SLEEP)' to
'thd->mark_connection_idle()'.
- Changed thread_state_info() to return "" instead of NULL. This is
just a safety measure and in line with the logic of the
rest of the function.
Binary logging is now disabled for the queries run by SQL SERVICE.
The binlogging can be turned on with the 'SET SQL_LOG_BIN=On' query.
Conflicts:
sql/sql_prepare.cc
XA support for online alter was totally missing.
Being tied to binlog_hton made this hard to notice: simply having binlog_commit
called from xa_commit gave the impression that it would automagically work
for online alter, which turns out to be wrong: all binlog does is write
"XA END" into the trx cache and flush it to the real binlog.
In comparison, online alter can't do the same, since online replication
happens in a single transaction.
Solution: make a dedicated XA support.
* Extend struct xid_t with a pointer to Online_alter_cache_list
* On prepare: move online alter cache from THD::ha_data to the XID passed
* On XA commit/rollback: use the online alter cache stored in this XID.
This makes us pass xid_cache_element->xid to xa_commit/xa_rollback
instead of lex->xid
* Use manual memory management for online alter cache list, instead of
mem_root allocation, since we don't have mem_root connected to the XA
transaction.
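The ownership hand-off described in the list above can be sketched as
follows. This is an illustrative Python model, not the server code: Xid,
Session and the three functions are hypothetical, and the cache is a plain
list standing in for Online_alter_cache_list.

```python
class Xid:
    """Hypothetical XID record; only the field relevant here is modelled."""

    def __init__(self, name):
        self.name = name
        self.online_alter_cache = None  # filled in at XA PREPARE


class Session:
    """Hypothetical session owning the cache before prepare."""

    def __init__(self):
        self.online_alter_cache = []


def xa_prepare(session, xid):
    # Move (not copy) the cached changes to the XID, so that commit or
    # rollback no longer depends on the preparing session's state.
    xid.online_alter_cache = session.online_alter_cache
    session.online_alter_cache = []


def xa_commit(xid, alter_log):
    # Apply what was stored in the XID, then release it explicitly
    # (manual lifetime management, as there is no arena to free it).
    alter_log.extend(xid.online_alter_cache or [])
    xid.online_alter_cache = None


def xa_rollback(xid):
    # Discard the stored changes without applying them.
    xid.online_alter_cache = None
```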
MDEV-32441 SENT_ROWS shows random wrong values when stored function
is selected.
MDEV-32281 EXAMINED_ROWS is not populated in
information_schema.processlist upon SELECT.
Added SENT_ROWS to information_schema.processlist.
This is to have the same information as Percona server (ROWS_SENT).
To ensure that information_schema.processlist has correct values for
sent_rows and examined_rows I introduced two new variables to hold the
total counts so far. This was needed as stored functions and stored
procedures will reset the normal counters to be able to count rows for
each statement individually for slow query log.
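A rough Python model of the two sets of counters, assuming hypothetical
field and method names (the server's actual members are not shown in this
message):

```python
class RowCounters:
    """Hypothetical per-connection counters; not the server's fields."""

    def __init__(self):
        self.sent_rows = 0            # per-statement, for the slow query log
        self.examined_rows = 0
        self.total_sent_rows = 0      # running totals, for the processlist
        self.total_examined_rows = 0

    def row_examined(self):
        self.examined_rows += 1
        self.total_examined_rows += 1

    def row_sent(self):
        self.sent_rows += 1
        self.total_sent_rows += 1

    def enter_sub_statement(self):
        # A stored function or procedure statement starts counting from
        # zero; the totals are deliberately left alone.
        self.sent_rows = 0
        self.examined_rows = 0
```

With this split, a reset inside a stored function cannot make the
processlist values go backwards.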
Other things:
- Selects with functions show in processlist the total examined_rows
and sent_rows by the main statement and all functions.
- Stored procedures show in processlist examined_rows and sent_rows
per stored procedure statement.
- Fixed some double accounting for sent_rows and examined_rows.
- HANDLER operations now also support sent_rows and examined_rows.
- Display sizes for MEMORY_USED, MAX_MEMORY_USED, EXAMINED_ROWS and
QUERY_ID in information_schema.processlist changed to 10 characters.
- EXAMINED_ROWS and SENT_ROWS changed to bigint.
- INSERT RETURNING and DELETE RETURNING now update SENT_ROWS.
- As thd is always up to date with examined_rows, we do not need
to handle examined row counting for unions or filesort.
- I renamed SORT_INFO::examined_rows to m_examined_rows to ensure that
we don't get bugs in merges that try to use examined_rows.
- Removed calls of type "thd->set_examined_row_count(0)" as they are
not needed anymore.
- Removed JOIN::join_examined_rows
- Removed unused function:
THD::set_examined_row_count()
- Made inline some functions that were called for each row.
Merge sys_var_charptr with sys_var_charptr_base, as well as merge
Sys_var_session_lexstring into Sys_var_lexstring. Also refactored
update methods of sys_var_charptr accordingly.
Because the class is more generic, session_update() calls
sys_var_charptr::session_update(), which does not assume a buffer field
associated with THD but instead calls strdup/free. We therefore get rid
of THD::default_master_connection_buff. This also makes THD
smaller by ~192 bytes, and there can be many thousands of concurrent
THDs.
don't forget to reset mdl_context.m_deadlock_overweight when
taking the THD out of the cache - the history of previous connections
should not affect the weight in deadlock victim selection
(small cleanup of the test to help the correct merge)
The MDEV-29693 conflict resolution is from Monty, as is
a bug fix where ANALYZE TABLE wrongly built histograms for
single-column PRIMARY KEY.
Also includes a fix for safe_malloc error reporting.
Other things:
- Copied main.log_slow from 10.4 to avoid mtr issue
Disabled tests:
- spider/bugfix.mdev_27239 because we started to get
+Error 1429 Unable to connect to foreign data source: localhost
-Error 1158 Got an error reading communication packets
- main.delayed
- Bug#54332 Deadlock with two connections doing LOCK TABLE+INSERT DELAYED
This part is disabled for now as it fails randomly with different
warnings/errors (no corruption).
Raise notes if indexes cannot be used:
- in case of data type or collation mismatch (different error messages).
- in case a table field was replaced with something else
(e.g. Item_func_conv_charset) during a condition rewrite.
Added option to write warnings and notes to the slow query log for
slow queries.
New variables added/changed:
- note_verbosity, which is a set of the following options:
basic - All old notes
unusable_keys - Print warnings about keys that cannot be used
for select, delete or update.
explain - Print unusable_keys warnings for EXPLAIN queries.
The default is 'basic,explain'. This means that for old installations
the only notable new behavior is that one will get notes about
unusable keys when one does an EXPLAIN for a query. One can turn off
all notes by either setting note_verbosity to "" or setting sql_notes=0.
- log_slow_verbosity has a new option 'warnings'. If this is set
then warnings and notes generated are printed in the slow query log
(up to log_slow_max_warnings times per statement).
- log_slow_max_warnings - Max number of warnings written to
slow query log.
Other things:
- One can now use =ALL for any 'set' variable to set all options at once.
For example using "note_verbosity=ALL" in a config file or
"SET @@note_verbosity=ALL" in SQL.
- mysqldump will in the future use @@note_verbosity="" instead of
@@sql_notes=0 to disable notes.
- Added "enum class Data_type_compatibility" and changed the return type
of all Field::can_optimize*() methods from "bool" to this new data type.
Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
- The code that prints out the notes comes mainly from Alexander
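The handling of =ALL and of the empty value for a 'set' variable can be
pictured with a small Python sketch. parse_set_value() is a hypothetical
helper, not the server's parser; only the option names and the default are
taken from the text above.

```python
NOTE_VERBOSITY_OPTIONS = ("basic", "unusable_keys", "explain")


def parse_set_value(text, options):
    """Turn a comma-separated value into a set of option names."""
    text = text.strip()
    if text.upper() == "ALL":
        return set(options)   # =ALL switches every option on
    if text == "":
        return set()          # "" switches every option off
    chosen = {part.strip().lower() for part in text.split(",")}
    unknown = chosen - set(options)
    if unknown:
        raise ValueError("unknown option(s): " + ", ".join(sorted(unknown)))
    return chosen


# The documented default, 'basic,explain':
default = parse_set_value("basic,explain", NOTE_VERBOSITY_OPTIONS)
```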
At the moment we cannot support
wsrep_forced_binlog_format=[MIXED|STATEMENT]
during CREATE TABLE AS SELECT.
Statement will use ROW instead and give
a warning.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Follow-up to fix an issue with access to a possibly uninitialized mutex/cond_var.
The constructor of the class st_debug_sync_globals was changed to initialize
the data members dsp_hits, dsp_executed, dsp_max_active with zero.
Formerly, these data members were zero-filled by the C runtime, since
the variable debug_sync_global was declared as static and, according to C
rules, static variables are initialized with zero bytes.
For the same reason, the data members
debug_sync_global->ds_mutex
debug_sync_global->ds_cond
were initialized with zeros before the patch for MDEV-31871. After that patch,
the synch primitives debug_sync_global->ds_mutex
and debug_sync_global->ds_cond are initialized explicitly by calling
the functions mysql_mutex_init/mysql_cond_init, so these synch
primitives should be accessed only after that initialization has completed.
Guarded access to these synch primitives has been added to the function
debug_sync_end_thread(), which is called on clean up, since that was the only
problem place detected by MSAN. Theoretical problem places in the
function debug_sync_execute() were not protected with a similar check, since
it is not obvious that the variables debug_sync_global->ds_mutex
and debug_sync_global->ds_cond could be uninitialized in the use cases where
debug_sync_execute() is called. Additional study is required
to conclude whether such a check is needed there.
Problem:
Under terms of MDEV-27490, we'll update Unicode version used
to compare identifiers to 14.0.0. Unlike in the old Unicode version,
in the new version a string can grow during lower-case. We cannot
perform check_db_name() inplace any more.
Change summary:
- Allocate memory to store lower-cased identifiers in memory root
- Removing check_db_name(), which performed both in-place lower-casing and
validation at the same time, and splitting it into two separate stages:
* creating a memory-root lower-cased copy of an identifier
(using new MEM_ROOT functions and Query_arena wrapper methods)
* performing validation on a constant string
(using Lex_ident_fs methods)
Implementation details:
- Adding a mysys helper function to allocate lower-cased strings on MEM_ROOT:
lex_string_casedn_root()
and Query_arena wrappers for it:
make_ident_casedn()
make_ident_opt_casedn()
- Adding a Query_arena method to perform both MEM_ROOT lower-casing and
database name validation at the same time:
to_ident_db_internal_with_error()
This method is very close to the old (pre-11.3) check_db_name(),
but performs lower-casing to newly allocated MEM_ROOT
memory (instead of lower-casing the original string in place).
- Adding a Table_ident method which additionally handles derived table names:
to_ident_db_internal_with_error()
- Removing the old check_db_name()
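To illustrate why in-place lower-casing stops being safe, here is a small
Python sketch of the two-stage approach. casedn_copy(), is_valid_db_name()
and normalize_db_name() are hypothetical helpers that only loosely mirror
the functions named above, and the validation rules are illustrative, not
the server's. The example character U+023A (2 bytes in UTF-8) lower-cases
to U+2C65 (3 bytes), so the result does not fit in the original buffer.

```python
def casedn_copy(name):
    """Stage 1: return a lower-cased copy; the input is never modified."""
    return name.lower()


def is_valid_db_name(name, max_chars=64):
    """Stage 2: validate a constant string (illustrative rules only)."""
    return 0 < len(name) <= max_chars and not name.endswith(" ")


def normalize_db_name(name):
    """Lower-case into new storage, then validate the copy."""
    lowered = casedn_copy(name)
    if not is_valid_db_name(lowered):
        raise ValueError("incorrect database name")
    return lowered


# The lower-cased form can need more bytes than the original.
source = "\u023aDB"
lowered = casedn_copy(source)
grew = len(lowered.encode("utf-8")) > len(source.encode("utf-8"))
```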
The problem was that parallel replication of temporary tables using
statement-based binlogging could overlap the COMMIT in one thread with a DML
or DROP TEMPORARY TABLE in another thread using the same temporary table.
Temporary tables are not safe for concurrent access, so this caused
reference to freed memory and possibly other nastiness.
The fix is to disable the optimisation with overlapping commits of one
transaction with the start of a later transaction, when temporary tables are
in use. Then the following event groups will be blocked from starting until
the one using temporary tables is completed.
This also fixes occasional test failures of rpl.rpl_parallel_temptable seen
in Buildbot.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
- Adding a new class Lex_ident_db, to store normalized database names:
lower-cased if lower_case_table_names says so,
and checked to be a valid database name using Lex_ident_fs::check_db_name()
- Reusing the new class in parameters to functions:
prepare_db_action()
mysql_create_db()
mysql_alter_db()
mysql_rm_db()
mysql_upgrade_db()
This change removed two old-style check_db_name() calls.
Remove the exception that InnoDB does not report auto-increment lock waits
to parallel replication.
There was an assumption that these waits could not cause conflicts with
in-order parallel replication and thus need not be reported. However, this
assumption is wrong and it is possible to get conflicts that lead to hangs
for the duration of --innodb-lock-wait-timeout. This can be seen with three
transactions:
1. T1 is waiting for T3 on an autoinc lock
2. T2 is waiting for T1 to commit
3. T3 is waiting on a normal row lock held by T2
Here, T3 needs to be deadlock killed on the wait by T1.
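The three-transaction scenario can be checked with a toy wait-for graph in
Python. find_cycle() is a hypothetical helper for illustration, not the
server's deadlock detector; it assumes each transaction waits for at most
one other.

```python
def find_cycle(waits_for, start):
    """Follow wait-for edges from start; return the cycle found, or None."""
    path = []
    node = start
    while node is not None and node not in path:
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None                   # the chain ended: no deadlock
    return path[path.index(node):]    # the nodes that form the cycle


# With the autoinc wait (T1 -> T3) not reported, no cycle is visible.
unreported = {"T2": "T1", "T3": "T2"}

# Once the autoinc wait is reported, the cycle appears.
reported = {"T1": "T3", "T2": "T1", "T3": "T2"}
```

Without the T1 -> T3 edge the chain simply ends at T1, which is why the
hang previously lasted until the lock wait timeout.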
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Restore code to make InnoDB choose the second transaction as a deadlock
victim if two transactions deadlock that need to commit in-order for
parallel replication. This code was erroneously removed when VATS was
implemented in InnoDB.
Also add a test case for InnoDB choosing the right deadlock victim.
Also fixes this bug, with testcase that reliably reproduces:
MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Restore code to make InnoDB choose the second transaction as a deadlock
victim if two transactions deadlock that need to commit in-order for
parallel replication. This code was erroneously removed when VATS was
implemented in InnoDB.
Also add a test case for InnoDB choosing the right deadlock victim.
Also fixes this bug, with testcase that reliably reproduces:
MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
Note: This should be null-merged to 10.6, as a different fix is needed
there due to InnoDB locking code changes.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Remove the exception that InnoDB does not report auto-increment lock waits
to parallel replication.
There was an assumption that these waits could not cause conflicts with
in-order parallel replication and thus need not be reported. However, this
assumption is wrong and it is possible to get conflicts that lead to hangs
for the duration of --innodb-lock-wait-timeout. This can be seen with three
transactions:
1. T1 is waiting for T3 on an autoinc lock
2. T2 is waiting for T1 to commit
3. T3 is waiting on a normal row lock held by T2
Here, T3 needs to be deadlock killed on the wait by T1.
Note: This should be null-merged to 10.6, as a different fix is needed
there due to InnoDB lock code changes.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
* Log rows in online_alter_binlog.
* Table online data is replicated within dedicated binlog file
* Cached data is written on commit.
* Versioning is fully supported.
* Works both with and without binlog enabled.
* For now savepoints setup is forbidden while ONLINE ALTER goes on.
Extra support is required. We can simply log the SAVEPOINT query events
and replicate them together with row events. But it's not implemented
for now.
* Cache flipping:
We want to care for the possible bottleneck in the online alter binlog
reading/writing in advance.
IO_CACHE does not provide anything better than sequential access;
besides, only a single write is mutex-protected, which is not suitable,
since we should write a transaction atomically.
To solve this, a special layer on top of Event_log is implemented.
There are two IO_CACHE files underneath: one for reading, and one for
writing.
Once the read cache is empty, an exclusive lock is acquired (we can wait
for a currently active transaction to finish writing), and flip() is called,
i.e. the write cache is reopened for read, and the read cache is emptied
and reopened for writing.
This resembles a buffer flip that happens in accelerated graphics
(DirectX/OpenGL/etc).
Cache_flip_event_log is considered non-blocking for a single reader and a
single writer in this sense, with the only lock held by reader during flip.
An alternative approach by implementing a fair concurrent circular buffer
is described in MDEV-24676.
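The flip can be modelled in a few lines of Python. FlipLog is a
hypothetical in-memory stand-in for Cache_flip_event_log, with lists
instead of IO_CACHE files; it is a sketch of the idea, not the
implementation.

```python
import threading


class FlipLog:
    """Two buffers: writers append to one, the single reader drains the other."""

    def __init__(self):
        self._lock = threading.Lock()
        self._write_buf = []
        self._read_buf = []

    def write_transaction(self, events):
        # The whole transaction is appended under the lock, so it can
        # never be interleaved with another one or split by a flip.
        with self._lock:
            self._write_buf.extend(events)

    def read_event(self):
        # The reader takes the lock only when it has nothing left to read.
        if not self._read_buf:
            self._flip()
        return self._read_buf.pop(0) if self._read_buf else None

    def _flip(self):
        # Swap roles: written data becomes readable, and the drained
        # read buffer is replaced by an empty one for writing.
        with self._lock:
            self._read_buf, self._write_buf = self._write_buf, []
```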
* Cache managers:
We have two cache sinks: statement and transactional.
It is important that the changes are first cached per-statement and
per-transaction.
If a statement fails, then only statement data is rolled back. The
transaction moves along, however.
It turns out there's no guarantee that TABLE will persist in
thd->open_tables until the transaction commit moment.
If an error occurs, tables from the statement are purged.
Therefore, we can't store the caches in TABLE. Ideally, it should be
handlerton, but we cut a corner and store them in THD in a list.
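A simplified Python model of the two cache sinks, assuming a hypothetical
OnlineAlterCaches class held per session (the real caches are kept in THD,
as noted above):

```python
class OnlineAlterCaches:
    """Hypothetical pair of caches for one table changed by a session."""

    def __init__(self):
        self.stmt = []  # events of the statement currently running
        self.trx = []   # events of statements that already succeeded

    def log_row(self, event):
        self.stmt.append(event)

    def end_statement(self, ok):
        # A failed statement loses only its own events; the transaction
        # keeps everything gathered by earlier statements.
        if ok:
            self.trx.extend(self.stmt)
        self.stmt.clear()

    def commit(self, alter_log):
        # Cached data reaches the online alter log only on commit.
        alter_log.extend(self.trx)
        self.trx.clear()

    def rollback(self):
        self.stmt.clear()
        self.trx.clear()
```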
Event_log is supposed to be a basic logging class that can write events in
a single file.
MYSQL_BIN_LOG in comparison will have:
* rotation support
* index files
* purging
* GTID and transactional information handling
and is dedicated to serving as a general-purpose binlog.