Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
1
answers
1104
views
Why is the character set for the uca1400_ai_ci collation NULL?
I've been working with database collations of late. I generally don't like ambiguous queries such as `SHOW COLLATION` so I eventually found the source via the following query: SELECT * FROM information_schema.COLLATIONS ORDER BY CHARACTER_SET_NAME ASC, COLLATION_NAME ASC; I now have the option to up...
I've been working with database collations of late. I generally don't like ambiguous queries such as
SHOW COLLATION
so I eventually found the source via the following query:
SELECT *
FROM information_schema.COLLATIONS
ORDER BY CHARACTER_SET_NAME ASC, COLLATION_NAME ASC;
I now have the option to update the collation to uca1400_ai_ci
however the associated character set value is NULL
.
Why is the uca1400_ai_ci
NULL
and what, if any considerations should I have while updating from the character set utf8mb4
and collation utf8mb4_unicode_520_ci
?
John
(769 rep)
Jul 19, 2024, 01:43 AM
• Last activity: Aug 4, 2025, 06:01 PM
-3
votes
1
answers
591
views
System dbs have different collation
I have installed an instance of SQL Server 2016 SP2 called DNF containing 2 DBs and also the system DBs. After providing the instance to the dev team, they realised that they need to change the collation. They changed the collation for the 2 DBs by themselves and asked me to make the change for the...
I have installed an instance of SQL Server 2016 SP2 called DNF containing 2 DBs and also the system DBs.
After providing the instance to the dev team, they realised that they need to change the collation. They changed the collation for the 2 DBs by themselves and asked me to make the change for the system DB. I took a backup of all databases and rebuilt the system DB.
It works fine and all the dbs changed to new collation.
I made a mistake by restoring the msdb, model and finally I tried to restore the master. I succeeded to restore the model and msdb and faced a lot of issues when coming to themaster.
I ran the query select name, collation_name from sys.databases and found that the model and msdb have the old collation, and this is logic and was a mistake from me.
I tried to rebuild again, but collation refused to change on model and msdb and found the sql server and sql server for the instance stopped and won't start anymore.
The situation now is : master and the 2 non system DBs have the new collation and model, msdb have the old one, and the services won't start.
What is the way to solve this issue? Does an instance repair will solve this issue? Or there is something else that I can do?
.
Toni
(1 rep)
Oct 21, 2023, 09:35 AM
• Last activity: Jul 11, 2025, 11:36 AM
0
votes
1
answers
166
views
Match similar special character in filters
I have to perform a search on a table with Turkish city names. The users would enter the first 4 characters of a city and I need to fetch the results from SQL. For example, user wants to find the entries for `Istanbul`, so he would enter `ista` or `Ista` or `ISTA` Problem is that `Istanbul` is writt...
I have to perform a search on a table with Turkish city names.
The users would enter the first 4 characters of a city and I need to fetch the results from SQL.
For example, user wants to find the entries for
Istanbul
, so he would enter ista
or Ista
or ISTA
Problem is that Istanbul
is written with special character. It is Ìstanbùl
. So my query
where UPPER(city) like 'UPPER()%'
will not find it. It only finds it when the user enters Ìsta
(with the special character).
Is there a way of converting similar special character to 1 character. So for example ìíî
will be converted to i
. This way a user can find Ìnstanbùl
with whatever variant of i
they enter.
roel
(113 rep)
Apr 5, 2017, 08:32 AM
• Last activity: Jul 7, 2025, 11:05 AM
3
votes
2
answers
219
views
Restore PostgreSQL database from linux to MacOS with utf8 collated columns
I want to copy from a postgres db running on linux with column definition like: col1 varchar COLLATE "en_US.utf8" NULL; This collate does not exist on macos, or at least, it's called en_US.UTF-8. Because of this, a restore from a backup made with pg_dump will fail on these tables. I tried to remove...
I want to copy from a postgres db running on linux with column definition like:
col1 varchar COLLATE "en_US.utf8" NULL;
This collate does not exist on macos, or at least, it's called en_US.UTF-8. Because of this, a restore from a backup made with pg_dump will fail on these tables.
I tried to remove the collation specification from the ddl in the backups: there seems to be two files: restore.sql and toc.dat, both of them containing the ddls for some reason. Removing from the first achieves nothing, and removing it from the second leaves the file corrupted even if I go in with a hex editor and remove the exact bytes.
(I also tried to create this collation on macos, at which i failed miserably.)
I wonder if somebody encountered this scenario, and what would be the solution here.
sifear
(53 rep)
Jun 17, 2025, 05:17 AM
• Last activity: Jun 18, 2025, 10:37 PM
2
votes
1
answers
1892
views
What is the impact of converting latin1/latin1_swedish_ci to utf8mb4/utf8mb4_unicode_ci?
I was facing some issues with the character's encoding. Those are resolved by updating the CHARACTER and COLLATE for some columns in the table. So my concern is - Is this conversion safe? - Or can this create issues for different encoding techniques? - Also how to convert the CHARACTER and COLLATE f...
I was facing some issues with the character's encoding. Those are resolved by updating the CHARACTER and COLLATE for some columns in the table. So my concern is
- Is this conversion safe?
- Or can this create issues for different encoding techniques?
- Also how to convert the CHARACTER and COLLATE for multiple columns of the table in the same MySQL query?
Mehar
(121 rep)
Jul 26, 2021, 04:21 PM
• Last activity: Jun 4, 2025, 07:04 PM
11
votes
2
answers
11153
views
ORDER BY and comparison of mixed strings of letters and numbers
We need to do some reporting on values that are usually mixed strings of numbers and letters that need to be sorted 'naturally'. Things like, e.g. "P7B18" or "P12B3". @The strings will mostly be sequences of letters then numbers alternating. The number of these segments and the length of each could...
We need to do some reporting on values that are usually mixed strings of numbers and letters that need to be sorted 'naturally'. Things like, e.g. "P7B18" or "P12B3". @The strings will mostly be sequences of letters then numbers alternating. The number of these segments and the length of each could vary, though.
We'd like the numeric portions of these to be sorted in numeric order. Obviously, if I just handle those string values directly with
ORDER BY
, then "P12B3" is going to come before "P7B18", since "P1" is earlier than "P7", but I'd like the reverse, as "P7" naturally precedes "P12".
I'd also like to be able to do range comparisons, e.g. @bin < 'P13S6'
or some such. I don't have to handle floating point or negative numbers; these will strictly be non-negative integers that we're dealing with. String lengths and number of segments could potentially be arbitrary, with no fixed upper bounds.
In our case, string casing isn't important, though if there's a way to do this in a collation-aware fashion, others might find that useful. The ugliest part of all this is I'd like to be able to do both ordering, and range filtering in the WHERE
clause.
If I were doing this in C#, it would be a pretty simple task: do some parsing to separate the alpha from the numeric, implement IComparable, and you're basically done. SQL Server, of course, doesn't appear to offer any similar functionality, at least as far as I'm aware.
Anybody know any good tricks to make this work? Is there some little-publicized ability to create custom CLR types that implement IComparable and have this behave as expected? I'm also not opposed to Stupid XML Tricks (see also: list concatenation), and I've got CLR regex matching/extracting/replacement wrapper functions available on the server as well.
**EDIT:**
As a slightly more detailed example, I'd want the data to behave something like this.
SELECT bin FROM bins ORDER BY bin
bin
--------------------
M7R16L
P8RF6JJ
P16B5
PR7S19
PR7S19L
S2F3
S12F0
i.e. break the strings into tokens of all letters or all numbers, and sort them either alphabetically or numerically respectively, with the leftmost tokens being the most significant sorting term. Like I mentioned, piece of cake in .NET if you implement IComparable, but I don't know how (or if) you can do that sort of thing in SQL Server. It's certainly not something I've ever come across in 10 or so years of working with it.
db2
(9708 rep)
Jan 20, 2016, 02:28 PM
• Last activity: May 17, 2025, 04:01 AM
1
votes
1
answers
2094
views
MySQL- Performance implications of changing a column collation from latin1_swedish_ci to latin1_bin
I am a developer so was hoping someone with DBA experience could help me out here. I am using a MySQL database (MYSQL 5.6). I came across a situation where a particular column stores Ids generated by any external system (to be provided by user input) and hence requires case sensitivity (`abcd` is a...
I am a developer so was hoping someone with DBA experience could help me out here.
I am using a MySQL database (MYSQL 5.6).
I came across a situation where a particular column stores Ids generated by any external system (to be provided by user input) and hence requires case sensitivity (
abcd
is a different id from ABCD
). It seemed pretty natural to me to change the collation for that column to latin1_bin
from latin1_swedish_ci
where my database and tables all use latin1_swedish_ci
collation.
I use the following statement to change the collation:
ALTER TABLE T MODIFY C VARCHAR(200) BINARY NOT NULL;
Are there any performance implications of doing this change? My application does not use COLLATE specifically in any query.
Also, column C is part of an index and a unique constraint with 2 other columns say A, B in table T (A,B,C). A and B are also foreign keys in table T.
Do I need to drop the index and unique constraints and re-create them?
Also, how do I rollback this change? Will this statement do the job:
ALTER TABLE T MODIFY C VARCHAR(200) NOT NULL;
The table T looks like :
|id|A|B|C|created_at|
id
is auto increment.
Please help.
Kushan Sen
(41 rep)
Dec 13, 2019, 06:37 AM
• Last activity: Apr 27, 2025, 04:06 AM
2
votes
2
answers
103
views
Can I change the collation of a table and its columns without using convert to?
All of the tables and columns in my database are using the utf8mb4 character set and utf8mb4_unicode_ci collation. I'd like to migrate to utf8mb4_0900_ai_ci collation but using `convert to` is prohibitive on my large tables unless I use gh-ost or pt-osc. It seems to me that `convert to` should not b...
All of the tables and columns in my database are using the utf8mb4 character set and utf8mb4_unicode_ci collation. I'd like to migrate to utf8mb4_0900_ai_ci collation but using
convert to
is prohibitive on my large tables unless I use gh-ost or pt-osc.
It seems to me that convert to
should not be necessary since the character set is not changing, but I cannot find any documentation which confirms or denies this. Do I need to use convert to
in this case, or is alter table ... modify col ... collate ...
sufficient?
I've looked https://dba.stackexchange.com/questions/229954/using-convert-to-character-set-for-changing-only-collation which asks a very similar question, but the age of that question renders the answers somewhat obsolete. Even so, one answer seems to imply that convert to
is unnecessary. The other answer delves into character set conversion and using pt-osc which I'd hoping to avoid.
Joel
(133 rep)
Apr 4, 2025, 09:01 PM
• Last activity: Apr 8, 2025, 05:39 PM
1
votes
1
answers
60
views
CONVERT TO CHARACTER SET isnt working with modify column
I am trying to convert the table charset and collation along side preserving some of the columns explicit charset and collation. I need to do it in single alter statement. If i use following, it is changing the charset of the column which i want to retain. Any idea if something is missing? query 1-...
I am trying to convert the table charset and collation along side preserving some of the columns explicit charset and collation. I need to do it in single alter statement. If i use following, it is changing the charset of the column which i want to retain. Any idea if something is missing?
query 1-
ALTER TABLE
summary
CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci , MODIFY COLUMN summary_id
varchar(11) CHARACTER SET ascii COLLATE ascii_bin NOT NULL;
Show create table o/p:
Create Table: CREATE TABLE "summary" (
“summary_id" varchar(11) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
DTdev
(111 rep)
Mar 18, 2025, 09:14 AM
• Last activity: Mar 18, 2025, 01:53 PM
9
votes
2
answers
16242
views
Best General Purpose Character Set and Collation for MySQL
Currently, whenever I create a new MySQL database, I use utf8mb4 as a character set and utf8mb4_unicode_520_ci for the collation, e.g.: CREATE DATABASE IF NOT EXISTS db_name DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci; Is there a newer/upgraded general purpose collation or character...
Currently, whenever I create a new MySQL database, I use utf8mb4 as a character set and utf8mb4_unicode_520_ci for the collation, e.g.:
CREATE DATABASE IF NOT EXISTS db_name
DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;
Is there a newer/upgraded general purpose collation or character set for MySQL?
For example if there's a collation that superseded utf8mb4_unicode_520_ci, like utf8mb4_unicode_800_ci or something like that?
Thanks for your help.
GTS Joe
(273 rep)
Jun 2, 2021, 11:48 PM
• Last activity: Mar 13, 2025, 11:03 AM
2
votes
2
answers
2038
views
Identify the version of a collation from ICU in Postgres
Postgres 10 and later incorporates the [International Components for Unicode (ICU)][1] library for text-handling and other internationalization issues. Changes happen to human languages, such as sorting rules evolving. Therefore collation definitions are versioned in ICU. ➥ How can I identify the ve...
Postgres 10 and later incorporates the International Components for Unicode (ICU) library for text-handling and other internationalization issues.
Changes happen to human languages, such as sorting rules evolving. Therefore collation definitions are versioned in ICU.
➥ How can I identify the version number of a collation?
I know Postgres will report version number as part of a collation version mismatch error message. Is there a way to proactively query for the version rather than wait for an error message?
Basil Bourque
(11188 rep)
Aug 4, 2019, 03:16 AM
• Last activity: Mar 11, 2025, 06:36 AM
21
votes
4
answers
114808
views
How to do a case-insensitive LIKE in a case-sensitive database?
My vendor requires the data warehouse database to be case sensitive, but I need to do case-insensitive queries against it. In a case-sensitive database, how would you write this to be case-insensitive? Where Name like '%hospitalist%'
My vendor requires the data warehouse database to be case sensitive, but I need to do case-insensitive queries against it.
In a case-sensitive database, how would you write this to be case-insensitive?
Where Name like '%hospitalist%'
James
(2668 rep)
Jun 13, 2018, 01:49 PM
• Last activity: Jan 30, 2025, 04:25 PM
5
votes
3
answers
13953
views
Case Insensitive ORDER BY clause using COLLATE
I have spent a long time looking for this, and I am getting mixed messages. In other DBMSs (tested in SQLite, Oracle, MariaDB, MSSQL) I can override the default sort order using the `COLLATE` clause: ```sql SELECT * FROM orderby ORDER BY string COLLATE … ; -- SQLite: BINARY | NOCASE -- MariaDB: utf8...
I have spent a long time looking for this, and I am getting mixed messages.
In other DBMSs (tested in SQLite, Oracle, MariaDB, MSSQL) I can override the default sort order using the
COLLATE
clause:
SELECT *
FROM orderby
ORDER BY string COLLATE … ;
-- SQLite: BINARY | NOCASE
-- MariaDB: utf8mb4_bin | utf8mb4_general_ci
-- Oracle: BINARY | BINARY_CI
-- MSSQL: Latin1_General_BIN | Latin1_General_CI_AS
I have pored over the documentation and searched high and low, but I can’t find anything so straightforward for PostgreSQL.
Is there a COLLATE
clause value that would sort Case Insensitive?
I know there are many questions regarding case sensitivity, but (a) most of them are old and (b) none that I have seen relate to the COLLATE
clause.
FWIW, I am testing on PostgreSQL 11.8. I have a test fiddle on http://sqlfiddle.com/#!17/05cab/1 , but it’s only for PostgreSQL 9.6.
MySQL/MariaDB and SQL Server default to case insensitive, and that would certainly make sense when sorting most text. Oracle and SQLite default to case sensitive, but have a relatively simple solution for a case insensitive sort. The default collation for my database is en_US.UTF-8
. I’m trying to fill in a few gaps here.
Manngo
(3145 rep)
Nov 22, 2020, 04:44 AM
• Last activity: Jan 24, 2025, 12:49 PM
0
votes
1
answers
652
views
Incorrect string value for MariaDB + utf8mb4 + utf8mb4_uca1400_ai_ci
> Incorrect string value: > \\\'\\\\xCD\\\\xCF\\\\xD7\\\\xA7\\\\xC9\\\\xFA...\\\ Everything on my server should be `utf8mb4` and `utf8mb4_uca1400_ai_ci` so let's go through everything. My `my.ini` file on Windows 10 for MariaDB 11.4.2: [mysql] default-character-set = utf8mb4 [mysqld] datadir=C:/WAMP...
> Incorrect string value:
> \\\'\\\\xCD\\\\xCF\\\\xD7\\\\xA7\\\\xC9\\\\xFA...\\\
Everything on my server should be
Every time I do a server migration I run the following:
SET CHARACTER SET 'utf8';
SET collation_connection = 'utf8mb4_uca1400_ai_ci';
SET GLOBAL collation_connection = 'utf8mb4_uca1400_ai_ci';
SET GLOBAL collation_database = 'utf8mb4_uca1400_ai_ci';
SET GLOBAL collation_server = 'utf8mb4_uca1400_ai_ci';
The
utf8mb4
and utf8mb4_uca1400_ai_ci
so let's go through everything.
My my.ini
file on Windows 10 for MariaDB 11.4.2:
[mysql]
default-character-set = utf8mb4
[mysqld]
datadir=C:/WAMP/MariaDB/data
port=3306
innodb_buffer_pool_size=50M
character-set-server=utf8mb4
character-set-client-handshake = FALSE
collation-server = utf8mb4_uca1400_ai_ci
general-log
general-log-file=C:/WAMP/MariaDB/logs/queries.log
[client]
default-character-set = utf8mb4
port=3306
plugin-dir=C:\WAMP\MariaDB/lib/plugin
The local query I use to verify character sets and collations:
SELECT LOWER(VARIABLE_NAME) AS variable_name
,
GLOBAL_VALUE AS global_value
,
DEFAULT_VALUE AS default_value
FROM INFORMATION_SCHEMA.SYSTEM_VARIABLES
WHERE VARIABLE_NAME LIKE '%CHARACTER_SET_%'
OR VARIABLE_NAME LIKE '%COLLATION_%'
ORDER BY VARIABLE_NAME
ASC;

SHOW CREATE DATABASE
returns in utf8mb4
and utf8mb4_uca1400_ai_ci
.
The column associated with the error uses utf8mb4_uca1400_ai_ci
.
This is for an email message with the subject "[标题]Satisfy all your fantasies about Xianxia's hand tour." 🙄︀ I'm guessing something got lost in translation. I don't have everything maxed out on character sets and collations though I think I've covered most things. So...
What have I not yet correctly set for character sets and/or collation and how do I go about ensuring that everything possible is set to utf8mb4
and utf8mb4_uca1400_ai_ci
?
John
(769 rep)
Jan 5, 2025, 09:17 AM
• Last activity: Jan 7, 2025, 03:09 PM
0
votes
1
answers
124
views
A collation that ignores cedillas for comparison?
*Cedillas* are mutations applied to letters in some languages, like portuguese, french, turkish and romanian. Some words: "coração", "açúcar", "linhaça" (letters 'Ç', 'Ş', ...). How to choose a collation that ignores *cedillas* for comparison? They are not accents. [Thi...
*Cedillas* are mutations applied to letters in some languages, like portuguese, french, turkish and romanian. Some words: "coração", "açúcar", "linhaça" (letters 'Ç', 'Ş', ...).
How to choose a collation that ignores *cedillas* for comparison? They are not accents.
[This blog post](https://sqlfromhell.wordpress.com/2010/05/01/ignorando-letras-maiusculas-minusculas-acentos-e-c/) (in portuguese) teaches that:
*
SQL_Latin1_General_CP1_CI_AI
: considers *cedillas*
* Latin1_General_CI_AI
: ignores *cedillas*
What is the difference? It's not clear in the name of the collation.
Alexandre
(103 rep)
Nov 23, 2024, 12:00 AM
• Last activity: Dec 20, 2024, 11:49 AM
0
votes
2
answers
227
views
upgrade postgres cluster with changing collation & encoding
we face a problem when trying to `pg_upgrade` pg12 > pg14. our source instance has `en_US.UTF-8` which it should not have had to begin with, and the new pg14 cluster is being initiated with `de_DE.UTF-8`. `pg_upgrade` throws an error on ``` lc_collate values for database "postgres" do not match: old...
we face a problem when trying to
pg_upgrade
pg12 > pg14.
our source instance has en_US.UTF-8
which it should not have had to begin with, and the new pg14 cluster is being initiated with de_DE.UTF-8
.
pg_upgrade
throws an error on
lc_collate values for database "postgres" do not match: old "en_US.UTF-8", new "de_DE.UTF-8"
not being experienced in such at all, we came up with this solution and I'd like to hear some feedback whether this is sane, or not
pg14/initdb -D new_data_dir
pg14/pg_ctl -D new_data_dir start
pg_dumpall_12 | sed -i "/en_US.UTF-8/de_DE.UTF-8/" | psql_14 -p 6543 |& tee some.out
this leaves Ctype still with en_US.UTF-8
so we add the below per database
psql -c "UPDATE pg_database SET datctype = 'de_DE.UTF-8' WHERE datname = ;"
vrms
(269 rep)
Dec 18, 2024, 09:31 AM
• Last activity: Dec 18, 2024, 02:21 PM
1
votes
1
answers
284
views
Any downsides of using UTF8 collation in SQL Server 2019?
We have some columns in the table that could possibly get some non-ANSI characters and we must store them. Before the SQL Server 2019 there was the only 1 option - `NVARCHAR` data type, however starting with SQL Server 2019 it is possible to use `VARCHAR` data type with the collations that ends with...
We have some columns in the table that could possibly get some non-ANSI characters and we must store them. Before the SQL Server 2019 there was the only 1 option -
NVARCHAR
data type, however starting with SQL Server 2019 it is possible to use VARCHAR
data type with the collations that ends with %_UTF8
. I understand that for full compatibility with the size, I'll need to double the VARCHAR
columns size, so if it was VARCHAR(20)
it will needed to be VARCHAR(40)
to cover the case if somebody would insert text where all the 20 characters are non-ANSI, but are there any other downsides of such collation? The data in that column will be ~99.9% with ANSI characters only.
Dmitrij Kultasev
(191 rep)
Dec 9, 2024, 09:54 AM
• Last activity: Dec 9, 2024, 02:16 PM
36
votes
2
answers
37762
views
Why would you index text_pattern_ops on a text column?
Today [Seven Databases in Seven Weeks](http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks) introduced me to per-operator indexes. > You can index strings for pattern matching the previous queries by creating a `text_pattern_ops` operator class index, as long as the values are indexed in...
Today [Seven Databases in Seven Weeks](http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks) introduced me to per-operator indexes.
> You can index strings for pattern matching the previous queries by creating a
text_pattern_ops
operator class index, as long as the values are indexed in lowercase.
CREATE INDEX moves_title_pattern ON movies (
(lower(title) text_pattern_ops);
> We used the text_pattern_ops
because the title is of type text. If you need to index varchars, chars, or names, use the related ops: varchar_pattern_ops
, bpchar_pattern_ops
, and name_pattern_ops
.
I find the example really confusing. Why is it useful to do this?
If the column is type text, wouldn't the other types (varchar, char, name) be cast to to text before being used as a search value?
How does that index behave differently from one using the default operator?
CREATE INDEX moves_title_pattern ON movies (lower(title));
Iain Samuel McLean Elder
(2408 rep)
Nov 22, 2013, 05:51 PM
• Last activity: Dec 5, 2024, 09:36 AM
1
votes
1
answers
306
views
Illegal mix of collations
I am managing a MariaDB 10.6 Server for a bunch of people. I also dump the databases for backup purposes. Recently, one of the databases started to make problems while dumping: > mysqldump: Couldn't execute 'show create table `my_view`': Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (...
I am managing a MariaDB 10.6 Server for a bunch of people. I also dump the databases for backup purposes. Recently, one of the databases started to make problems while dumping:
> mysqldump: Couldn't execute 'show create table
my_view
': Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (utf8mb4_unicode_ci,COERCIBLE) for operation '=' (1267)
Everything in that database was created using the utf8mb4
character set and utf8mb4_unicode_ci
collation. I checked the DB, tables, columns and view.
I then found out, that when I open a regular mariadb shell, the collation_connection
differs:
MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE "collation_connection";
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
+----------------------+--------------------+
1 row in set (0.002 sec)
MariaDB [(none)]> SHOW VARIABLES LIKE "collation_connection";
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
+----------------------+--------------------+
1 row in set (0.003 sec)
**My first question:** Why does the global scope have a different collation_connection
compared to the session scope? I always thought the session scope inherits from the global scope. I was not able to find in the docs, where this gets set.
Here are the relevant parts of my mariadb option file:
[mariadb]
character_set_server=utf8mb4
collation_server=utf8mb4_unicode_ci
[client]
default-character-set=utf8mb4
[mariadb-dump]
default-character-set=utf8mb4
**My second question:** How to prevent such issues? Do I have to tell all of my users to use the same collation to prevent issues like that?
---
Here is an example to reproduce it:
CREATE TABLE employees (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
department VARCHAR(100),
salary DECIMAL(10, 2),
created_at TIMESTAMP
);
INSERT INTO employees (name, department, salary, created_at)
VALUES
('Alice', 'HR', 50000, NOW()),
('Bob', 'Engineering', 70000, NOW()),
('Charlie', 'HR', 45000, NOW()),
('David', 'Engineering', 80000, NOW()),
('Eve', 'Marketing', 60000, NOW());
SET collation_connection = utf8mb4_unicode_ci;
CREATE VIEW hr_employees AS
SELECT id, name, salary
FROM employees
WHERE department = 'HR' AND date_format(created_at, "%Y") IN ('2024');
SET collation_connection = utf8mb4_general_ci;
SHOW CREATE VIEW hr_employees;
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (utf8mb4_unicode_ci,COERCIBLE) for operation '='
Vince
(111 rep)
Nov 22, 2024, 01:23 PM
• Last activity: Nov 28, 2024, 09:19 AM
1
votes
1
answers
98
views
MySQL: using NAME_CONST internally
I'm using MySQL 8.0.32 I have some questions about an internal conversion, that I don't understand. In addition to this, sometimes this conversion is shown in SHOW FULL PROCESSLIST, as its execution was too long. This is my table. CREATE TABLE `provinces` ( `ID` VARCHAR(2) NOT NULL COLLATE 'utf8mb4_...
I'm using MySQL 8.0.32
I have some questions about an internal conversion, that I don't understand. In addition to this, sometimes this conversion is shown in SHOW FULL PROCESSLIST, as its execution was too long.
This is my table.
CREATE TABLE
provinces
(
ID
VARCHAR(2) NOT NULL COLLATE 'utf8mb4_swedish_ci',
PROVINCIA
VARCHAR(45) NOT NULL COLLATE 'utf8mb4_swedish_ci',
PRIMARY KEY (ID
)
) COLLATE='utf8mb4_swedish_ci' ENGINE=InnoDB;
I have 53 rows like these
ID |PROVINCIA
----|----
01 |Álava
02 |Albacete
The MySQL Function is:
CREATE FUNCTION get_provincia
( id_provincia
VARCHAR(2) )
RETURNS varchar(45)
LANGUAGE SQL
NOT DETERMINISTIC
READS SQL DATA
SQL SECURITY DEFINER
COMMENT ''
BEGIN
DECLARE desc_provincia VARCHAR(45) DEFAULT '-';
SELECT PROVINCIA INTO desc_provincia
FROM provinces
WHERE ID = id_provincia;
RETURN desc_provincia;
END
The SHOW STATUS of this function shows:
- character_set_client: utf8mb4
- collation_connection: utf8mb4_0900_ai_ci
- database_collation: latin1_swedish_ci
That function is invoked as a field in a query.
SELECT customer_id, customer_name, ..., get_provincia(SUBSTRING(zip_code,1,2))...
FROM ...
When I'm reviewing the execution with SHOW FULL PROCESSLIST, the query inside the function is shown like this:
SELECT provincia INTO desc_provincia
FROM provincias
WHERE id = NAME_CONST('id_provincia',_latin1'15' COLLATE 'latin1_swedish_ci')
So, I have these questions:
1. Why is the NAME_CONST forced by MySQL?
2. Does it have an impact on performance? I don't understand that very simple query shows "2" value in column "Time" of SHOW FULL PROCESSLIST command, as it takes 2 seconds.
Thanks in advance.
yaki_nuka
(155 rep)
Nov 6, 2024, 03:54 PM
• Last activity: Nov 8, 2024, 03:28 PM
Showing page 1 of 20 total questions