Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes

1 answers

1104 views

Why is the character set for the uca1400_ai_ci collation NULL?

I've been working with database collations of late. I generally don't like ambiguous queries such as `SHOW COLLATION` so I eventually found the source via the following query: SELECT * FROM information_schema.COLLATIONS ORDER BY CHARACTER_SET_NAME ASC, COLLATION_NAME ASC; I now have the option to up...

                                  I've been working with database collations of late. I generally don't like ambiguous queries such as SHOW COLLATION so I eventually found the source via the following query:

    SELECT * 
    FROM information_schema.COLLATIONS 
    ORDER BY CHARACTER_SET_NAME ASC, COLLATION_NAME ASC;

I now have the option to update the collation to uca1400_ai_ci however the associated character set value is NULL.

Why is the uca1400_ai_ci NULL and what, if any considerations should I have while updating from the character set utf8mb4 and collation utf8mb4_unicode_520_ci?
                                

John (769 rep)

Jul 19, 2024, 01:43 AM • Last activity: Aug 4, 2025, 06:01 PM

-3 votes

1 answers

591 views

System dbs have different collation

sql-server-2016 sql-server-agent collation system-databases

I have installed an instance of SQL Server 2016 SP2 called DNF containing 2 DBs and also the system DBs. After providing the instance to the dev team, they realised that they need to change the collation. They changed the collation for the 2 DBs by themselves and asked me to make the change for the...

                                  I have installed an instance of SQL Server 2016 SP2 called DNF containing 2 DBs and also the system DBs. 

After providing the instance to the dev team, they realised that they need to change the collation. They changed the collation for the 2 DBs by themselves and asked me to make the change for the system DB. I took a backup of all databases and rebuilt the system DB.

It works fine and all the dbs changed to new  collation.
I made a mistake by restoring the msdb, model and finally I tried to restore the master. I succeeded to restore the model and msdb and faced a lot of issues when coming to themaster. 

I ran the query select name, collation_name from sys.databases and found that the model and msdb have the old collation, and this is logic and was a mistake from me.
I tried to rebuild again, but collation refused to change on model and msdb and found the sql server and sql server for the instance stopped and won't  start anymore.

The situation now is : master and the 2 non system DBs have the new collation and model, msdb have the old one, and the services won't start.

What is the way to solve this issue? Does an instance repair will solve this issue? Or there is something else that I can do?
.

Toni (1 rep)

Oct 21, 2023, 09:35 AM • Last activity: Jul 11, 2025, 11:36 AM

0 votes

1 answers

166 views

Match similar special character in filters

oracle collation string-searching

I have to perform a search on a table with Turkish city names. The users would enter the first 4 characters of a city and I need to fetch the results from SQL. For example, user wants to find the entries for `Istanbul`, so he would enter `ista` or `Ista` or `ISTA` Problem is that `Istanbul` is writt...

                                  I have to perform a search on a table with Turkish city names.

The users would enter the first 4 characters of a city and I need to fetch the results from SQL.

For example, user wants to find the entries for Istanbul, so he would enter ista or Ista or ISTA

Problem is that Istanbul is written with special character. It is Ìstanbùl. So my query

where UPPER(city) like 'UPPER()%'

will not find it. It only finds it when the user enters Ìsta  (with the special character).

Is there a way of converting similar special character to 1 character. So for example ìíî will be converted to i. This way a user can find Ìnstanbùl with whatever variant of i they enter.

roel (113 rep)

Apr 5, 2017, 08:32 AM • Last activity: Jul 7, 2025, 11:05 AM

3 votes

2 answers

219 views

Restore PostgreSQL database from linux to MacOS with utf8 collated columns

postgresql collation utf-8

I want to copy from a postgres db running on linux with column definition like: col1 varchar COLLATE "en_US.utf8" NULL; This collate does not exist on macos, or at least, it's called en_US.UTF-8. Because of this, a restore from a backup made with pg_dump will fail on these tables. I tried to remove...

                                  I want to copy from a postgres db running on linux with column definition like:

    col1 varchar COLLATE "en_US.utf8" NULL;
This collate does not exist on macos, or at least, it's called en_US.UTF-8. Because of this, a restore from a backup made with pg_dump will fail on these tables.
I tried to remove the collation specification from the ddl in the backups: there seems to be two files: restore.sql and toc.dat, both of them containing the ddls for some reason. Removing from the first achieves nothing, and removing it from the second leaves the file corrupted even if I go in with a hex editor and remove the exact bytes.
(I also tried to create this collation on macos, at which i failed miserably.)

I wonder if somebody encountered this scenario, and what would be the solution here.
                                

sifear (53 rep)

Jun 17, 2025, 05:17 AM • Last activity: Jun 18, 2025, 10:37 PM

2 votes

1 answers

1892 views

What is the impact of converting latin1/latin1_swedish_ci to utf8mb4/utf8mb4_unicode_ci?

mysql collation character-set encoding

I was facing some issues with the character's encoding. Those are resolved by updating the CHARACTER and COLLATE for some columns in the table. So my concern is - Is this conversion safe? - Or can this create issues for different encoding techniques? - Also how to convert the CHARACTER and COLLATE f...

                                  I was facing some issues with the character's encoding. Those are resolved by updating the CHARACTER and COLLATE for some columns in the table. So my concern is 

- Is this conversion safe? 
- Or can this create issues for different encoding techniques?
- Also how to convert the CHARACTER and COLLATE for multiple columns of the table in the same MySQL query?
                                

Mehar (121 rep)

Jul 26, 2021, 04:21 PM • Last activity: Jun 4, 2025, 07:04 PM

11 votes

2 answers

11153 views

ORDER BY and comparison of mixed strings of letters and numbers

sql-server sql-server-2012 collation sorting natural-sort

We need to do some reporting on values that are usually mixed strings of numbers and letters that need to be sorted 'naturally'. Things like, e.g. "P7B18" or "P12B3". @The strings will mostly be sequences of letters then numbers alternating. The number of these segments and the length of each could...

                                  We need to do some reporting on values that are usually mixed strings of numbers and letters that need to be sorted 'naturally'. Things like, e.g. "P7B18" or "P12B3". @The strings will mostly be sequences of letters then numbers alternating. The number of these segments and the length of each could vary, though.

We'd like the numeric portions of these to be sorted in numeric order. Obviously, if I just handle those string values directly with ORDER BY, then "P12B3" is going to come before "P7B18", since "P1" is earlier than "P7", but I'd like the reverse, as "P7" naturally precedes "P12".

I'd also like to be able to do range comparisons, e.g. @bin < 'P13S6' or some such. I don't have to handle floating point or negative numbers; these will strictly be non-negative integers that we're dealing with. String lengths and number of segments could potentially be arbitrary, with no fixed upper bounds.

In our case, string casing isn't important, though if there's a way to do this in a collation-aware fashion, others might find that useful. The ugliest part of all this is I'd like to be able to do both ordering, and range filtering in the WHERE clause.

If I were doing this in C#, it would be a pretty simple task: do some parsing to separate the alpha from the numeric, implement IComparable, and you're basically done. SQL Server, of course, doesn't appear to offer any similar functionality, at least as far as I'm aware.

Anybody know any good tricks to make this work? Is there some little-publicized ability to create custom CLR types that implement IComparable and have this behave as expected? I'm also not opposed to Stupid XML Tricks (see also: list concatenation), and I've got CLR regex matching/extracting/replacement wrapper functions available on the server as well.

**EDIT:**
As a slightly more detailed example, I'd want the data to behave something like this.

    SELECT bin FROM bins ORDER BY bin

    bin
    --------------------
    M7R16L
    P8RF6JJ
    P16B5
    PR7S19
    PR7S19L
    S2F3
    S12F0

i.e. break the strings into tokens of all letters or all numbers, and sort them either alphabetically or numerically respectively, with the leftmost tokens being the most significant sorting term. Like I mentioned, piece of cake in .NET if you implement IComparable, but I don't know how (or if) you can do that sort of thing in SQL Server. It's certainly not something I've ever come across in 10 or so years of working with it.
                                

db2 (9708 rep)

Jan 20, 2016, 02:28 PM • Last activity: May 17, 2025, 04:01 AM

1 votes

1 answers

2094 views

MySQL- Performance implications of changing a column collation from latin1_swedish_ci to latin1_bin

mysql performance mysql-5.6 collation

I am a developer so was hoping someone with DBA experience could help me out here. I am using a MySQL database (MYSQL 5.6). I came across a situation where a particular column stores Ids generated by any external system (to be provided by user input) and hence requires case sensitivity (`abcd` is a...

                                  I am a developer so was hoping someone with DBA experience could help me out here.

I am using a MySQL database (MYSQL 5.6).

I came across a situation where a particular column stores Ids generated by any external system (to be provided by user input) and hence requires case sensitivity (abcd is a different id from ABCD). It seemed pretty natural to me to change the collation for that column to latin1_bin from latin1_swedish_ci where my database and tables all use latin1_swedish_ci collation.

I use the following statement to change the collation:

    ALTER TABLE T MODIFY C VARCHAR(200) BINARY NOT NULL;

Are there any performance implications of doing this change? My application does not use COLLATE specifically in any query.

Also, column C is part of an index and a unique constraint with 2 other columns say A, B in table T (A,B,C). A and B are also foreign keys in table T.

Do I need to drop the index and unique constraints and re-create them?

Also, how do I rollback this change? Will this statement do the job:

    ALTER TABLE T MODIFY C VARCHAR(200) NOT NULL;

The table T looks like :

    |id|A|B|C|created_at|

id is auto increment.

Please help.

Kushan Sen (41 rep)

Dec 13, 2019, 06:37 AM • Last activity: Apr 27, 2025, 04:06 AM

2 votes

2 answers

103 views

Can I change the collation of a table and its columns without using convert to?

mysql mysql-8.0 collation

All of the tables and columns in my database are using the utf8mb4 character set and utf8mb4_unicode_ci collation. I'd like to migrate to utf8mb4_0900_ai_ci collation but using `convert to` is prohibitive on my large tables unless I use gh-ost or pt-osc. It seems to me that `convert to` should not b...

                                  All of the tables and columns in my database are using the utf8mb4 character set and utf8mb4_unicode_ci collation. I'd like to migrate to utf8mb4_0900_ai_ci collation but using convert to is prohibitive on my large tables unless I use gh-ost or pt-osc. 

It seems to me that convert to should not be necessary since the character set is not changing, but I cannot find any documentation which confirms or denies this. Do I need to use convert to in this case, or is alter table ... modify col ... collate ... sufficient?

I've looked https://dba.stackexchange.com/questions/229954/using-convert-to-character-set-for-changing-only-collation  which asks a very similar question, but the age of that question renders the answers somewhat obsolete. Even so, one answer seems to imply that convert to is unnecessary. The other answer delves into character set conversion and using pt-osc which I'd hoping to avoid.

Joel (133 rep)

Apr 4, 2025, 09:01 PM • Last activity: Apr 8, 2025, 05:39 PM

1 votes

1 answers

60 views

CONVERT TO CHARACTER SET isnt working with modify column

mysql collation character-set

I am trying to convert the table charset and collation along side preserving some of the columns explicit charset and collation. I need to do it in single alter statement. If i use following, it is changing the charset of the column which i want to retain. Any idea if something is missing? query 1-...

                                  I am trying to convert the table charset and collation along side preserving some of the columns explicit charset and collation. I need to do it in single alter statement. If i use following, it is changing the charset of the column which i want to retain. Any idea if something is missing?

query 1-

    ALTER TABLE summary CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci , MODIFY COLUMN summary_id varchar(11) CHARACTER SET ascii COLLATE ascii_bin NOT NULL;

Show create table o/p:

    Create Table: CREATE TABLE "summary" (
      “summary_id" varchar(11) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci

DTdev (111 rep)

Mar 18, 2025, 09:14 AM • Last activity: Mar 18, 2025, 01:53 PM

9 votes

2 answers

16242 views

Best General Purpose Character Set and Collation for MySQL

mysql database-design collation character-set

Currently, whenever I create a new MySQL database, I use utf8mb4 as a character set and utf8mb4_unicode_520_ci for the collation, e.g.: CREATE DATABASE IF NOT EXISTS db_name DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci; Is there a newer/upgraded general purpose collation or character...

                                  Currently, whenever I create a new MySQL database, I use utf8mb4 as a character set and utf8mb4_unicode_520_ci for the collation, e.g.:

    CREATE DATABASE IF NOT EXISTS db_name
        DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci;

Is there a newer/upgraded general purpose collation or character set for MySQL?

For example if there's a collation that superseded utf8mb4_unicode_520_ci, like utf8mb4_unicode_800_ci or something like that?

Thanks for your help.

GTS Joe (273 rep)

Jun 2, 2021, 11:48 PM • Last activity: Mar 13, 2025, 11:03 AM

2 votes

2 answers

2038 views

Identify the version of a collation from ICU in Postgres

postgresql collation data-versioning international-components-unicode

Postgres 10 and later incorporates the [International Components for Unicode (ICU)][1] library for text-handling and other internationalization issues. Changes happen to human languages, such as sorting rules evolving. Therefore collation definitions are versioned in ICU. ➥ How can I identify the ve...

                                  Postgres 10 and later incorporates the International Components for Unicode (ICU)  library for text-handling and other internationalization issues. 

Changes happen to human languages, such as sorting rules evolving. Therefore collation definitions are versioned in ICU. 

➥ How can I identify the version number of a collation? 

I know Postgres will report  version number as part of a collation version mismatch error message. Is there a way to proactively query for the version rather than wait for an error message?

Basil Bourque (11188 rep)

Aug 4, 2019, 03:16 AM • Last activity: Mar 11, 2025, 06:36 AM

21 votes

4 answers

114808 views

How to do a case-insensitive LIKE in a case-sensitive database?

sql-server collation like

My vendor requires the data warehouse database to be case sensitive, but I need to do case-insensitive queries against it. In a case-sensitive database, how would you write this to be case-insensitive? Where Name like '%hospitalist%'

                                  My vendor requires the data warehouse database to be case sensitive, but I need to do case-insensitive queries against it.

In a case-sensitive database, how would you write this to be case-insensitive?

		Where Name like '%hospitalist%'

James (2668 rep)

Jun 13, 2018, 01:49 PM • Last activity: Jan 30, 2025, 04:25 PM

5 votes

3 answers

13953 views

Case Insensitive ORDER BY clause using COLLATE

postgresql order-by collation

I have spent a long time looking for this, and I am getting mixed messages. In other DBMSs (tested in SQLite, Oracle, MariaDB, MSSQL) I can override the default sort order using the `COLLATE` clause: ```sql SELECT * FROM orderby ORDER BY string COLLATE … ; -- SQLite: BINARY | NOCASE -- MariaDB: utf8...

I have spent a long time looking for this, and I am getting mixed messages. In other DBMSs (tested in SQLite, Oracle, MariaDB, MSSQL) I can override the default sort order using the COLLATE clause:

SELECT *
FROM orderby
ORDER BY string COLLATE … ;

--	SQLite:		BINARY | NOCASE
--	MariaDB:	utf8mb4_bin | utf8mb4_general_ci
--	Oracle:		BINARY | BINARY_CI
--	MSSQL:		Latin1_General_BIN | Latin1_General_CI_AS

I have pored over the documentation and searched high and low, but I can’t find anything so straightforward for PostgreSQL. Is there a COLLATE clause value that would sort Case Insensitive? I know there are many questions regarding case sensitivity, but (a) most of them are old and (b) none that I have seen relate to the COLLATE clause. FWIW, I am testing on PostgreSQL 11.8. I have a test fiddle on http://sqlfiddle.com/#!17/05cab/1 , but it’s only for PostgreSQL 9.6. MySQL/MariaDB and SQL Server default to case insensitive, and that would certainly make sense when sorting most text. Oracle and SQLite default to case sensitive, but have a relatively simple solution for a case insensitive sort. The default collation for my database is en_US.UTF-8. I’m trying to fill in a few gaps here.

Manngo (3145 rep)

Nov 22, 2020, 04:44 AM • Last activity: Jan 24, 2025, 12:49 PM

0 votes

1 answers

652 views

Incorrect string value for MariaDB + utf8mb4 + utf8mb4_uca1400_ai_ci

mariadb collation character-set

> Incorrect string value: > \\\'\\\\xCD\\\\xCF\\\\xD7\\\\xA7\\\\xC9\\\\xFA...\\\ Everything on my server should be `utf8mb4` and `utf8mb4_uca1400_ai_ci` so let's go through everything. My `my.ini` file on Windows 10 for MariaDB 11.4.2: [mysql] default-character-set = utf8mb4 [mysqld] datadir=C:/WAMP...

> Incorrect string value: > \\\'\\\\xCD\\\\xCF\\\\xD7\\\\xA7\\\\xC9\\\\xFA...\\\ Everything on my server should be utf8mb4 and utf8mb4_uca1400_ai_ci so let's go through everything. My my.ini file on Windows 10 for MariaDB 11.4.2: [mysql] default-character-set = utf8mb4 [mysqld] datadir=C:/WAMP/MariaDB/data port=3306 innodb_buffer_pool_size=50M character-set-server=utf8mb4 character-set-client-handshake = FALSE collation-server = utf8mb4_uca1400_ai_ci general-log general-log-file=C:/WAMP/MariaDB/logs/queries.log [client] default-character-set = utf8mb4 port=3306 plugin-dir=C:\WAMP\MariaDB/lib/plugin The local query I use to verify character sets and collations:

SELECT LOWER(VARIABLE_NAME) AS variable_name, 
       GLOBAL_VALUE AS global_value, 
       DEFAULT_VALUE AS default_value 
FROM INFORMATION_SCHEMA.SYSTEM_VARIABLES 
WHERE VARIABLE_NAME LIKE '%CHARACTER_SET_%' 
   OR VARIABLE_NAME LIKE '%COLLATION_%' 
ORDER BY VARIABLE_NAME ASC;

Every time I do a server migration I run the following: SET CHARACTER SET 'utf8'; SET collation_connection = 'utf8mb4_uca1400_ai_ci'; SET GLOBAL collation_connection = 'utf8mb4_uca1400_ai_ci'; SET GLOBAL collation_database = 'utf8mb4_uca1400_ai_ci'; SET GLOBAL collation_server = 'utf8mb4_uca1400_ai_ci'; The SHOW CREATE DATABASE returns in utf8mb4 and utf8mb4_uca1400_ai_ci. The column associated with the error uses utf8mb4_uca1400_ai_ci. This is for an email message with the subject "[标题]Satisfy all your fantasies about Xianxia's hand tour." 🙄︀ I'm guessing something got lost in translation. I don't have everything maxed out on character sets and collations though I think I've covered most things. So... What have I not yet correctly set for character sets and/or collation and how do I go about ensuring that everything possible is set to utf8mb4 and utf8mb4_uca1400_ai_ci?

John (769 rep)

Jan 5, 2025, 09:17 AM • Last activity: Jan 7, 2025, 03:09 PM

0 votes

1 answers

124 views

A collation that ignores cedillas for comparison?

sql-server collation

*Cedillas* are mutations applied to letters in some languages, like portuguese, french, turkish and romanian. Some words: "coração", "açúcar", "linhaça" (letters 'Ç', 'Ş', ...). How to choose a collation that ignores *cedillas* for comparison? They are not accents. [Thi...

                                  *Cedillas* are mutations applied to letters in some languages, like portuguese, french, turkish and romanian. Some words: "coração", "açúcar", "linhaça" (letters 'Ç', 'Ş', ...).

How to choose a collation that ignores *cedillas* for comparison? They are not accents.

[This blog post](https://sqlfromhell.wordpress.com/2010/05/01/ignorando-letras-maiusculas-minusculas-acentos-e-c/)  (in portuguese) teaches that:

* SQL_Latin1_General_CP1_CI_AI: considers *cedillas*
* Latin1_General_CI_AI: ignores *cedillas*

What is the difference? It's not clear in the name of the collation.

Alexandre (103 rep)

Nov 23, 2024, 12:00 AM • Last activity: Dec 20, 2024, 11:49 AM

0 votes

2 answers

227 views

upgrade postgres cluster with changing collation & encoding

postgresql collation

we face a problem when trying to `pg_upgrade` pg12 > pg14. our source instance has `en_US.UTF-8` which it should not have had to begin with, and the new pg14 cluster is being initiated with `de_DE.UTF-8`. `pg_upgrade` throws an error on ``` lc_collate values for database "postgres" do not match: old...

we face a problem when trying to pg_upgrade pg12 > pg14. our source instance has en_US.UTF-8 which it should not have had to begin with, and the new pg14 cluster is being initiated with de_DE.UTF-8. pg_upgrade throws an error on

lc_collate values for database "postgres" do not match:  old "en_US.UTF-8", new "de_DE.UTF-8"

not being experienced in such at all, we came up with this solution and I'd like to hear some feedback whether this is sane, or not

pg14/initdb -D new_data_dir

pg14/pg_ctl -D new_data_dir start

pg_dumpall_12 | sed -i "/en_US.UTF-8/de_DE.UTF-8/" | psql_14 -p 6543 |& tee some.out

this leaves Ctype still with en_US.UTF-8 so we add the below per database

psql -c "UPDATE pg_database SET datctype = 'de_DE.UTF-8' WHERE datname =  ;"

vrms (269 rep)

Dec 18, 2024, 09:31 AM • Last activity: Dec 18, 2024, 02:21 PM

1 votes

1 answers

284 views

Any downsides of using UTF8 collation in SQL Server 2019?

sql-server sql-server-2019 collation

We have some columns in the table that could possibly get some non-ANSI characters and we must store them. Before the SQL Server 2019 there was the only 1 option - `NVARCHAR` data type, however starting with SQL Server 2019 it is possible to use `VARCHAR` data type with the collations that ends with...

                                  We have some columns in the table that could possibly get some non-ANSI characters and we must store them. Before the SQL Server 2019 there was the only 1 option - NVARCHAR data type, however starting with SQL Server 2019 it is possible to use VARCHAR data type with the collations that ends with %_UTF8. I understand that for full compatibility with the size, I'll need to double the VARCHAR columns size, so if it was VARCHAR(20) it will needed to be VARCHAR(40) to cover the case if somebody would insert text where all the 20 characters are non-ANSI, but are there any other downsides of such collation? The data in that column will be ~99.9% with ANSI characters only.
                                

Dmitrij Kultasev (191 rep)

Dec 9, 2024, 09:54 AM • Last activity: Dec 9, 2024, 02:16 PM

36 votes

2 answers

37762 views

Why would you index text_pattern_ops on a text column?

postgresql index collation pattern-matching

Today [Seven Databases in Seven Weeks](http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks) introduced me to per-operator indexes. > You can index strings for pattern matching the previous queries by creating a `text_pattern_ops` operator class index, as long as the values are indexed in...

                                  Today [Seven Databases in Seven Weeks](http://pragprog.com/book/rwdata/seven-databases-in-seven-weeks)  introduced me to per-operator indexes.

> You can index strings for pattern matching the previous queries by creating a text_pattern_ops operator class index, as long as the values are indexed in lowercase.

    CREATE INDEX moves_title_pattern ON movies (
        (lower(title) text_pattern_ops);

> We used the text_pattern_ops because the title is of type text. If you need to index varchars, chars, or names, use the related ops: varchar_pattern_ops, bpchar_pattern_ops, and name_pattern_ops.

I find the example really confusing. Why is it useful to do this?

If the column is type text, wouldn't the other types (varchar, char, name) be cast to to text before being used as a search value?

How does that index behave differently from one using the default operator?

    CREATE INDEX moves_title_pattern ON movies (lower(title));

Iain Samuel McLean Elder (2408 rep)

Nov 22, 2013, 05:51 PM • Last activity: Dec 5, 2024, 09:36 AM

1 votes

1 answers

306 views

Illegal mix of collations

mariadb collation mariadb-10.6

I am managing a MariaDB 10.6 Server for a bunch of people. I also dump the databases for backup purposes. Recently, one of the databases started to make problems while dumping: > mysqldump: Couldn't execute 'show create table my_view': Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (utf8mb4_unicode_ci,COERCIBLE) for operation '=' (1267) Everything in that database was created using the utf8mb4 character set and utf8mb4_unicode_ci collation. I checked the DB, tables, columns and view. I then found out, that when I open a regular mariadb shell, the collation_connection differs:

MariaDB [(none)]> SHOW GLOBAL VARIABLES LIKE "collation_connection";
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_unicode_ci |
+----------------------+--------------------+
1 row in set (0.002 sec)

MariaDB [(none)]> SHOW VARIABLES LIKE "collation_connection";
+----------------------+--------------------+
| Variable_name        | Value              |
+----------------------+--------------------+
| collation_connection | utf8mb4_general_ci |
+----------------------+--------------------+
1 row in set (0.003 sec)

**My first question:** Why does the global scope have a different collation_connection compared to the session scope? I always thought the session scope inherits from the global scope. I was not able to find in the docs, where this gets set. Here are the relevant parts of my mariadb option file:

[mariadb]
character_set_server=utf8mb4
collation_server=utf8mb4_unicode_ci

[client]
default-character-set=utf8mb4

[mariadb-dump]
default-character-set=utf8mb4

**My second question:** How to prevent such issues? Do I have to tell all of my users to use the same collation to prevent issues like that? --- Here is an example to reproduce it:

CREATE TABLE employees (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    department VARCHAR(100),
    salary DECIMAL(10, 2),
    created_at TIMESTAMP
);

INSERT INTO employees (name, department, salary, created_at)
VALUES
    ('Alice', 'HR', 50000, NOW()),
    ('Bob', 'Engineering', 70000, NOW()),
    ('Charlie', 'HR', 45000, NOW()),
    ('David', 'Engineering', 80000, NOW()),
    ('Eve', 'Marketing', 60000, NOW());

SET collation_connection = utf8mb4_unicode_ci;

CREATE VIEW hr_employees AS
SELECT id, name, salary
FROM employees
WHERE department = 'HR' AND date_format(created_at, "%Y") IN ('2024');

SET collation_connection = utf8mb4_general_ci;

SHOW CREATE VIEW hr_employees;
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_general_ci,COERCIBLE) and (utf8mb4_unicode_ci,COERCIBLE) for operation '='

Vince (111 rep)

Nov 22, 2024, 01:23 PM • Last activity: Nov 28, 2024, 09:19 AM

1 votes

1 answers

98 views

MySQL: using NAME_CONST internally

mysql stored-procedures functions mysql-8.0 collation

I'm using MySQL 8.0.32 I have some questions about an internal conversion, that I don't understand. In addition to this, sometimes this conversion is shown in SHOW FULL PROCESSLIST, as its execution was too long. This is my table. CREATE TABLE `provinces` ( `ID` VARCHAR(2) NOT NULL COLLATE 'utf8mb4_...

                                  I'm using MySQL 8.0.32
I have some questions about an internal conversion, that I don't understand. In addition to this, sometimes this conversion is shown in SHOW FULL PROCESSLIST, as its execution was too long.

This is my table.

    CREATE TABLE provinces (
     ID VARCHAR(2) NOT NULL COLLATE 'utf8mb4_swedish_ci',
     PROVINCIA VARCHAR(45) NOT NULL COLLATE 'utf8mb4_swedish_ci',
      PRIMARY KEY (ID)
      ) COLLATE='utf8mb4_swedish_ci' ENGINE=InnoDB;

I have 53 rows like these

ID	|PROVINCIA
----|----
01	|Álava
02	|Albacete



The MySQL Function is:

    CREATE FUNCTION get_provincia( id_provincia VARCHAR(2)	)
    RETURNS varchar(45)
    LANGUAGE SQL
    NOT DETERMINISTIC
    READS SQL DATA
    SQL SECURITY DEFINER
    COMMENT ''
    BEGIN
    	DECLARE desc_provincia VARCHAR(45) DEFAULT '-';
    
    	SELECT PROVINCIA INTO desc_provincia
    	FROM provinces
    	WHERE ID = id_provincia;
    	
    	RETURN desc_provincia;
    END


The SHOW STATUS of this function shows:

 - character_set_client: utf8mb4
 - collation_connection: utf8mb4_0900_ai_ci 
 - database_collation:	latin1_swedish_ci

That function is invoked as a field in a query.

    SELECT customer_id, customer_name, ..., get_provincia(SUBSTRING(zip_code,1,2))...
    FROM ...

When I'm reviewing the execution with SHOW FULL PROCESSLIST, the query inside the function is shown like this:

    SELECT provincia INTO desc_provincia
    FROM provincias
    WHERE id =  NAME_CONST('id_provincia',_latin1'15' COLLATE 'latin1_swedish_ci')
	
So, I have these questions:

 1. Why is the NAME_CONST forced by MySQL?
 2. Does it have an impact on performance? I don't understand that very simple query shows "2" value in column "Time" of SHOW FULL PROCESSLIST command, as it takes 2 seconds.

Thanks in advance.
                                

yaki_nuka (155 rep)

Nov 6, 2024, 03:54 PM • Last activity: Nov 8, 2024, 03:28 PM

Showing page 1 of 20 total questions