Slow queries on JSON_TABLE based view on mariaDB
1
vote
1
answer
68
views
We a running an instance of Apache syncope with around 130k users (ca. 33k with
active
or pending
state). When searching for users by attributes we see long query times often over 50 seconds. One example query might be
select * from user_search_attr
where schema_id = 'familyName'
and stringvalue = 'Bergmann'
user_search_attr
is a view defined as follows:
CREATE OR REPLACE
ALGORITHM = UNDEFINED VIEW user_search_attr
AS
select
u
.id
AS any_id
,
attrs
.schema_id
AS schema_id
,
attrs
.booleanvalue
AS booleanvalue
,
attrs
.datevalue
AS datevalue
,
attrs
.doublevalue
AS doublevalue
,
attrs
.longvalue
AS longvalue
,
attrs
.stringvalue
AS stringvalue
from
(SyncopeUser
u
left join JSON_TABLE(coalesce(u
.plainAttrs
, '[]'), '$[*]'
COLUMNS (schema_id
varchar(255) PATH '$.schema',
NESTED PATH '$.values[*]'
COLUMNS (booleanvalue
int(11) PATH '$.booleanValue',
datevalue
varchar(32) PATH '$.dateValue',
doublevalue
double PATH '$.doubleValue',
longvalue
bigint(20) PATH '$.longValue',
stringvalue
varchar(255) PATH '$.stringValue'))) attrs
on
(1 = 1))
where
attrs
.schema_id
is not null
and (attrs
.booleanvalue
is not null
or attrs
.datevalue
is not null
or attrs
.doublevalue
is not null
or attrs
.longvalue
is not null
or attrs
.stringvalue
is not null);
As you can see the data comes from the table SyncopeUser
which is defined as follows:
CREATE TABLE SyncopeUser
(
id
varchar(36) NOT NULL,
creationContext
varchar(255) DEFAULT NULL,
creationDate
datetime(3) DEFAULT NULL,
creator
varchar(255) DEFAULT NULL,
lastChangeContext
varchar(255) DEFAULT NULL,
lastChangeDate
datetime(3) DEFAULT NULL,
lastModifier
varchar(255) DEFAULT NULL,
status
varchar(255) DEFAULT NULL,
changePwdDate
datetime(3) DEFAULT NULL,
cipherAlgorithm
varchar(20) DEFAULT NULL,
failedLogins
int(11) DEFAULT NULL,
lastLoginDate
datetime(3) DEFAULT NULL,
mustChangePassword
int(11) DEFAULT NULL,
password
varchar(255) DEFAULT NULL,
passwordHistory
text DEFAULT NULL,
securityAnswer
varchar(255) DEFAULT NULL,
suspended
int(11) DEFAULT NULL,
token
text DEFAULT NULL,
tokenExpireTime
datetime(3) DEFAULT NULL,
username
varchar(255) DEFAULT NULL,
plainAttrs
longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
DEFAULT NULL CHECK (json_valid(plainAttrs
)),
REALM_ID
varchar(36) DEFAULT NULL,
SECURITYQUESTION_ID
varchar(36) DEFAULT NULL,
PRIMARY KEY (id
),
UNIQUE KEY U_SYNCPSR_USERNAME
(username
),
UNIQUE KEY SyncopeUser_username
(username
),
KEY SECURITYQUESTION_ID
(SECURITYQUESTION_ID
),
KEY SyncopeUser_realm_id
(REALM_ID
),
CONSTRAINT SyncopeUser_ibfk_1
FOREIGN KEY (REALM_ID
)
REFERENCES Realm
(id
),
CONSTRAINT SyncopeUser_ibfk_2
FOREIGN KEY (SECURITYQUESTION_ID
)
REFERENCES SecurityQuestion
(id
)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci;
The relevant column plainAttrs
contains the users attibutes as JSON string. One example might be [{"uniqueValue":{"stringValue":"123456789"},"schema":"lockSystemId"},{"values":[{"stringValue":"Bergmann"}],"schema":"familyName"}]
.
The following indexes are defined on SyncopeUser
:
Table |Non_unique|Key_name |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Ignored|
-----------+----------+--------------------+------------+-------------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+
SyncopeUser| 0|PRIMARY | 1|id |A | 149635| | | |BTREE | | |NO |
SyncopeUser| 0|U_SYNCPSR_USERNAME | 1|username |A | 149635| | |YES |BTREE | | |NO |
SyncopeUser| 0|SyncopeUser_username| 1|username |A | 149635| | |YES |BTREE | | |NO |
SyncopeUser| 1|SECURITYQUESTION_ID | 1|SECURITYQUESTION_ID|A | 1| | |YES |BTREE | | |NO |
SyncopeUser| 1|SyncopeUser_realm_id| 1|REALM_ID |A | 1| | |YES |BTREE | | |NO |
As most normal users have over 15 attributes the user_search_attr
view contains over 2M rows.
The servers are VMs with currently 8 GB of RAM and 6 CPUs but this can be changed. We are currently running version 11.4.7 of MariaDB in a galera cluster of 3 Servers. Syncope is only connected to one of the servers as this cluster is also used for other applications and we don't want to influnce these other applications.
Here are some of the IMHO relevant configuration options:
[mysqld]
aria_pagecache_buffer_size = 32M
basedir = /usr
bind-address = ::
binlog_format = ROW
character_set_server = utf8
collation_server = utf8_general_ci
innodb_adaptive_hash_index = OFF
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_instances = 1
innodb_buffer_pool_size = 4096M
innodb_flush_log_at_trx_commit = 0
innodb_log_buffer_size = 16MB
innodb_log_file_size = 512M
key_buffer_size = 1M
log-error = /var/log/mariadb/mariadb.log
log_queries_not_using_indexes = OFF
long_query_time = 4.0
max_allowed_packet = 128M
max_binlog_size = 100M
max_connections = 400
max_heap_table_size = 256M
performance_schema = ON
query_cache_limit = 16M
query_cache_size = 0
query_cache_type = OFF
skip-external-locking
skip-name-resolve = 1
slow_query_log = 0
slow_query_log_file = /var/log/mariadb/slow.log
sort_buffer_size = 4M
table_definition_cache = 800
table_open_cache = 1000
thread_cache_size = 8
thread_stack = 256K
tmp_table_size = 256M
wsrep_on = ON
wsrep_provider = /usr/lib64/galera-4/libgalera_smm.so
As most queries only search for active
or pending
users one option that I have not tried is indexing or partitioning the table for the status
column.
Currently we are not in the position of changing the design of the database or even switching to another IDM sollution. We hope to find optimazion potential on the DB layer or maybe some improvements on some queries.
Asked by Clemens Bergmann
(111 rep)
Aug 3, 2025, 06:44 AM
Last activity: Aug 5, 2025, 08:07 AM
Last activity: Aug 5, 2025, 08:07 AM