Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes

1 answers

1374 views

mysql (mariadb) won't start after power outage (MacOS)

/var/log/system.log says (over and over)

Oct 13 19:34:01 Data-Server com.apple.xpc.launchd (com.mariadb.server): Service exited with abnormal code: 1
Oct 13 19:34:01 Data-Server com.apple.xpc.launchd (com.mariadb.server): Service only ran for 0 seconds. Pushing respawn out by 10 seconds.

/usr/local/var/mysql/Data-Server.local.err says (once recently, repeated a number of times well before the crash)

2020-10-13  2:44:25 20019181 [Warning] Aborted connection 20019181 to db: 'EcoReality' user: 'root' host: '10.1.2.2' (Got timeout reading communication packets)

The first thing I did was to shut down the launchctl entry, to keep it from constantly restarting:

# launchctl unload /Library/LaunchDaemons/com.mariadb.server.plist

Then I tried invoking mysqld manually:

# sudo  /usr/local/bin/mysqld -u mysql
2020-10-13 20:46:09 0 [Note] /usr/local/bin/mysqld (mysqld 10.4.6-MariaDB) starting as process 2364 ...
2020-10-13 20:46:09 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2020-10-13 20:46:09 0 [Note] InnoDB: Uses event mutexes
2020-10-13 20:46:09 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2020-10-13 20:46:09 0 [Note] InnoDB: Number of pools: 1
2020-10-13 20:46:09 0 [Note] InnoDB: Using SSE2 crc32 instructions
2020-10-13 20:46:09 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2020-10-13 20:46:09 0 [Note] InnoDB: Completed initialization of buffer pool
2020-10-13 20:46:09 0 [ERROR] InnoDB: Invalid log block checksum. block: 81635496 checkpoint no: 2609153 expected: 296846624 found: 3735928559
2020-10-13 20:46:09 0 [ERROR] InnoDB: Missing MLOG_CHECKPOINT at 41797373564 between the checkpoint 41797373564 and the end 41797373440.
2020-10-13 20:46:09 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2020-10-13 20:46:09 0 [Note] InnoDB: Starting shutdown...
2020-10-13 20:46:09 0 [ERROR] Plugin 'InnoDB' init function returned error.
2020-10-13 20:46:09 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2020-10-13 20:46:09 0 [Note] Plugin 'FEEDBACK' is disabled.
2020-10-13 20:46:09 0 [Note] CONNECT: Version 1.06.0009 January 27, 2019
2020-10-13 20:46:09 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2020-10-13 20:46:09 0 [ERROR] Aborting

So now I'm a bit stumped at the lack of diagnostic messages. Is there any way to coax more info out of mysqld when it goes down? Or should I just start incrementing innodb_force_recovery until something interesting happens?
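
If the innodb_force_recovery route is taken, the levels are normally tried from the lowest upward, since the documentation warns that higher values are progressively riskier for the data files. A minimal sketch of that escalation follows; it only builds the command lines and runs nothing. The helper name `recovery_commands` is hypothetical, the binary path is the one from the question, and `--user=mysql` is assumed to be the long form of the `-u mysql` used above.

```python
# Hypothetical helper: it only builds command lines, it does not execute them.
MYSQLD = "/usr/local/bin/mysqld"  # path taken from the question


def recovery_commands(max_level=6):
    """Return one mysqld invocation per innodb_force_recovery level, lowest first."""
    if not 1 <= max_level <= 6:
        raise ValueError("innodb_force_recovery accepts values 0 through 6")
    return [
        [MYSQLD, "--user=mysql", f"--innodb-force-recovery={level}"]
        for level in range(1, max_level + 1)
    ]


# Print the first three candidate invocations for manual inspection.
for cmd in recovery_commands(3):
    print(" ".join(cmd))
```

Each invocation would be tried by hand, stopping at the first level where the server starts, so that data can be dumped before anything else is attempted.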

Jan Steinman (191 rep)

Oct 14, 2020, 04:01 AM • Last activity: Aug 5, 2025, 07:01 PM

0 votes

2 answers

628 views

MySQL 5.7 intermittent max connections errors

mysql crash max-connections

I’m having a problem with MySQL 5.7 where "too many connections" errors cause services to crash. The max_connections system variable is set to 1000 and on average there are roughly 250 sessions/threads, so it’s odd that the limit is being reached. The issue appears mostly at night, between 10 and 11 pm on certain weeknights.

The machine is a Windows 2008 R2 Enterprise Server with 32 GB RAM and dual Xeon CPUs. Here's some more environmental information:

        Variable            |   Max Connection Memory
    -------------------------------------------------
    join_buffer_size        |       250.00 MB
    read_buffer_size        |       62.50 MB
    read_rnd_buffer_size    |       250.00 MB
    sort_buffer_size        |       250.00 MB
    max_connections = 1000  |       812.50 MB
    
    Timeouts                    |   VALUE
    -------------------------------------------
    connect_timeout             |   10
    delayed_insert_timeout      |   300
    have_statement_timeout      |   YES
    innodb_flush_log_at_timeout |   1
    innodb_lock_wait_timeout    |   50
    innodb_rollback_on_timeout  |   OFF
    interactive_timeout         |   28800
    lock_wait_timeout           |   31536000
    net_read_timeout            |   30
    net_write_timeout           |   60
    rpl_stop_slave_timeout      |   31536000
    slave_net_timeout           |   60
    wait_timeout                |   28800
    -------------------------------------------
    max_allowed_packet          | 33554432
    slave_max_allowed_packet    | 1073741824
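
The "Max Connection Memory" column above appears to be each per-session buffer multiplied by max_connections, with the last row being the sum. A quick check of that arithmetic, as a sketch using only the figures from the table (the variable names in the code are illustrative, and the per-connection sizes are derived, not quoted from the question):

```python
MAX_CONNECTIONS = 1000

# "Max Connection Memory" per buffer, in MB, copied from the table above.
max_memory_mb = {
    "join_buffer_size": 250.0,
    "read_buffer_size": 62.5,
    "read_rnd_buffer_size": 250.0,
    "sort_buffer_size": 250.0,
}

# Sum of the four rows; this should match the max_connections row.
total_mb = sum(max_memory_mb.values())

# Derived size of each buffer for a single connection, in KB.
per_connection_kb = {
    name: mb / MAX_CONNECTIONS * 1024 for name, mb in max_memory_mb.items()
}

print(f"total at {MAX_CONNECTIONS} connections: {total_mb} MB")
for name, kb in per_connection_kb.items():
    print(f"{name}: {kb:.0f} KB per connection")
```

The total is a worst-case figure: it assumes every connection allocates every one of these buffers at the same time.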

Here's my sample log file:

    Aborted connection 27933 to db: 'wms_mysql' user: 'mysql' host: 'eifprdrds01.domain.com' (Got an error reading communication packets)
    Aborted connection 26736 to db: 'wms_mysql' user: 'mysql' host: 'eifprdrds01.domain.com' (Got an error reading communication packets)
    Aborted connection 27200 to db: 'wms_mysql' user: 'mysql' host: 'eifprdrds01.domain.com' (Got an error reading communication packets)
    Aborted connection 27356 to db: 'wms_mysql' user: 'mysql' host: 'eifprdrds01.domain.com' (Got an error reading communication packets)
    Aborted connection 29119 to db: 'wms_mysql' user: 'mysql' host: 'pc286.domain.com' (Got an error reading communication packets)
    Aborted connection 16274 to db: 'wms_mysql' user: 'mysql' host: 'pc828.domain.com' (Got timeout reading communication packets)
    Aborted connection 24916 to db: 'wms_mysql' user: 'mysql' host: 'pc830.domain.com' (Got an error reading communication packets)
    Aborted connection 19357 to db: 'wms_mysql' user: 'mysql' host: 'pc830.domain.com' (Got an error reading communication packets)
    Aborted connection 19343 to db: 'wms_mysql' user: 'mysql' host: 'pc830.domain.com' (Got an error reading communication packets)

Here are some additional environmental parameters (global status):

    Variable_name | Value
    Aborted_clients | 579
    Aborted_connects | 1
    Binlog_cache_disk_use | 0
    Binlog_cache_use | 0
    Binlog_stmt_cache_disk_use | 0
    Binlog_stmt_cache_use | 0
    Bytes_received | 112705860256
    Bytes_sent | 1320858513743
    Com_admin_commands | 6343
    Com_assign_to_keycache | 0
    Com_alter_db | 0
    Com_alter_db_upgrade | 0
    Com_alter_event | 0
    Com_alter_function | 0
    Com_alter_instance | 0
    Com_alter_procedure | 0
    Com_alter_server | 0
    Com_alter_table | 0
    Com_alter_tablespace | 0
    Com_alter_user | 0
    Com_analyze | 0
    Com_begin | 368010
    Com_binlog | 0
    Com_call_procedure | 0
    Com_change_db | 14
    Com_change_master | 0
    Com_change_repl_filter | 0
    Com_check | 0
    Com_checksum | 0
    Com_commit | 367880
    Com_create_db | 0
    Com_create_event | 0
    Com_create_function | 0
    Com_create_index | 0
    Com_create_procedure | 0
    Com_create_server | 0
    Com_create_table | 0
    Com_create_trigger | 0
    Com_create_udf | 0
    Com_create_user | 0
    Com_create_view | 0
    Com_dealloc_sql | 0
    Com_delete | 1899441
    Com_delete_multi | 0
    Com_do | 0
    Com_drop_db | 0
    Com_drop_event | 0
    Com_drop_function | 0
    Com_drop_index | 0
    Com_drop_procedure | 0
    Com_drop_server | 0
    Com_drop_table | 0
    Com_drop_trigger | 0
    Com_drop_user | 0
    Com_drop_view | 0
    Com_empty_query | 0
    Com_execute_sql | 0
    Com_explain_other | 0
    Com_flush | 0
    Com_get_diagnostics | 0
    Com_grant | 0
    Com_ha_close | 0
    Com_ha_open | 0
    Com_ha_read | 0
    Com_help | 0
    Com_insert | 5932889
    Com_insert_select | 0
    Com_install_plugin | 0
    Com_kill | 1
    Com_load | 0
    Com_lock_tables | 0
    Com_optimize | 0
    Com_preload_keys | 0
    Com_prepare_sql | 0
    Com_purge | 0
    Com_purge_before_date | 0
    Com_release_savepoint | 0
    Com_rename_table | 0
    Com_rename_user | 0
    Com_repair | 0
    Com_replace | 0
    Com_replace_select | 0
    Com_reset | 0
    Com_resignal | 0
    Com_revoke | 0
    Com_revoke_all | 0
    Com_rollback | 107
    Com_rollback_to_savepoint | 0
    Com_savepoint | 0
    Com_select | 305377361
    Com_set_option | 412902
    Com_signal | 0
    Com_show_binlog_events | 0
    Com_show_binlogs | 0
    Com_show_charsets | 9
    Com_show_collations | 9
    Com_show_create_db | 0
    Com_show_create_event | 0
    Com_show_create_func | 0
    Com_show_create_proc | 0
    Com_show_create_table | 0
    Com_show_create_trigger | 0
    Com_show_databases | 20
    Com_show_engine_logs | 0
    Com_show_engine_mutex | 0
    Com_show_engine_status | 0
    Com_show_events | 0
    Com_show_errors | 0
    Com_show_fields | 33672455
    Com_show_function_code | 0
    Com_show_function_status | 8
    Com_show_grants | 2
    Com_show_keys | 34409046
    Com_show_master_status | 0
    Com_show_open_tables | 0
    Com_show_plugins | 19
    Com_show_privileges | 0
    Com_show_procedure_code | 0
    Com_show_procedure_status | 8
    Com_show_processlist | 2
    Com_show_profile | 0
    Com_show_profiles | 0
    Com_show_relaylog_events | 0
    Com_show_slave_hosts | 0
    Com_show_slave_status | 6
    Com_show_status | 337719
    Com_show_storage_engines | 9
    Com_show_table_status | 0
    Com_show_tables | 11
    Com_show_triggers | 634
    Com_show_variables | 221
    Com_show_warnings | 0
    Com_show_create_user | 0
    Com_shutdown | 0
    Com_slave_start | 0
    Com_slave_stop | 0
    Com_group_replication_start | 0
    Com_group_replication_stop | 0
    Com_stmt_execute | 2
    Com_stmt_close | 2
    Com_stmt_fetch | 0
    Com_stmt_prepare | 2
    Com_stmt_reset | 0
    Com_stmt_send_long_data | 0
    Com_truncate | 0
    Com_uninstall_plugin | 0
    Com_unlock_tables | 0
    Com_update | 3160289
    Com_update_multi | 0
    Com_xa_commit | 0
    Com_xa_end | 0
    Com_xa_prepare | 0
    Com_xa_recover | 0
    Com_xa_rollback | 0
    Com_xa_start | 0
    Com_stmt_reprepare | 0
    Compression | OFF
    Connection_errors_accept | 0
    Connection_errors_internal | 0
    Connection_errors_max_connections | 94
    Connection_errors_peer_address | 0
    Connection_errors_select | 0
    Connection_errors_tcpwrap | 0
    Connections | 412961
    Created_tmp_disk_tables | 34412942
    Created_tmp_files | 53367
    Created_tmp_tables | 71692427
    Delayed_errors | 0
    Delayed_insert_threads | 0
    Delayed_writes | 0
    Flush_commands | 1
    Handler_commit | 315946909
    Handler_delete | 5540520
    Handler_discover | 0
    Handler_external_lock | 688867588
    Handler_mrr_init | 0
    Handler_prepare | 0
    Handler_read_first | 34976935
    Handler_read_key | 2188865194
    Handler_read_last | 17485
    Handler_read_next | 54290082542
    Handler_read_prev | 19085786
    Handler_read_rnd | 1186807611
    Handler_read_rnd_next | 2955796362
    Handler_rollback | 148
    Handler_savepoint | 0
    Handler_savepoint_rollback | 0
    Handler_update | 428611514
    Handler_write | 646923649
    Innodb_buffer_pool_dump_status | Dumping of buffer pool not started
    Innodb_buffer_pool_load_status | Buffer pool(s) load completed at 181219 19:56:57
    Innodb_buffer_pool_resize_status | 
    Innodb_buffer_pool_pages_data | 966720
    Innodb_buffer_pool_bytes_data | 2953838592
    Innodb_buffer_pool_pages_dirty | 0
    Innodb_buffer_pool_bytes_dirty | 0
    Innodb_buffer_pool_pages_flushed | 5329236
    Innodb_buffer_pool_pages_free | 8197
    Innodb_buffer_pool_pages_misc | 73659
    Innodb_buffer_pool_pages_total | 1048576
    Innodb_buffer_pool_read_ahead_rnd | 0
    Innodb_buffer_pool_read_ahead | 34382
    Innodb_buffer_pool_read_ahead_evicted | 0
    Innodb_buffer_pool_read_requests | 3904592079
    Innodb_buffer_pool_reads | 747465
    Innodb_buffer_pool_wait_free | 0
    Innodb_buffer_pool_write_requests | 1045142981
    Innodb_data_fsyncs | 1695449
    Innodb_data_pending_fsyncs | 0
    Innodb_data_pending_reads | 0
    Innodb_data_pending_writes | 0
    Innodb_data_read | 644895232
    Innodb_data_reads | 1088377
    Innodb_data_writes | 14566224
    Innodb_data_written | 3881918464
    Innodb_dblwr_pages_written | 4497979
    Innodb_dblwr_writes | 398532
    Innodb_log_waits | 0
    Innodb_log_write_requests | 22692404
    Innodb_log_writes | 8771102
    Innodb_os_log_fsyncs | 322535
    Innodb_os_log_pending_fsyncs | 0
    Innodb_os_log_pending_writes | 0
    Innodb_os_log_written | 18935997952
    Innodb_page_size | 16384
    Innodb_pages_created | 134590
    Innodb_pages_read | 1087932
    Innodb_pages_written | 5329236
    Innodb_row_lock_current_waits | 0
    Innodb_row_lock_time | 3466511
    Innodb_row_lock_time_avg | 35015
    Innodb_row_lock_time_max | 51754
    Innodb_row_lock_waits | 99
    Innodb_rows_deleted | 5540520
    Innodb_rows_inserted | 564189585
    Innodb_rows_read | 2017392003
    Innodb_rows_updated | 3601631
    Innodb_num_open_files | 300
    Innodb_truncated_status_writes | 0
    Innodb_available_undo_logs | 128
    Key_blocks_not_flushed | 0
    Key_blocks_unused | 6698
    Key_blocks_used | 4
    Key_read_requests | 124
    Key_reads | 29
    Key_write_requests | 0
    Key_writes | 0
    Last_query_cost | 0
    Last_query_partial_plans | 0
    Locked_connects | 0
    Max_execution_time_exceeded | 0
    Max_execution_time_set | 0
    Max_execution_time_set_failed | 0
    Max_used_connections | 701
    Max_used_connections_time | 12/19/2018 23:27
    Not_flushed_delayed_rows | 0
    Ongoing_anonymous_transaction_count | 0
    Open_files | 1
    Open_streams | 0
    Open_table_definitions | 876
    Open_tables | 2000
    Opened_files | 57710
    Opened_table_definitions | 876
    Opened_tables | 24810492
    Performance_schema_accounts_lost | 0
    Performance_schema_cond_classes_lost | 0
    Performance_schema_cond_instances_lost | 0
    Performance_schema_digest_lost | 0
    Performance_schema_file_classes_lost | 0
    Performance_schema_file_handles_lost | 0
    Performance_schema_file_instances_lost | 0
    Performance_schema_hosts_lost | 0
    Performance_schema_index_stat_lost | 0
    Performance_schema_locker_lost | 0
    Performance_schema_memory_classes_lost | 0
    Performance_schema_metadata_lock_lost | 0
    Performance_schema_mutex_classes_lost | 0
    Performance_schema_mutex_instances_lost | 0
    Performance_schema_nested_statement_lost | 0
    Performance_schema_prepared_statements_lost | 0
    Performance_schema_program_lost | 0
    Performance_schema_rwlock_classes_lost | 0
    Performance_schema_rwlock_instances_lost | 0
    Performance_schema_session_connect_attrs_lost | 0
    Performance_schema_socket_classes_lost | 0
    Performance_schema_socket_instances_lost | 0
    Performance_schema_stage_classes_lost | 0
    Performance_schema_statement_classes_lost | 0
    Performance_schema_table_handles_lost | 0
    Performance_schema_table_instances_lost | 0
    Performance_schema_table_lock_stat_lost | 0
    Performance_schema_thread_classes_lost | 0
    Performance_schema_thread_instances_lost | 0
    Performance_schema_users_lost | 0
    Prepared_stmt_count | 0
    Qcache_free_blocks | 0
    Qcache_free_memory | 0
    Qcache_hits | 0
    Qcache_inserts | 0
    Qcache_lowmem_prunes | 0
    Qcache_not_cached | 0
    Qcache_queries_in_cache | 0
    Qcache_total_blocks | 0
    Queries | 386369511
    Questions | 386351756
    Select_full_join | 8242
    Select_full_range_join | 97092
    Select_range | 57964836
    Select_range_check | 4
    Select_scan | 69826287
    Slave_heartbeat_period | 0
    Slave_last_heartbeat | 
    Slave_open_temp_tables | 0
    Slave_received_heartbeats | 0
    Slave_retried_transactions | 0
    Slave_running | OFF
    Slow_launch_threads | 0
    Slow_queries | 44
    Sort_merge_passes | 34863
    Sort_range | 1764316
    Sort_rows | 299335094
    Sort_scan | 2888984
    Ssl_accept_renegotiates | 0
    Ssl_accepts | 0
    Ssl_callback_cache_hits | 0
    Ssl_cipher | 
    Ssl_cipher_list | 
    Ssl_client_connects | 0
    Ssl_connect_renegotiates | 0
    Ssl_ctx_verify_depth | 0
    Ssl_ctx_verify_mode | 0
    Ssl_default_timeout | 0
    Ssl_finished_accepts | 0
    Ssl_finished_connects | 0
    Ssl_server_not_after | 
    Ssl_server_not_before | 
    Ssl_session_cache_hits | 0
    Ssl_session_cache_misses | 0
    Ssl_session_cache_mode | NONE
    Ssl_session_cache_overflows | 0
    Ssl_session_cache_size | 0
    Ssl_session_cache_timeouts | 0
    Ssl_sessions_reused | 0
    Ssl_used_session_cache_entries | 0
    Ssl_verify_depth | 0
    Ssl_verify_mode | 0
    Ssl_version | 
    Table_locks_immediate | 67608
    Table_locks_waited | 0
    Table_open_cache_hits | 386002409
    Table_open_cache_misses | 24810492
    Table_open_cache_overflows | 24808485
    Tc_log_max_pages_used | 0
    Tc_log_page_size | 0
    Tc_log_page_waits | 0
    Threads_cached | 8
    Threads_connected | 10
    Threads_created | 1388
    Threads_running | 1
    Uptime | 1018391
    Uptime_since_flush_status | 1018391

I'm somewhat at a loss as to what's happening. Any advice would be very useful!
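
A few ratios can be derived from the counters above to make the raw numbers easier to read. This is a sketch only: the selection of counters is an editorial assumption, and the values are copied from the status output as posted.

```python
# Values copied from the SHOW GLOBAL STATUS output above.
status = {
    "Uptime": 1018391,
    "Connections": 412961,
    "Max_used_connections": 701,
    "Connection_errors_max_connections": 94,
    "Created_tmp_tables": 71692427,
    "Created_tmp_disk_tables": 34412942,
    "Table_open_cache_hits": 386002409,
    "Table_open_cache_misses": 24810492,
}

uptime_days = status["Uptime"] / 86400
connections_per_sec = status["Connections"] / status["Uptime"]

# Share of implicit temporary tables that were created on disk.
tmp_disk_ratio = status["Created_tmp_disk_tables"] / status["Created_tmp_tables"]

# Share of table-cache lookups that missed.
lookups = status["Table_open_cache_hits"] + status["Table_open_cache_misses"]
table_cache_miss_ratio = status["Table_open_cache_misses"] / lookups

print(f"uptime: {uptime_days:.1f} days")
print(f"new connections per second: {connections_per_sec:.2f}")
print(f"temporary tables created on disk: {tmp_disk_ratio:.0%}")
print(f"table cache miss ratio: {table_cache_miss_ratio:.1%}")
```

Note that Max_used_connections (701) is below the configured limit of 1000 even though Connection_errors_max_connections reports 94 refused connections. The sketch only surfaces that discrepancy; it does not explain it.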
                                

Shawn_M (9 rep)

Dec 22, 2018, 09:11 PM • Last activity: Jul 31, 2025, 11:06 PM

0 votes

1 answers

808 views

Safe method to rename a crashed myisam table

mysql myisam disaster-recovery crash

I have a large MyISAM table which has crashed. Repairing the table will take some time. The table is only INSERTed into and SELECTed from, never updated. To allow the application to continue working, albeit with reduced capability, I thought of

 - renaming the crashed table
 - creating a new table with the original name
 - switching processing back on
 - repairing the backup table
 - switching off processing
 - merging the repaired and new data
 - switching on processing

The other steps in this process do not pose any risk due to the nature of the application.
Is it safe to rename a crashed MyISAM table? How?

I believe that I can't simply do ALTER TABLE ... RENAME ..., as this always does a row-by-row copy into a new table.

Apparently Peter Zaitsev uses a "tiny script which moves out all MyISAM tables out of MySQL database directory " but doesn't seem to give details of what this script does (presumably stops database first?).
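
The plan listed above can be written down as a sequence of SQL statements. The sketch below only generates the statements as strings; the function name `failover_plan`, the `_crashed` suffix, and the table name `events` are hypothetical. It assumes RENAME TABLE and CREATE TABLE ... LIKE succeed against a crashed MyISAM table, which is exactly the open question here, so it illustrates the intent rather than a verified procedure.

```python
def failover_plan(table, suffix="_crashed"):
    """Return the SQL statements for the rename / repair / merge plan, in order."""
    broken = f"{table}{suffix}"
    return [
        # Move the crashed table out of the way.
        f"RENAME TABLE `{table}` TO `{broken}`",
        # Create an empty table with the original name and the same definition.
        f"CREATE TABLE `{table}` LIKE `{broken}`",
        # Repair the renamed table while the application writes to the new one.
        f"REPAIR TABLE `{broken}`",
        # With processing switched off, merge the repaired rows back in.
        f"INSERT INTO `{table}` SELECT * FROM `{broken}`",
    ]


for statement in failover_plan("events"):
    print(statement + ";")
```

The final merge step assumes the old and new rows do not collide on a primary or unique key; if the table uses AUTO_INCREMENT, the new table's counter would need to start above the highest existing value.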

                                

symcbean (339 rep)

Sep 8, 2016, 09:51 AM • Last activity: Jun 10, 2025, 08:04 AM

1 votes

1 answers

254 views

postgresql crash in master node with streaming replication

postgresql replication crash

We have a master/slave PostgreSQL cluster with streaming replication and pgpool.

Versions on both PostgreSQL servers:

	postgres: 9.4
	OS: Debian GNU/Linux 8 (jessie)

pgpool server:

	pgpool: pgpool-II version 3.3.3
	OS: Red Hat Enterprise Linux Server release 6.1

Problem:

The PostgreSQL service crashed on the master node, and pgpool subsequently performed a failover.

We could not find the reason for this behavior on the master node. We would like to know whether someone could tell us, based on the recorded log messages, what could have caused this crash.
Below are some relevant lines from the PostgreSQL and Linux logs.


postgresql.log

    2020-12-20 21:06:04 UYT : [4-1] user=,db=,app=,client= LOG:  server process (PID 19303) was terminated by signal 11: Segmentation fault
    2020-12-20 21:06:04 UYT : [5-1] user=,db=,app=,client= DETAIL:  Failed process was running: COPY public.act_hi_varinst (id_, proc_def_key_, proc_def_id_, proc_inst_id_, execution_id_, act_inst_id_, case_def_key_, case_def_id_, case_inst_id_, case_execution_id_, task_id_, name_, var_type_, rev_, bytearray_id_, double_, long_, text_, text2_, tenant_id_, state_, create_time_, root_proc_inst_id_, removal_time_) TO stdout;
    2020-12-20 21:06:04 UYT : [6-1] user=,db=,app=,client= LOG:  terminating any other active server processes
    2020-12-20 21:06:04 UYT : [3-1] user=dbusr,db=dbdatabase     ,app=[unknown],client= WARNING:  terminating connection because of crash of another server process
    2020-12-20 21:06:04 UYT : [4-1] user=dbusr,db=dbdatabase     ,app=[unknown],client= DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
    2020-12-20 21:06:04 UYT : [5-1] user=dbusr,db=dbdatabase     ,app=[unknown],client= HINT:  In a moment you should be able to reconnect to the database and repeat your command.
    2020-12-20 21:06:04 UYT : [3-1] user=dbusr,db=dbdatabase     ,app=[unknown],client= WARNING:  terminating connection because of crash of another server process
    2020-12-20 21:06:04 UYT : [4-1] user=dbusr,db=dbdatabase     ,app=[unknown],client= DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
    
Error found in /var/log/messages:
    
    Dec 20 21:06:04 server-name kernel: [20278231.774013] postgres: segfault at 7f7740354704 ip 00007f772c8ac700 sp 00007ffdac7ff4e0 error 4 in postgres[7f772c813000+5a1000]


Thank you all.
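
The kernel line from /var/log/messages carries enough information to compute where inside the postgres binary the fault occurred. The sketch below parses that line and subtracts the mapping base from the instruction pointer; the function name `decode_segfault` is hypothetical, and the bit meanings of the error code follow the usual x86 page-fault convention (bit 0: page present, bit 1: write access, bit 2: user mode).

```python
import re

# Kernel message copied from the question (timestamp prefix omitted).
KERNEL_LINE = (
    "postgres: segfault at 7f7740354704 ip 00007f772c8ac700 "
    "sp 00007ffdac7ff4e0 error 4 in postgres[7f772c813000+5a1000]"
)

PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Extract the faulting offset and the page-fault error bits from a kernel segfault line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    err = int(match.group("err"), 16)
    return {
        "object": match.group("obj"),
        "offset": hex(ip - base),       # instruction offset inside the mapped object
        "page_present": bool(err & 1),  # False: the page was not mapped
        "write_access": bool(err & 2),  # False: the fault was on a read
        "user_mode": bool(err & 4),     # True: the fault happened in user space
    }


info = decode_segfault(KERNEL_LINE)
print(info)
```

With matching debug symbols installed, the resulting offset could be passed to a tool such as addr2line to look up the function name, assuming the binary on disk is the same build that crashed.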

                                

ybs_antel (11 rep)

Jan 13, 2021, 07:04 PM • Last activity: May 24, 2025, 06:05 PM

23 votes

5 answers

4387 views

Can SQL Server's crash resilience be improved?

sql-server crash

We have PCs running SQL Server (2008 SP4 and 2016 SP1) which regularly lose power. Obviously, this sometimes leads to (index) corruption of the SQL Server database, which we need to restore afterwards.

I am aware that SQL Server is not designed for such scenarios and the correct solution is to fix the cause of the power loss (more on that below, if you are curious). Nevertheless, **are there any tuning options in SQL Server that I can set to reduce the risk of database corruption on power loss**?

---

*Background: The "PC" is a Windows tablet mounted on a forklift. When the user turns off the forklift, the tablet loses power. We have tried to teach the users to properly shut down Windows before turning off the forklift, but failed (probably because just turning it off "works" most of the time). We are also currently investigating other options, such as adding a UPS which signals the tablet to shut down on power loss.*

Heinzi (3210 rep)

Jul 17, 2018, 01:03 PM • Last activity: May 15, 2025, 04:08 PM

0 votes

1 answers

1022 views

MySQL 8.0.15 starts normally but any connection hangs

mysql recovery crash

I dropped a large table (80G) Friday morning and got a crash:

InnoDB: ###### Diagnostic info printed to the standard error stream
2019-04-19T18:48:55.445507Z 0 [ERROR] [MY-012872] [InnoDB] Semaphore wait has lasted > 37617336 seconds. We intentionally crash the server because it appears to be hung.[FATAL] Semaphore wait has lasted > 600 seconds. We intentionally crash the server because it appears to be hung.
2019-04-19T18:48:55.445541Z 0 [ERROR] [MY-013183] [InnoDB] Assertion failure: ut0ut.cc:625 thread 140252955932416
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com .
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/8.0/en/forcing-innodb-recovery.html 
InnoDB: about forcing recovery.
18:48:55 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

...lots more...

The server came back up and the log indicated it was ready for connections:

...

    2019-04-22T12:53:51.792574Z 0 [System] [MY-010116] [Server] /usr/sbin/mysqld (mysqld 8.0.15) starting as process 26956
    2019-04-22T12:53:54.631860Z 0 [System] [MY-010229] [Server] Starting crash recovery...
    2019-04-22T12:53:54.639196Z 0 [System] [MY-010232] [Server] Crash recovery finished.
    2019-04-22T12:53:54.708512Z 0 [Warning] [MY-010068] [Server] CA certificate ca.pem is self signed.
    2019-04-22T12:53:54.754962Z 0 [System] [MY-010931] [Server] /usr/sbin/mysqld: ready for connections. Version: '8.0.15'  socket: '/data/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server - GPL.                                            

I tried connecting locally and remotely with mysql client, various programs, etc., but cannot complete the connection process - it just hangs. The connection protocol initiates, but then just sits there after negotiation (this is from monitoring with tcpdump):

    10:44:02.814237 IP 10.70.250.33.snip-slave > 10.70.250.50.mysql: Flags [S], seq 3534733756, win 14600, options [mss 1460,sackOK,TS val 2565739642 ecr 0,nop,wscale 7], length 0
    10:44:02.814273 IP 10.70.250.50.mysql > 10.70.250.33.snip-slave: Flags [S.], seq 4211652446, ack 3534733757, win 28960, options [mss 1460,sackOK,TS val 239668057 ecr 2565739642,nop,wscale 7], length 0

...

    10:44:03.042941 IP 10.70.250.50.mysql > 10.70.250.33.snip-slave: Flags [P.], seq 2784:2819, ack 858, win 252, options [nop,nop,TS val 239668286 ecr 2565739871], length 35
    10:44:03.082582 IP 10.70.250.33.snip-slave > 10.70.250.50.mysql: Flags [.], ack 2819, win 160, options [nop,nop,TS val 2565739911 ecr 239668286], length 0

The mysqld process has been running at nearly 100% CPU, even through the whole weekend, but nothing seems to change. I can't stop the server with systemd, as that just hangs too. I have killed the mysqld process several times today and started it back up, but again nothing seems to change. Most recently I did that to force InnoDB recovery, which shows in the log, but is no help since I still can't connect with _anything_.

Looking with iostat, it appears to be beating the daylights out of the general log (/dbdata/var/lib/mysql/data/mysql/general_log.CSV), which is quite large - over half a TB.

I don't really care about the table that was dropped, I just need the rest of the databases to be accessible, and I'm stuck.  Help, please!

                                

Bill (11 rep)

Apr 22, 2019, 02:00 PM • Last activity: Dec 28, 2024, 08:01 PM

1 votes

1 answers

53 views

Database crash just before appending the checkpoint entry to write ahead log

transaction-log transaction recovery write-ahead-logging crash

- From what I read about WAL, it's an append-only file to which all operations on the DB are written before they are actually applied to the data.
- There is also a concept of a "checkpoint" which is when the DB actually writes the data to disk from memory, and appends a special checkpoint entry at the end of the WAL.
- Now if the DB crashes at any point, it can read the WAL starting from the latest checkpoint entry and redo all the subsequent operations.
- But how does the DB ensure that the checkpoint WAL entry and the actual flushing of the data to disk happen in a transactional way?
- What if the data is flushed but the DB crashes before the checkpoint entry is made in the WAL?
- Conversely, if the WAL is modified first, then what happens if the DB crashes after the checkpoint entry but before the data is actually flushed?

For example, consider the following case:

- We have a dummy table Person(name, age, salary).
- It has an entry John, 25, 100.
- At time T1, a new transaction arrives UPDATE Person SET salary += 100 WHERE name='John'.
- Assume that before T1, all the data had been flushed and the checkpoint entry had been appended to WAL.
- Now after this transaction, the DB will first append the exact transaction statement UPDATE Person SET salary += 100 WHERE name='John' to the log.
- Now the data becomes John, 25, 200.
- Then after some time, let's say the DB decides to flush the data to disk at time T2.
- Then at time T3 (just after T2), the DB attempts to write the checkpoint entry to the WAL.
- However, a power failure occurs between T2 and T3, before the checkpoint entry is written.
- Now when the DB restarts and tries to recover, it will notice that there is one transaction after the latest checkpoint and will try to execute that: UPDATE Person SET salary += 100 WHERE name='John'
- But since the transaction was already executed before the crash, this time the salary will take the value 300, although it should have been 200.

How does the DB prevent these redundant updates during the recovery?
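
The scenario above can be simulated in a few lines. The first replay function re-executes the logged statement blindly and reproduces the double update. The second shows one widely used guard, which is an assumption not stated in the question: each page stores the log sequence number (LSN) of the last change applied to it, and redo skips any record the page already reflects. The `Page` and `LogRecord` classes and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical on-disk page: John's salary plus the LSN of the last applied change."""
    salary: int
    page_lsn: int


@dataclass
class LogRecord:
    lsn: int
    delta: int        # logical form of the change: salary += delta
    after_image: int  # physical form of the change: the resulting salary


def replay_logical(page, log):
    """Naive redo: re-execute every logged change regardless of the page state."""
    for record in log:
        page.salary += record.delta
    return page.salary


def replay_with_lsn(page, log):
    """Guarded redo: apply a record only if the page has not seen it yet."""
    for record in log:
        if record.lsn > page.page_lsn:
            page.salary = record.after_image
            page.page_lsn = record.lsn
    return page.salary


log = [LogRecord(lsn=1, delta=100, after_image=200)]

# Crash between T2 and T3: the page was flushed, the checkpoint entry was not written.
naive_result = replay_logical(Page(salary=200, page_lsn=1), log)
guarded_result = replay_with_lsn(Page(salary=200, page_lsn=1), log)

# Crash before T2: the page on disk still holds the old value.
unflushed_result = replay_with_lsn(Page(salary=100, page_lsn=0), log)

print(naive_result, guarded_result, unflushed_result)
```

With the LSN check, replaying the same record is harmless whether or not the page reached disk before the crash, so the exact position of the checkpoint entry affects only how much log is scanned, not the final value.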
                                

Anmol Singh Jaggi (113 rep)

Oct 2, 2024, 05:25 AM • Last activity: Oct 2, 2024, 11:38 AM

3 votes

3 answers

1231 views

MariaDB Crashes Every Time Restarted

mariadb import corruption crash

- OSX Sonoma, OSX Monterey
- Latest MariaDB from Homebrew (11.4.2-MariaDB, client 15.2 for osx10.17 (x86_64))

MariaDB fails to start whenever I restart the brew service or restart the computer. I experienced it on Sonoma (with OpenCore Legacy Patcher) and on a fresh native Monterey install.

240611 01:27:33 mysqld_safe mysqld from pid file /usr/local/var/mysql/penyo.local.pid ended
240611 01:27:43 mysqld_safe Starting mariadbd daemon with databases from /usr/local/var/mysql
2024-06-11  1:27:43 0 [Note] Starting MariaDB 11.4.2-MariaDB source revision 3fca5ed772fb75e3e57c507edef2985f8eba5b12 as process 6983
2024-06-11  1:27:43 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
2024-06-11  1:27:43 0 [Note] InnoDB: Number of transaction pools: 1
2024-06-11  1:27:43 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
2024-06-11  1:27:43 0 [Note] InnoDB: Initializing buffer pool, total size = 2.000GiB, chunk size = 32.000MiB
2024-06-11  1:27:43 0 [Note] InnoDB: Completed initialization of buffer pool
2024-06-11  1:27:43 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(1170899079) at 1170899079
2024-06-11  1:27:43 0 [ERROR] InnoDB: Log scan aborted at LSN 1170899079
2024-06-11  1:27:43 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2024-06-11  1:27:43 0 [Note] InnoDB: Starting shutdown...
2024-06-11  1:27:43 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2024-06-11  1:27:43 0 [Note] Plugin 'FEEDBACK' is disabled.
2024-06-11  1:27:43 0 [Note] Plugin 'wsrep-provider' is disabled.
2024-06-11  1:27:43 0 [ERROR] Unknown/unsupported storage engine: InnoDB
2024-06-11  1:27:43 0 [ERROR] Aborting
240611 01:27:43 mysqld_safe mysqld from pid file /usr/local/var/mysql/penyo.local.pid ended

my.cnf

[client-server]
!includedir /usr/local/etc/my.cnf.d

[mysqld]

bind-address = 127.0.0.1
# innodb_force_recovery = 6

[mariadb]
innodb_buffer_pool_size = 2G
innodb_ft_cache_size = 512000000

Steps to reproduce: install MariaDB 11.4.2, start brew services, configure with sudo mariadb-secure-installation, import the database.

All databases are WordPress databases from cPanel hosting. They all pass a check with mariadb-check. The database is also currently installed in cPanel, where there are no problems at all. I have tried importing the database via mysql -u root -p all_db < all_db.sql as well as via phpMyAdmin. Everything was fine until I restarted the service or the computer. As a precaution, I always back up the /usr/local/var/mysql folder before importing data.

After running brew services restart mariadb, MariaDB won't start. The only ways to recover are to clean up all MariaDB files and reinstall, or to replace ib_logfile0 with a fresh backup, after which MariaDB runs normally but logs lots of errors:

2024-06-11 14:00:21 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/  for information about forcing recovery.
2024-06-11 14:00:21 0 [Note] InnoDB: Buffer pool(s) load completed at 240611 14:00:21
2024-06-11 14:00:32 16 [ERROR] InnoDB: Page [page id: space=150, page number=2] log sequence number 1496320315 is in the future! Current system log sequence number 48016.
2024-06-11 14:00:32 16 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/  for information about forcing recovery.
2024-06-11 14:00:32 16 [ERROR] InnoDB: Page [page id: space=151, page number=2] log sequence number 528233615 is in the future! Current system log sequence number 48016.
2024-06-11 14:00:32 16 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/  for information about forcing recovery.
2024-06-11 14:00:32 15 [ERROR] InnoDB: Page [page id: space=154, page number=4] log sequence number 528270232 is in the future! Current system log sequence number 48016.
2024-06-11 14:00:32 15 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/  for information about forcing recovery.
2024-06-11 14:02:17 43 [ERROR] InnoDB: Page [page id: space=1, page number=2] log sequence number 1542367351 is in the future! Current system log sequence number 48016.
2024-06-11 14:02:17 43 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/  for information about forcing recovery.

Before restart:

❯ brew services list
Name    Status    User  File
httpd   started   penyo ~/Library/LaunchAgents/homebrew.mxcl.httpd.plist
mariadb started   penyo ~/Library/LaunchAgents/homebrew.mxcl.mariadb.plist

After restart:

❯ brew services list
Name    Status    User  File
httpd   started   penyo ~/Library/LaunchAgents/homebrew.mxcl.httpd.plist
mariadb error 256 penyo ~/Library/LaunchAgents/homebrew.mxcl.mariadb.plist

I ignored the error, tried dropping the database and importing again, then restarted the computer. MariaDB still won't start, and I have to replace ib_logfile0 again, over and over. I have also tried exporting the data via mariadb-dump while MariaDB was still running normally before the restart, then importing that dump, but it still doesn't work. I have also tried this in recovery mode 6. I've been searching for 3 days but couldn't find anything. Any suggestions?
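
To see which tablespaces the "is in the future" errors point at, the log lines can be grouped by tablespace id. This is a minimal sketch that assumes the exact message wording shown in the log excerpt above; the function name `summarize_future_lsn` is made up for illustration, and other MariaDB versions may phrase the message differently.

```python
import re
from collections import defaultdict

# Pattern for the message format seen in the excerpt (an assumption: the
# wording may differ between MariaDB versions).
FUTURE_LSN = re.compile(
    r"Page \[page id: space=(?P<space>\d+), page number=(?P<page>\d+)\] "
    r"log sequence number (?P<page_lsn>\d+) is in the future! "
    r"Current system log sequence number (?P<system_lsn>\d+)"
)


def summarize_future_lsn(lines):
    """Group pages reported as 'in the future' by tablespace id."""
    by_space = defaultdict(list)
    for line in lines:
        m = FUTURE_LSN.search(line)
        if m:
            by_space[int(m["space"])].append(
                (int(m["page"]), int(m["page_lsn"]), int(m["system_lsn"]))
            )
    return dict(by_space)


sample = [
    "2024-06-11 14:00:32 16 [ERROR] InnoDB: Page [page id: space=150, page number=2] "
    "log sequence number 1496320315 is in the future! "
    "Current system log sequence number 48016.",
    "2024-06-11 14:00:32 15 [ERROR] InnoDB: Page [page id: space=154, page number=4] "
    "log sequence number 528270232 is in the future! "
    "Current system log sequence number 48016.",
]
for space, pages in summarize_future_lsn(sample).items():
    print(space, pages)
```

Every page LSN in the excerpt is far above the system LSN of 48016, which is consistent with the log's own hint that the tablespace files and the redo log file no longer belong together.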

Penyo (31 rep)

Jun 11, 2024, 09:53 AM • Last activity: Jun 26, 2024, 11:56 PM

0 votes

0 answers

115 views

MariaDB Crash Rolled back recovered transaction

mariadb crash


For months my MariaDB ran fine, but for the last 2-3 weeks it has been crashing randomly 2-3 times a week.

I have looked through the log files but I'm stuck for now. This is our main production server, so I'm frustrated at the moment. Maybe someone could help out.

Server info: 
 - Ubuntu 22.04 LTS
 - MariaDB 10.6.16-MariaDB-0ubuntu0.22.04.1
 - Virtualmin 7.9

log:

    2024-02-16  7:25:04 0 [Note] Starting MariaDB 10.6.16-MariaDB-0ubuntu0.22.04.1 source revision  as process 4014
    
    2024-02-16  7:25:03 0 [Note] /usr/sbin/mariadbd: Shutdown complete
    2024-02-16  7:25:03 0 [Note] InnoDB: Shutdown completed; log sequence number 19930962113; transaction id 40459499
    2024-02-16  7:25:03 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
    2024-02-16  7:25:00 0 [Note] InnoDB: Buffer pool(s) dump completed at 240216  7:25:00
    2024-02-16  7:25:00 0 [Note] InnoDB: Dumping buffer pool(s) to /var/lib/mysql/ib_buffer_pool
    2024-02-16  7:25:00 0 [Note] InnoDB: Starting shutdown...
    2024-02-16  7:25:00 0 [Note] InnoDB: FTS optimize thread exiting.
    2024-02-16  7:25:00 0 [Note] /usr/sbin/mariadbd (initiated by: unknown): Normal shutdown
    2024-02-16  8:13:06 0 [Note] InnoDB: Buffer pool(s) load completed at 240216  8:13:06
    2024-02-16  8:13:02 0 [Note] InnoDB: Rollback of non-prepared transactions completed
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457346
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457459
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457382
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457350
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457230
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457326
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457183
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457368
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457159
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457556
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457340
    2024-02-16  8:13:02 7 [Warning] Access denied for user 'root'@'localhost' (using password: NO)
    2024-02-16  8:13:02 6 [Warning] Access denied for user 'root'@'localhost' (using password: NO)
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457190
    Version: '10.6.16-MariaDB-0ubuntu0.22.04.1'  socket: '/run/mysqld/mysqld.sock'  port: 3306  Ubuntu 22.04
    2024-02-16  8:13:02 0 [Note] /usr/sbin/mariadbd: ready for connections.
    2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457236
    2024-02-16  8:13:01 0 [Note] Server socket created on IP: '::'.
    2024-02-16  8:13:01 0 [Note] Server socket created on IP: '0.0.0.0'.
    2024-02-16  8:13:01 0 [Warning] You need to use --log-bin to make --expire-logs-days or --binlog-expire-logs-seconds work.
    2024-02-16  8:13:01 0 [Note] Plugin 'FEEDBACK' is disabled.
    2024-02-16  8:13:01 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
    2024-02-16  8:13:01 0 [Note] InnoDB: 10.6.16 started; log sequence number 19927489307; transaction id 40457569
    2024-02-16  8:13:01 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
    2024-02-16  8:13:01 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
    2024-02-16  8:13:01 0 [Note] InnoDB: Creating shared tablespace for temporary tables
    2024-02-16  8:13:01 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
    2024-02-16  8:13:01 0 [Note] InnoDB: Starting in background the rollback of recovered transactions
    2024-02-16  8:13:01 0 [Note] InnoDB: 128 rollback segments are active.
    2024-02-16  8:12:50 0 [Note] InnoDB: To recover: 9912 pages
    2024-02-16  8:12:50 0 [Note] InnoDB: Trx id counter is 40457568
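
The log excerpt above reads newest-first, which makes the recovery sequence hard to follow. A minimal sketch for putting such lines back in chronological order, assuming every relevant line starts with a `YYYY-MM-DD H:MM:SS` stamp as in the excerpt; the helper names `timestamp` and `chronological` are hypothetical, and lines without a leading timestamp (such as the `Version:` line) are dropped by this sketch.

```python
from datetime import datetime


def timestamp(line):
    """Parse the leading 'YYYY-MM-DD H:MM:SS' stamp; return None if there is none."""
    parts = line.split()
    if len(parts) < 2:
        return None
    try:
        return datetime.strptime(" ".join(parts[:2]), "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None


def chronological(lines):
    """Return timestamped lines oldest-first.

    The input is assumed to be newest-first, so it is reversed before the
    stable sort; lines that share the same second then come out in their
    original written order instead of staying flipped.
    """
    stamped = [ln for ln in reversed(lines) if timestamp(ln) is not None]
    return sorted(stamped, key=timestamp)


sample = [
    "2024-02-16  8:13:02 0 [Note] InnoDB: Rolled back recovered transaction 40457346",
    "2024-02-16  8:13:01 0 [Note] InnoDB: Starting in background the rollback of recovered transactions",
    "2024-02-16  8:13:01 0 [Note] InnoDB: 128 rollback segments are active.",
    "2024-02-16  8:12:50 0 [Note] InnoDB: To recover: 9912 pages",
]
for line in chronological(sample):
    print(line)
```

Read in this order, the excerpt shows crash recovery starting at 8:12:50, the rollback of recovered transactions beginning at 8:13:01, and the server accepting connections at 8:13:02.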


and
    

    MariaDB [(none)]> SHOW VARIABLES LIKE 'binlog_expire%';
    +----------------------------+--------+
    | Variable_name              | Value  |
    +----------------------------+--------+
    | binlog_expire_logs_seconds | 864000 |
    +----------------------------+--------+


                                

Stefx (9 rep)

Feb 16, 2024, 02:58 PM • Last activity: Feb 19, 2024, 07:51 PM

1 votes

1 answers

77 views

Mariadb semi-freezing while writing gigabytes per second according to iotop

performance mariadb crash hang


Recently, I had a MariaDB crash that started a nightmare.
When I restored dumps, the daemon sometimes hung during bulk inserts. This happened only on the few "big" InnoDB tables (I have many small dbs that really don't contain much data, and most others have a significant amount only in a few tables).
The db usually recovered eventually (sometimes after hours), but sometimes it crashed.
I restored the dbs on a different machine without a hiccup in a matter of minutes, then moved the data directory to the production machine. The only serious difference between the two is that the successful machine isn't a VM and sports SSD storage.

Since then, the problems described above still happen at random times, always when accessing those large tables. Generally several queries are frozen, but sometimes a single query is enough to trigger the behaviour.

What I find actually hard to believe is that iotop reports a mariadb thread writing **gigabytes per second**, steadily, for the whole time the query hangs. This is physically impossible, since the disks cannot come anywhere near those figures. I also scoured the internet for advice on optimizing settings that, until a few days ago, seemed perfectly fine for my environment. I tweaked a few variables, but nothing major; the defaults on the "import successful" machine weren't much different to begin with.

The issue is similar to https://jira.mariadb.org/browse/MDEV-30884, however that has supposedly been fixed in the version I'm using, 10.6.14 (I also tested 10.6.15), on Gentoo.

I'm at a loss, what could be causing this? Can the mechanical HDD storage somehow justify what is happening?
                                

meh (33 rep)

Dec 19, 2023, 10:41 PM • Last activity: Feb 1, 2024, 09:57 AM

1 votes

2 answers

271 views

Queries causing MariaDB to get signal 7

mariadb crash

I have a Red Hat box running MariaDB 10.3.39-MariaDB-log. Some queries seem to cause MariaDB to get a signal 7; MariaDB then crashes, marks a bunch of tables as crashed, and restarts.

I opened a ticket with Red Hat, thinking that maybe something was wrong with our server, but we came to the same conclusion: a few queries seem to be causing the identical issue. I thought it might be a space issue, as that mountpoint was running low, so I gave it another 10TB just to be safe, and still the issue persists.

So that brings me here, desperate for some sort of clue.

1. I don't know how to stop these queries, and they run every 30 minutes.
2. I don't know how to see where these queries originate, so I can't figure out whether they come from a common source.

I would highly appreciate it if someone could kindly point me in the right direction.

default-storage-engine = MYISAM
myisam_use_mmap=1
table_open_cache = 2048
open_files_limit = 6144
thread_concurrency=32
#key_buffer_size = 4096M
key_buffer_size = 21474836480
##myisam_sort_buffer_size = 32M
myisam_sort_buffer_size = 64M
##query_cache_size= 32M
query_cache_size= 0k
read_buffer_size = 32M
sort_buffer_size = 32M
max_allowed_packet = 256M
slow-query-log = ON
long_query_time = .2
read_rnd_buffer_size = 64K
tmp_table_size = 128M
core-file

Jan 29 22:35:45  mysqld: 240129 22:35:45 [ERROR] mysqld got signal 7 ; 
Jan 29 22:35:45  mysqld: This could be because you hit a bug. It is also possible that this binary
Jan 29 22:35:45  mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Jan 29 22:35:45  mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Jan 29 22:35:45  mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs 
Jan 29 22:35:45  mysqld: We will try our best to scrape up some info that will hopefully help
Jan 29 22:35:45  mysqld: diagnose the problem, but since we have already crashed,
Jan 29 22:35:45  mysqld: something is definitely wrong and this may fail.
Jan 29 22:35:45  mysqld: Server version: 10.3.39-MariaDB-log source revision: ca001cf2048f0152689e1895e2dc15486dd0b1af
Jan 29 22:35:45  mysqld: key_buffer_size=21474836480
Jan 29 22:35:45  mysqld: read_buffer_size=33554432
Jan 29 22:35:45  mysqld: max_used_connections=12
Jan 29 22:35:45  mysqld: max_threads=153
Jan 29 22:35:45  mysqld: thread_count=17
Jan 29 22:35:45  mysqld: It is possible that mysqld could use up to
Jan 29 22:35:45  mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 31001963 K  bytes of memory
Jan 29 22:35:45  mysqld: Hope that's ok; if not, decrease some variables in the equation.
Jan 29 22:35:45  mysqld: Thread pointer: 0x7fb5b8000c48
Jan 29 22:35:45  mysqld: Attempting backtrace. You can use the following information to find out
Jan 29 22:35:45  mysqld: where mysqld died. If you see no messages after this, something went
Jan 29 22:35:45  mysqld: terribly wrong...
Jan 29 22:35:45  mysqld: stack_bottom = 0x7fb5bca02c48 thread_stack 0x49000
Jan 29 22:35:45  mysqld: /usr/libexec/mysqld(my_print_stacktrace+0x41)[0x55b55d773eb1]
Jan 29 22:35:45  mysqld: /usr/libexec/mysqld(handle_fatal_signal+0x4f5)[0x55b55d299c05]
Jan 29 22:35:45  mysqld: /lib64/libpthread.so.0(+0x12cf0)[0x7fbda6861cf0]
Jan 29 22:35:46  mysqld: :0(__memmove_avx_unaligned_erms)[0x7fbda6558e73]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(+0xc00475)[0x55b55d6de475]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(+0xc030e9)[0x55b55d6e10e9]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(+0xc2309a)[0x55b55d70109a]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN7handler17ha_index_read_mapEPhPKhm16ha_rkey_function+0x148)[0x55b55d29f448]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN7handler16read_range_firstEPK12st_key_rangeS2_bb+0x66)[0x55b55d2a38c6]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN7handler21multi_range_read_nextEPPv+0xbf)[0x55b55d1c5d6f]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN23Mrr_simple_index_reader8get_nextEPPv+0x52)[0x55b55d1c5df2]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN10DsMrr_impl10dsmrr_nextEPPv+0x4a)[0x55b55d1c722a]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN18QUICK_RANGE_SELECT8get_nextEv+0x3c)[0x55b55d3987dc]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(+0x8d581d)[0x55b55d3b381d]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x18e)[0x55b55d0ed90e]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN4JOIN10exec_innerEv+0xa9a)[0x55b55d10fc2a]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_ZN4JOIN4execEv+0x37)[0x55b55d10fed7]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xff)[0x55b55d11002f]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x165)[0x55b55d110955]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(+0x5d035c)[0x55b55d0ae35c]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z21mysql_execute_commandP3THD+0x5373)[0x55b55d0baeb3]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x215)[0x55b55d0bd835]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x1304)[0x55b55d0bf8f4]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z10do_commandP3THD+0x126)[0x55b55d0c0f26]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x252)[0x55b55d19a212]
Jan 29 22:35:46  mysqld: /usr/libexec/mysqld(handle_one_connection+0x41)[0x55b55d19a3b1]
Jan 29 22:35:46  mysqld: /lib64/libpthread.so.0(+0x81ca)[0x7fbda68571ca]
Jan 29 22:35:46  mysqld: :0(__GI___clone)[0x7fbda64c3e73]
Jan 29 22:35:46  mysqld: Trying to get some variables.
Jan 29 22:35:46  mysqld: Some pointers may be invalid and cause the dump to abort.
Jan 29 22:35:46  mysqld: Query (0x7fb5b800f680): SELECT st, et, sr, datatype, tracebuf FROM WEBT$VEP$AV$$2022_04_06 WHERE st>=7.025538E8 AND st<=7.025544E8 ORDER BY st ASC
Jan 29 22:35:46  mysqld: Connection ID (thread ID): 29
Jan 29 22:35:46  mysqld: Status: NOT_KILLED
Jan 29 22:35:46  mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
Jan 29 22:35:46  mysqld: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/  contains
Jan 29 22:35:46  mysqld: information that should help you find out what is causing the crash.
Jan 29 22:35:46  mysqld: Writing a core file...
Jan 29 22:35:46  mysqld: Working directory at /mnt/winston
Jan 29 22:35:46  mysqld: Resource Limits:
Jan 29 22:35:46  mysqld: Limit                     Soft Limit           Hard Limit           Units
Jan 29 22:35:46  mysqld: Max cpu time              unlimited            unlimited            seconds
Jan 29 22:35:46  mysqld: Max file size             unlimited            unlimited            bytes
Jan 29 22:35:46  mysqld: Max data size             unlimited            unlimited            bytes
Jan 29 22:35:46  mysqld: Max stack size            8388608              unlimited            bytes
Jan 29 22:35:46  mysqld: Max core file size        unlimited            unlimited            bytes
Jan 29 22:35:46  mysqld: Max resident set          unlimited            unlimited            bytes
Jan 29 22:35:46  mysqld: Max processes             319785               319785               processes
Jan 29 22:35:46  mysqld: Max open files            1048576              1048576              files
Jan 29 22:35:46  mysqld: Max locked memory         65536                65536                bytes
Jan 29 22:35:46  mysqld: Max address space         unlimited            unlimited            bytes
Jan 29 22:35:46  mysqld: Max file locks            unlimited            unlimited            locks
Jan 29 22:35:46  mysqld: Max pending signals       319785               319785               signals
Jan 29 22:35:46  mysqld: Max msgqueue size         819200               819200               bytes
Jan 29 22:35:46  mysqld: Max nice priority         0                    0   
Jan 29 22:35:46  mysqld: Max realtime priority     0                    0   
Jan 29 22:35:46  mysqld: Max realtime timeout      unlimited            unlimited            us  
Jan 29 22:35:46  mysqld: Core pattern: /tmp/corefiles/core
Jan 29 22:35:46  mysqld: Kernel version: Linux version 4.18.0-477.27.1.el8_8.x86_64 (mockbuild@x86-64-02.build.eng.rdu2.redhat.com) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-18) (GCC)) #1 SMP Thu Aug 31 10:29:22 EDT 2023
Jan 29 22:35:47  systemd: mariadb.service: Main process exited, code=killed, status=7/BUS
Jan 29 22:35:47  systemd: mariadb.service: Failed with result 'signal'.
Jan 29 22:35:52  systemd: mariadb.service: Service RestartSec=5s expired, scheduling restart.
Jan 29 22:35:52  systemd: mariadb.service: Scheduled restart job, restart counter is at 608.
Jan 29 22:35:52  systemd: Stopped MariaDB 10.3 database server.
Jan 29 22:35:52  systemd: Starting MariaDB 10.3 database server...
Jan 29 22:35:52  mysql-check-socket: Socket file /var/lib/mysql/mysql.sock exists.
Jan 29 22:35:52  mysql-check-socket: No process is using /var/lib/mysql/mysql.sock, which means it is a garbage, so it will be removed automatically.
Jan 29 22:35:52  mysql-prepare-db-dir: Database MariaDB is probably initialized in /mnt/winston already, nothing is done.
Jan 29 22:35:52  mysql-prepare-db-dir: If this is not the case, make sure the /mnt/winston is empty before running mysql-prepare-db-dir.
Jan 29 22:35:52  mysqld: 2024-01-29 22:35:52 0 [Warning] 'THREAD_CONCURRENCY' is deprecated and will be removed in a future release.
Jan 29 22:35:52  mysqld: 2024-01-29 22:35:52 0 [Note] Starting MariaDB 10.3.39-MariaDB-log source revision ca001cf2048f0152689e1895e2dc15486dd0b1af as process 1021768
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Using Linux native AIO
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Uses event mutexes
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Number of pools: 1
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Using SSE2 crc32 instructions
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Completed initialization of buffer pool
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority(). 
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: The log sequence number 1650206 in the system tablespace does not match the log sequence number 1654796 in the ib_logfiles! 
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: 128 out of 128 rollback segments are active.
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Creating shared tablespace for temporary tables
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Waiting for purge to start
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: 10.3.39 started; log sequence number 1654796; transaction id 38
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Loading buffer pool(s) from /mnt/winston/ib_buffer_pool
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] InnoDB: Buffer pool(s) load completed at 240129 22:35:55
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Plugin 'FEEDBACK' is disabled.
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Recovering after a crash using tc.log
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Starting crash recovery...
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Crash recovery finished.
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Server socket created on IP: '::'.
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Reading of all Master_info entries succeeded
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] Added new Master_info '' to hash table
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 0 [Note] /usr/libexec/mysqld: ready for connections.
Jan 29 22:35:55  mysqld: Version: '10.3.39-MariaDB-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MariaDB Server
Jan 29 22:35:55  systemd: Started MariaDB 10.3 database server.
Jan 29 22:35:55  mysqld: 2024-01-29 22:35:55 8 [ERROR] mysqld: Table './WWS_HAG@0024SHE@0024AV/HAG@0024SHE@0024AV@0024@00242024_01_29' is marked as crashed and should be repaired

And then it goes on and on like this for a good long while. I get it if I get no answers, but if there's something there that is blatantly obvious, please share or point me in the right direction.

Anthony Benavidez (13 rep)

Jan 30, 2024, 12:07 AM • Last activity: Jan 31, 2024, 10:04 PM

2 votes

2 answers

731 views

CPU High usage crashing our server

mysql optimization crash cpu


                                  The server where my database is running is suffering from CPU spikes.  We're having trouble identifying what is causing these CPU spikes, and consequently how to mitigate them.

I've tried adding some indexes, but maybe I forgot one or two.

How do I check which table has any problems?

Once a day I have a huge CPU jump to 700%.  We've been resolving it to date by restarting the server.

I can provide the necessary information to find the problem, but I don't know what info is needed. 

This is the MySQLTuner report:

     >>  MySQLTuner 1.6.18 - Major Hayden 
     >>  Bug reports, feature requests, and downloads at http://mysqltuner.com/ 
     >>  Run with '--help' for additional options and output filtering
    
    [--] Skipped version check for MySQLTuner script
    [OK] Logged in using credentials from debian maintenance account.
    [OK] Currently running supported MySQL version 10.1.47-MariaDB-0+deb9u1
    [OK] Operating on 64-bit architecture
     
    -------- Storage Engine Statistics -----------------------------------------------------------------
    [--] Status: +Aria +CSV +InnoDB +MEMORY +MRG_MyISAM +MyISAM +PERFORMANCE_SCHEMA +SEQUENCE
    [--] Data in InnoDB tables: 1G (Tables: 387)
    [--] Data in MyISAM tables: 1K (Tables: 1)
    [OK] Total fragmented tables: 0
     
    -------- Security Recommendations ------------------------------------------------------------------
    [OK] There are no anonymous accounts for any database users
    [OK] All database users have passwords assigned
    [!!] User 'kacper@%' hasn't specific host restriction.
    [--] There are 612 basic passwords in the list.
     
    -------- CVE Security Recommendations --------------------------------------------------------------
    [OK] NO SECURITY CVE FOUND FOR YOUR VERSION
     
    -------- Performance Metrics -----------------------------------------------------------------------
    [--] Up for: 2h 17m 38s (1M q [236.530 qps], 53K conn, TX: 1G, RX: 285M)
    [--] Reads / Writes: 71% / 29%
    [--] Binary logging is disabled
    [--] Physical Memory     : 62.8G
    [--] Max MySQL memory    : 12.6G
    [--] Other process memory: 209.7M
    [--] Total buffers: 328.0M global + 2.8M per thread (4096 max threads)
    [--] P_S Max memory usage: 1G
    [--] Galera GCache Max memory usage: 0B
    [OK] Maximum reached memory usage: 7.1G (11.38% of installed RAM)
    [OK] Maximum possible memory usage: 12.6G (20.10% of installed RAM)
    [OK] Overall possible memory usage with other process is compatible with memory available
    [OK] Slow queries: 0% (0/1M)
    [OK] Highest usage of available connections: 51% (2102/4096)
    [OK] Aborted connections: 0.02%  (13/53000)
    [!!] name resolution is active : a reverse name resolution is made for each new connection and can reduce performance
    [OK] Query cache efficiency: 30.4% (505K cached / 1M selects)
    [!!] Query cache prunes per day: 511609
    [OK] Sorts requiring temporary tables: 5% (470 temp sorts / 9K sorts)
    [!!] Joins performed without indexes: 48
    [!!] Temporary tables created on disk: 76% (39K on disk / 51K total)
    [OK] Thread cache hit rate: 89% (5K created / 53K connections)
    [OK] Table cache hit rate: 81% (697 open / 860 opened)
    [OK] Open file limit used: 0% (61/16K)
    [OK] Table locks acquired immediately: 100% (998K immediate / 998K locks)
     
    -------- Performance schema ------------------------------------------------------------------------
    [--] Performance schema is enabled.
    [--] Memory used by P_S: 1.1G
    [--] Sys schema isn't installed.
     
    -------- ThreadPool Metrics ------------------------------------------------------------------------
    [--] ThreadPool stat is enabled.
    [--] Thread Pool Size: 8 thread(s).
    [--] Using default value is good enough for your version (10.1.47-MariaDB-0+deb9u1)
     
    -------- MyISAM Metrics ----------------------------------------------------------------------------
    [!!] Key buffer used: 18.3% (3M used / 16M cache)
    [OK] Key buffer size / total MyISAM indexes: 16.0M/124.0K
    [OK] Read Key buffer hit rate: 97.5% (162 cached / 4 reads)
     
    -------- AriaDB Metrics ----------------------------------------------------------------------------
    [--] AriaDB is enabled.
    [OK] Aria pagecache size / total Aria indexes: 128.0M/1B
    [!!] Aria pagecache hit rate: 83.9% (241K cached / 38K reads)
     
    -------- InnoDB Metrics ----------------------------------------------------------------------------
    [--] InnoDB is enabled.
    [!!] InnoDB buffer pool / data size: 128.0M/1.9G
    [!!] InnoDB buffer pool  16M)
        join_buffer_size (> 256.0K, or always use indexes with joins)
        tmp_table_size (> 16M)
        max_heap_table_size (> 16M)
        innodb_buffer_pool_size (>= 1G) if possible.
        innodb_buffer_pool_instances (=1)
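
The memory lines in the report above can be cross-checked by hand. The following is only a rough sketch, assuming the "maximum possible" figure is global buffers plus per-thread buffers times the maximum thread count plus Performance Schema memory. The function and parameter names are illustrative and are not part of MySQLTuner.

```python
def max_possible_memory_gib(global_mib, per_thread_mib, max_threads, pfs_gib=0.0):
    """Estimate peak memory in GiB: global + per-thread * threads + P_S."""
    total_mib = global_mib + per_thread_mib * max_threads
    return total_mib / 1024 + pfs_gib


# Values copied from the report above (already rounded by the tool).
estimate = max_possible_memory_gib(328.0, 2.8, 4096, pfs_gib=1.1)
print(f"{estimate:.1f}G")  # prints 12.6G
```

Under that assumption the result agrees with the reported 12.6G, and almost all of it comes from the per-thread term, which is driven by the 4096 max threads setting.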
                                

Kacper Kleszczyński (21 rep)

Dec 2, 2020, 08:47 PM • Last activity: Oct 4, 2023, 10:02 AM

0 votes

1 answers

775 views

MySQL 5.7.20 crashes after jemalloc installation

mysql crash

Recently I faced a memory leak issue in one of my MySQL instances (5.7.20): even though the allocated buffer pool size was 50% of the RAM, mysqld memory utilization was constantly pegged at 90%.
I found a similar bug, https://bugs.mysql.com/bug.php?id=83047, and in my case, too, bulk load is the predominant workload.
So I installed jemalloc and changed /etc/sysconfig/mysql so that mysqld uses jemalloc instead of malloc().

My memory leak issue is fixed now. But since this change I have noticed that MySQL crashes often, and I cannot work out from the error log what exactly is causing the crash.

> 01:08:08 UTC - mysqld got signal 11 ; This could be because you hit a
> bug. It is also possible that this binary or one of the libraries it
> was linked against is corrupt, improperly built, or misconfigured.
> This error can also be caused by malfunctioning hardware. Attempting
> to collect some information that could help diagnose the problem. As
> this is a crash and something is definitely wrong, the information
> collection process might fail.
> 
> key_buffer_size=8388608
> read_buffer_size=131072
> max_used_connections=18
> max_threads=151
> thread_count=16
> connection_count=16
> It is possible that mysqld could use up to
> key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338785 K  bytes of memory
> Hope that's ok; if not, decrease some variables in the equation.
> 
> Thread pointer: 0x7f6f6bc16000
> Attempting backtrace. You can use the following information to find out
> where mysqld died. If you see no messages after this, something went
> terribly wrong...
> stack_bottom = 7f86e055ce30 thread_stack 0x40000
> /usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xef8feb]
> /usr/sbin/mysqld(handle_fatal_signal+0x461)[0x7b0191]
> /lib64/libpthread.so.0(+0xf5e0)[0x7f86fee6b5e0]
> /usr/sbin/mysqld(_Z32innobase_parse_hint_from_commentP3THDP12dict_table_tPK11TABLE_SHARE+0x2d0)[0xf2ed50]
> /usr/sbin/mysqld(_ZN19create_table_info_t24create_table_update_dictEv+0x119)[0xf41c09]
> /usr/sbin/mysqld(_ZN11ha_innobase6createEPKcP5TABLEP24st_ha_create_information+0x127)[0xf436b7]
> /usr/sbin/mysqld(_ZN11ha_innopart20create_new_partitionEP5TABLEP24st_ha_create_informationPKcjP17partition_element+0xcd)[0xf53aad]
> /usr/sbin/mysqld(_ZN16Partition_helper17change_partitionsEP24st_ha_create_informationPKcPyS4_+0x489)[0xc255d9]
> /usr/sbin/mysqld[0xcccdee]
> /usr/sbin/mysqld(_Z26fast_alter_partition_tableP3THDP5TABLEP10Alter_infoP24st_ha_create_informationP10TABLE_LISTPcPKcP14partition_info+0x52c)[0xcd78cc]
> /usr/sbin/mysqld(_Z17mysql_alter_tableP3THDPKcS2_P24st_ha_create_informationP10TABLE_LISTP10Alter_info+0xd43)[0xd309e3]
> /usr/sbin/mysqld(_ZN19Sql_cmd_alter_table7executeEP3THD+0x4f8)[0xe2e648]
> /usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x5d0)[0xcc35d0]
> /usr/sbin/mysqld(_ZN18Prepared_statement7executeEP6Stringb+0x357)[0xcf1397]
> /usr/sbin/mysqld(_ZN18Prepared_statement12execute_loopEP6StringbPhS2_+0xda)[0xcf43ca]
> /usr/sbin/mysqld(_Z22mysql_sql_stmt_executeP3THD+0xfc)[0xcf48ac]
> /usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x198d)[0xcc498d]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
> /usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
> /usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
> /usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
> /usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
> /usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
> /usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
> /usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
> /usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
> /usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
> /usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
> /usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
> /usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
> /usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
> /usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
> /usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt9exec_coreEP3THDPj+0x50)[0xc4a1b0]
> /usr/sbin/mysqld(_ZN12sp_lex_instr23reset_lex_and_exec_coreEP3THDPjb+0x3fc)[0xc4be6c]
> /usr/sbin/mysqld(_ZN12sp_lex_instr29validate_lex_and_execute_coreEP3THDPjb+0xbb)[0xc4c85b]
> /usr/sbin/mysqld(_ZN13sp_instr_stmt7executeEP3THDPj+0x128)[0xc4da08]
> /usr/sbin/mysqld(_ZN7sp_head7executeEP3THDb+0x4f4)[0xc45a04]
> /usr/sbin/mysqld(_ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE+0x777)[0xc49567]
> /usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x1c42)[0xcc4c42]
> /usr/sbin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3b5)[0xcc99a5]
> /usr/sbin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0xa8a)[0xcca4aa]
> /usr/sbin/mysqld(_Z10do_commandP3THD+0x19f)[0xccbeef]
> /usr/sbin/mysqld(handle_connection+0x288)[0xd8b668]
> /usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0x126f4a4]
> /lib64/libpthread.so.0(+0x7e25)[0x7f86fee63e25]
> /lib64/libc.so.6(clone+0x6d)[0x7f86fd92034d]
> 
> Trying to get some variables.
> Some pointers may be invalid and cause the dump to abort.
> Query (7f69a47be040): ALTER TABLE HN_QOS_DATA_0666 ADD PARTITION(  partition p737487 values less than ( '2019-03-04' ))
> Connection ID (thread ID): 51548
> Status: NOT_KILLED
> 
> The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html 
> contains information that should help you find out what is causing the
> crash.

and then the recovery starts

> 2019-02-02T01:08:26.053673Z 0 [Warning] Could not increase number of max_open_files to more than 10000 (request: 10161)
> 2019-02-02T01:08:26.054509Z 0 [Warning] Changed limits: table_open_cache: 4919 (requested 5000)
> 2019-02-02T01:08:26.252510Z 0 [Warning] The syntax '--log_warnings/-W' is deprecated and will be removed in a future release. Please use '--log_error_verbosity' instead.
> 2019-02-02T01:08:26.252626Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
> 2019-02-02T01:08:26.252704Z 0 [Warning] Insecure configuration for --secure-file-priv: Current value does not restrict location of generated files. Consider setting it to a valid, non-empty path.
> InnoDB: Progress in percent: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
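
The memory estimate printed in the crash report above follows a simple formula, which can be reproduced as a sanity check. This is a minimal sketch only: the function name is made up for illustration, and `sort_buffer_size` does not appear in the excerpt, so the value used below (262144, believed to be the MySQL 5.7 default) is an assumption.

```python
def crash_log_memory_estimate_kib(key_buffer_size, read_buffer_size,
                                  sort_buffer_size, max_threads):
    """key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads, in KiB."""
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    return total_bytes // 1024


# key_buffer_size, read_buffer_size and max_threads come from the crash report.
# ASSUMPTION: sort_buffer_size is not shown there; 262144 is illustrative only.
print(crash_log_memory_estimate_kib(8388608, 131072, 262144, 151))  # prints 66176
```

With the assumed value the formula gives 66176 K, well below the 338785 K in the report, so the server's real `sort_buffer_size` is presumably larger, or the server's own calculation includes terms not shown in the message. Either way, this figure covers only these few buffers and not the InnoDB buffer pool, so on its own it neither confirms nor rules out memory pressure.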

I have PMM monitoring for this instance, and I can certainly see a spike in swapping and I/O, and spikes in some other charts as well. But I am not able to come to a conclusion about what is causing the crash: whether it is a particular query, memory pressure, or some other reason.

Even during the memory leak issue MySQL never crashed on this server, but since it started using jemalloc, MySQL just crashes.

1) What should I look at to find the exact cause of the MySQL crash (in MySQL and in PMM)?
2) Does using the jemalloc library cause MySQL crashes?
3) How can I rule out memory pressure as a cause of the crash?
4) Is it better to use tcmalloc instead of jemalloc?
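
One small data point for question 3 is the signal number in the first line of the crash report. The sketch below decodes it with Python's standard `signal` module. The helper name is hypothetical, and the mapping printed is the one used on the platform where the script runs (Linux is assumed here, matching the `/lib64` paths in the backtrace).

```python
import re
import signal


def signal_name_from_log(line):
    """Return the symbolic name for 'mysqld got signal N', or None if absent."""
    match = re.search(r"got signal (\d+)", line)
    if match is None:
        return None
    return signal.Signals(int(match.group(1))).name


print(signal_name_from_log("01:08:08 UTC - mysqld got signal 11 ;"))
```

On Linux, signal 11 is SIGSEGV (an invalid memory access inside the process). The kernel's out-of-memory killer instead terminates a process with SIGKILL (9), which cannot be caught, so no backtrace would be written in that case. This does not prove that memory pressure played no role, but it suggests the crash was not a direct OOM kill.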

Thanks in advance.
                                

udhayan dharmalingam (383 rep)

Feb 3, 2019, 04:43 PM • Last activity: Sep 7, 2023, 08:04 PM

0 votes

1 answers

623 views

What happens when the SQL Server log drive is full?

sql-server transaction-log recovery crash

The log file drive of one of my production databases filled up, and I noticed the messages below in the SQL Server error log.

    2022-02-13 03:16:24.500	spid72	Error: 9002, Severity: 17, State: 4.
    2022-02-13 03:16:24.500	spid72	The transaction log for database 'huge_database' is full due to 'ACTIVE_TRANSACTION'.
    2022-02-13 03:16:26.060	spid72	Error: 9002, Severity: 17, State: 4.
    2022-02-13 03:16:26.060	spid72	The transaction log for database 'huge_database' is full due to 'ACTIVE_TRANSACTION'.
    2022-02-13 03:16:26.070	spid72	Error: 3314, Severity: 21, State: 3.
    2022-02-13 03:16:26.070	spid72	During undoing of a logged operation in database 'huge_database', an error occurred at log record ID (2766:550754:254). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
    2022-02-13 03:16:26.090	spid72	Database huge_database was shutdown due to error 3314 in routine 'XdesRMReadWrite::RollbackToLsn'. Restart for non-snapshot databases will be attempted after all connections to the database are aborted.
    2022-02-13 03:16:26.090	spid72	Error: 3314, Severity: 21, State: 5.
    2022-02-13 03:16:26.090	spid72	During undoing of a logged operation in database 'huge_database', an error occurred at log record ID (2678:51796:1). Typically, the specific failure is logged previously as an error in the Windows Event Log service. Restore the database or file from a backup, or repair the database.
    2022-02-13 03:16:32.900	spid48s	Starting up database 'huge_database'.
    2022-02-13 03:16:37.980	spid48s	Recovery of database 'huge_database' (20) is 0% complete (approximately 89573 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
    2022-02-13 03:16:57.980	spid48s	Recovery of database 'huge_database' (20) is 0% complete (approximately 60835 seconds remain). Phase 2 of 3. This is an informational message only. No user action is required.
    ...
    2022-02-13 04:26:15.160	spid36s	Recovery of database 'huge_database' (20) is 85% complete (approximately 721 seconds remain). Phase 3 of 3. This is an informational message only. No user action is required.
    ...
    2022-02-13 04:36:35.230	spid36s	Recovery of database 'huge_database' (20) is 99% complete (approximately 1 seconds remain). Phase 3 of 3. This is an informational message only. No user action is required.
    2022-02-13 04:36:35.790	spid36s	1 transactions rolled back in database 'huge_database' (20:0). This is an informational message only. No user action is required.
    2022-02-13 04:36:35.790	spid36s	Recovery is writing a checkpoint in database 'huge_database' (20). This is an informational message only. No user action is required.
    2022-02-13 04:36:36.850	spid36s	Recovery completed for database huge_database (database ID 20) in 4804 second(s) (analysis 4881 ms, redo 1196931 ms, undo 3600721 ms.) This is an informational message only. No user action is required.  
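
The last log line breaks the recovery time down by phase. A small sketch for pulling those numbers out and converting them to minutes follows. The helper name is illustrative, and the regular expression assumes the message format shown above.

```python
import re


def recovery_phases_seconds(message):
    """Extract analysis/redo/undo durations (seconds) from a 'Recovery completed' line."""
    phases = {}
    for name in ("analysis", "redo", "undo"):
        match = re.search(rf"{name} (\d+) ms", message)
        if match:
            phases[name] = int(match.group(1)) / 1000
    return phases


line = ("Recovery completed for database huge_database (database ID 20) in 4804 second(s) "
        "(analysis 4881 ms, redo 1196931 ms, undo 3600721 ms.)")
phases = recovery_phases_seconds(line)
for name, seconds in phases.items():
    print(f"{name}: {seconds / 60:.1f} min")
print(f"total: {sum(phases.values()) / 60:.1f} min")
```

For this line, undo accounts for about 60 of the roughly 80 minutes, which matches the single rolled-back transaction reported just before it.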

So it seems that the log drive filled up and the database started a recovery process, which took more than an hour. Why does it need to recover in this case? I tried to reproduce it but failed. Below is my code:

    CREATE DATABASE MyDatabase
    ON 
        (NAME = MyDatabase_Data,
        FILENAME = 'f:\mssql\data\MyDatabase_Data.mdf',
        SIZE = 10MB,
        MAXSIZE = 1000MB,
        FILEGROWTH = 5MB)
    LOG ON
        (NAME = MyDatabase_Log,
        FILENAME = 'U:\data\MyDatabase_Log.ldf',
        SIZE = 5MB,
        MAXSIZE = 500MB,
        FILEGROWTH = 1MB);
    GO
    
    USE MyDatabase;
    GO
    
    CREATE TABLE MyTable (
        ID INT IDENTITY(1,1) PRIMARY KEY,
        Data VARCHAR(MAX) NOT NULL
    );
    GO
    
    BEGIN TRANSACTION;
    
    DECLARE @i INT = 0;
    WHILE @i

> Msg 9002, Level 17, State 4, Line 30
> The transaction log for database 'MyDatabase' is full due to 'ACTIVE_TRANSACTION' and the holdup lsn is (39:24:1).

And this from the SQL Server error log:

    2023-08-18 02:59:54.200	spid84	Error: 17053, Severity: 16, State: 1.
    2023-08-18 02:59:54.200	spid84	U:\data\MyDatabase_Log.ldf: Operating system error 112(There is not enough space on the disk.) encountered.
    2023-08-18 02:59:55.200	spid84	Error: 9002, Severity: 17, State: 4.
    2023-08-18 02:59:55.200	spid84	The transaction log for database 'MyDatabase' is full due to 'ACTIVE_TRANSACTION' and the holdup lsn is (39:24:1).  

There is no recovery process. Why? How can I reproduce the issue from my production database?

---

## Update:

By the way, during the recovery of my production database, I could still access it. It still had the same status as the other databases in sys.databases. Is this expected? I thought databases in recovery were not accessible.


## Update2:
> During undoing of a logged operation in database 'huge_database', an error occurred at log record ID (2678:51796:1).

This error message sounds like SQL Server didn't reserve enough space to roll back. I remember Paul Randal saying that SQL Server always reserves some log space for rollback. If so, how could this happen?
                                

Fajela Tajkiya (1239 rep)

Aug 17, 2023, 07:13 PM • Last activity: Aug 17, 2023, 08:35 PM

0 votes

0 answers

193 views

MariaDB 11.0.2 crashes without an error message

mariadb crash

I am using MariaDB 11.0.2, upgraded from 10.5.x.
Some of the servers are fine and run fast.
One of them is not.

It started crashing after the upgrade, but without an error message, both when running as a service and when running directly in bash as root.

    $ mariadbd
    mariadbd: Please consult the Knowledge Base to find out how to run mysqld as root!
    2023-07-04  0:42:07 0 [ERROR] Aborting

What I tried:

 - reinstalling MariaDB
 - restoring the server from backup and restarting
 - restoring a copy from a working server via mariadb-backup
 - changing the data directory and starting the server (this worked once, but not the second time)

Where should I start?

Debian 12, kernel 6.1.0-9 

Thanks for all.

Moshe L (126 rep)

Jul 3, 2023, 09:49 PM

-1 votes

1 answers

138 views

AlwaysOn process doesn't stop, but it can't be reached by telnet. SqlDumpExceptionHandler: Process {0} generated fatal exception c0000094 EXCEPTION_INT_DIVIDE_BY_ZERO

sql-server availability-groups crash

I'm a DBA, I'm still learning, and I've recently encountered an AlwaysOn cluster that can't be used.

Fault bucket 1796932693059087230, type 5

    Event name: SQLException64
    Response: Not applicable
    Cab ID: 1527799436058424754
    
    Dump:
    2023-05-25 23:33:51.87 spid167s    * BEGIN STACK DUMP:
    2023-05-25 23:33:51.87 spid167s    *   05/25/23 23:33:51 spid 167
    2023-05-25 23:33:51.87 spid167s    *
    2023-05-25 23:33:51.87 spid167s    *
    2023-05-25 23:33:51.87 spid167s    *   Exception Address = 00007FFCCAFC4AFB Module(sqldk+0000000000004AFB)
    2023-05-25 23:33:51.87 spid167s    *   Exception Code    = c0000094 EXCEPTION_INT_DIVIDE_BY_ZERO
    2023-05-25 23:33:51.87 spid167s    *  
    2023-05-25 23:33:51.87 spid167s    *
    2023-05-25 23:33:51.87 spid167s    *  MODULE                          BASE      END       SIZE
    2023-05-25 23:33:51.87 spid167s    * sqlservr                       00007FF7353E0000  00007FF73547CFFF  0009d000
    2023-05-25 23:33:51.87 spid167s    * ntdll                          00007FFCE2EB0000  00007FFCE30AFFFF  00200000
    2023-05-25 23:33:51.87 spid167s    * KERNEL32                       00007FFCE19F0000  00007FFCE1AABFFF  000bc000
    2023-05-25 23:33:51.87 spid167s    * KERNELBASE                     00007FFCE09F0000  00007FFCE0D52FFF  00363000
    2023-05-25 23:33:51.87 spid167s    * ADVAPI32                       00007FFCE1BD0000  00007FFCE1C7DFFF  000ae000
    2023-05-25 23:33:51.87 spid167s    * msvcrt                         00007FFCE14A0000  00007FFCE1542FFF  000a3000
    2023-05-25 23:33:51.87 spid167s    * sechost                        00007FFCE0E00000  00007FFCE0E9EFFF  0009f000
    2023-05-25 23:33:51.87 spid167s    * RPCRT4                         00007FFCE1C80000  00007FFCE1D9DFFF  0011e000
    2023-05-25 23:33:51.87 spid167s    * ole32                          00007FFCE0F40000  00007FFCE1074FFF  00135000
    2023-05-25 23:33:51.87 spid167s    * msvcp_win                      00007FFCE0D60000  00007FFCE0DFFFFF  000a0000
    2023-05-25 23:33:51.87 spid167s    * ucrtbase                       00007FFCE0480000  00007FFCE058FFFF  00110000
    2023-05-25 23:33:51.87 spid167s    * NETAPI32                       00007FFCD1B80000  00007FFCD1B99FFF  0001a000
    2023-05-25 23:33:51.87 spid167s    * GDI32                          00007FFCE1250000  00007FFCE127AFFF  0002b000
    2023-05-25 23:33:51.87 spid167s    * pdh                            00007FFCD2030000  00007FFCD207EFFF  0004f000
    2023-05-25 23:33:51.87 spid167s    * win32u                         00007FFCE08A0000  00007FFCE08C5FFF  00026000
    2023-05-25 23:33:51.87 spid167s    * gdi32full                      00007FFCE08D0000  00007FFCE09E0FFF  00111000
    2023-05-25 23:33:51.87 spid167s    * USER32                         00007FFCE1080000  00007FFCE1224FFF  001a5000
    2023-05-25 23:33:51.87 spid167s    * combase                        00007FFCE1680000  00007FFCE19EFFFF  00370000
    2023-05-25 23:33:51.87 spid167s    * SQLOS                          00007FFCD4E50000  00007FFCD4E58FFF  00009000
    2023-05-25 23:33:51.87 spid167s    * sqldk                          00007FFCCAFC0000  00007FFCCB4E5FFF  00526000
    2023-05-25 23:33:51.87 spid167s    * sqlTsEs                        00007FFCCB5A0000  00007FFCCBE68FFF  008c9000
    2023-05-25 23:33:51.87 spid167s    * OLEAUT32                       00007FFCE1AB0000  00007FFCE1B86FFF  000d7000
    2023-05-25 23:33:51.87 spid167s    * CRYPT32                        00007FFCE0650000  00007FFCE07AEFFF  0015f000
    2023-05-25 23:33:51.87 spid167s    * opends60                       00007FFCD4E40000  00007FFCD4E48FFF  00009000
    2023-05-25 23:33:51.87 spid167s    * qds                            00007FFCCAE90000  00007FFCCAFB8FFF  00129000
    2023-05-25 23:33:51.87 spid167s    * svl                            00007FFCCAE60000  00007FFCCAE8CFFF  0002d000
    2023-05-25 23:33:51.87 spid167s    * MSVCP140                       00007FFCD3C50000  00007FFCD3CEAFFF  0009b000
    2023-05-25 23:33:51.87 spid167s    * MPR                            00007FFCD6010000  00007FFCD602EFFF  0001f000
    2023-05-25 23:33:51.87 spid167s    * VCRUNTIME140                   00007FFCD4750000  00007FFCD4765FFF  00016000
    2023-05-25 23:33:51.87 spid167s    * WINMM                          00007FFCD35B0000  00007FFCD35D6FFF  00027000
    2023-05-25 23:33:51.87 spid167s    * WININET                        00007FFCCA960000  00007FFCCAE5CFFF  004fd000
    2023-05-25 23:33:51.87 spid167s    * sqlmin                         00007FFCCE440000  00007FFCD13F7FFF  02fb8000
    2023-05-25 23:33:51.87 spid167s    * WS2_32                         00007FFCE1420000  00007FFCE1490FFF  00071000
    2023-05-25 23:33:51.87 spid167s    * Secur32                        00007FFCD2020000  00007FFCD202BFFF  0000c000
    2023-05-25 23:33:51.87 spid167s    * WINHTTP                        00007FFCD76E0000  00007FFCD77E8FFF  00109000
    2023-05-25 23:33:51.87 spid167s    * sqllang                        00007FFCCBE70000  00007FFCCE437FFF  025c8000
    2023-05-25 23:33:51.87 spid167s    * ODBC32                         00007FFCCA880000  00007FFCCA939FFF  000ba000
    2023-05-25 23:33:51.87 spid167s    * secforwarder                   00007FFCD2120000  00007FFCD2130FFF  00011000
    2023-05-25 23:33:51.87 spid167s    * NETUTILS                       00007FFCDF7D0000  00007FFCDF7DBFFF  0000c000
    2023-05-25 23:33:51.87 spid167s    * WINTRUST                       00007FFCE07B0000  00007FFCE0819FFF  0006a000
    2023-05-25 23:33:51.87 spid167s    * SSPICLI                        00007FFCDFE40000  00007FFCDFE81FFF  00042000
    2023-05-25 23:33:51.87 spid167s    * DPAPI                          00007FFCE0100000  00007FFCE0109FFF  0000a000
    2023-05-25 23:33:51.87 spid167s    * bcrypt                         00007FFCDFD00000  00007FFCDFD26FFF  00027000
    2023-05-25 23:33:51.87 spid167s    * ncrypt                         00007FFCDFCD0000  00007FFCDFCF7FFF  00028000
    2023-05-25 23:33:51.87 spid167s    * USERENV                        00007FFCDFA60000  00007FFCDFA8DFFF  0002e000
    2023-05-25 23:33:51.87 spid167s    * AUTHZ                          00007FFCDF2A0000  00007FFCDF2ECFFF  0004d000
    2023-05-25 23:33:51.87 spid167s    * XmlLite                        00007FFCDCCC0000  00007FFCDCCF6FFF  00037000
    2023-05-25 23:33:51.87 spid167s    * dhcpcsvc                       00007FFCD83E0000  00007FFCD83FCFFF  0001d000
    2023-05-25 23:33:51.87 spid167s    * SAMCLI                         00007FFCD9640000  00007FFCD9658FFF  00019000
    2023-05-25 23:33:51.87 spid167s    * LOGONCLI                       00007FFCD6E20000  00007FFCD6E61FFF  00042000
    2023-05-25 23:33:51.87 spid167s    * NTASN1                         00007FFCDFC90000  00007FFCDFCC6FFF  00037000
    2023-05-25 23:33:51.87 spid167s    * psapi                          00007FFCE12F0000  00007FFCE12F7FFF  00008000
    2023-05-25 23:33:51.87 spid167s    * MSASN1                         00007FFCDFE20000  00007FFCDFE31FFF  00012000
    2023-05-25 23:33:51.87 spid167s    * kernel.appcore                 00007FFCDEDE0000  00007FFCDEDF6FFF  00017000
    2023-05-25 23:33:51.87 spid167s    * bcryptPrimitives               00007FFCE0820000  00007FFCE089CFFF  0007d000
    2023-05-25 23:33:51.87 spid167s    * instapi150                     00007FFCC7390000  00007FFCC73A3FFF  00014000
    2023-05-25 23:33:51.87 spid167s    * CRYPTSP                        00007FFCDFBC0000  00007FFCDFBD7FFF  00018000
    2023-05-25 23:33:51.87 spid167s    * rsaenh                         00007FFCDF4F0000  00007FFCDF524FFF  00035000
    2023-05-25 23:33:51.87 spid167s    * CRYPTBASE                      00007FFCDFBB0000  00007FFCDFBBBFFF  0000c000
    2023-05-25 23:33:51.87 spid167s    * imagehlp                       00007FFCE1550000  00007FFCE156EFFF  0001f000
    2023-05-25 23:33:51.87 spid167s    * gpapi                          00007FFCDFA00000  00007FFCDFA24FFF  00025000
    2023-05-25 23:33:51.87 spid167s    * wkscli                         00007FFCD7500000  00007FFCD7519FFF  0001a000
    2023-05-25 23:33:51.87 spid167s    * cscapi                         00007FFCC6F50000  00007FFCC6F61FFF  00012000
    2023-05-25 23:33:51.87 spid167s    * sqlevn70                       00000235191E0000  0000023519352FFF  00173000
    2023-05-25 23:33:51.87 spid167s    * sqlevn70                       0000023519370000  00000235196A8FFF  00339000
    2023-05-25 23:33:51.87 spid167s    * CLUSAPI                        00007FFCD2A30000  00007FFCD2B3CFFF  0010d000
    2023-05-25 23:33:51.87 spid167s    * RESUTILS                       00007FFCD2B40000  00007FFCD2BCFFFF  00090000
    2023-05-25 23:33:51.87 spid167s    * VERSION                        00007FFCD3EE0000  00007FFCD3EE9FFF  0000a000
    2023-05-25 23:33:51.87 spid167s    * hkruntime                      00007FFCC4EA0000  00007FFCC5172FFF  002d3000
    2023-05-25 23:33:51.87 spid167s    * hkcompile                      00007FFCC4D60000  00007FFCC4E9FFFF  00140000
    2023-05-25 23:33:51.87 spid167s    * hkengine                       00007FFCC4660000  00007FFCC4D55FFF  006f6000
    2023-05-25 23:33:51.87 spid167s    * dbghelp                        00007FFCD1BA0000  00007FFCD1DB1FFF  00212000
    2023-05-25 23:33:51.87 spid167s    * SHLWAPI                        00007FFCE2E10000  00007FFCE2E6EFFF  0005f000
    2023-05-25 23:33:51.87 spid167s    * ncryptprov                     00007FFCD48C0000  00007FFCD491DFFF  0005e000
    2023-05-25 23:33:51.87 spid167s    * msv1_0                         00007FFCDF8E0000  00007FFCDF96EFFF  0008f000
    2023-05-25 23:33:51.87 spid167s    * NtlmShared                     00007FFCDF8D0000  00007FFCDF8DDFFF  0000e000
    2023-05-25 23:33:51.87 spid167s    * cryptdll                       00007FFCDF9E0000  00007FFCDF9F4FFF  00015000
    2023-05-25 23:33:51.87 spid167s    * kerberos                       00007FFCDFA90000  00007FFCDFBACFFF  0011d000
    2023-05-25 23:33:51.87 spid167s    * schannel                       00007FFCDF400000  00007FFCDF49BFFF  0009c000
    2023-05-25 23:33:51.87 spid167s    * SECURITY                       000002620D1B0000  000002620D1B2FFF  00003000
    2023-05-25 23:33:51.87 spid167s    * MSCOREE                        00007FFCD1400000  00007FFCD1466FFF  00067000
    2023-05-25 23:33:51.87 spid167s    * mscoreei                       00007FFCCB4F0000  00007FFCCB599FFF  000aa000
    2023-05-25 23:33:51.87 spid167s    * SqlServerSpatial150            00007FFCC1170000  00007FFCC1211FFF  000a2000
    2023-05-25 23:33:51.87 spid167s    * clbcatq                        00007FFCE1570000  00007FFCE161EFFF  000af000
    2023-05-25 23:33:51.87 spid167s    * msxml3                         00007FFCC0210000  00007FFCC047BFFF  0026c000
    2023-05-25 23:33:51.87 spid167s    * ualapi                         00007FFCD6050000  00007FFCD6069FFF  0001a000
    2023-05-25 23:33:51.87 spid167s    * SHELL32                        00007FFCE1E90000  00007FFCE25EAFFF  0075b000
    2023-05-25 23:33:51.87 spid167s    * ntmarta                        00007FFCDE960000  00007FFCDE993FFF  00034000
    2023-05-25 23:33:51.87 spid167s    * ESENT                          00007FFCD5A40000  00007FFCD5DD9FFF  0039a000
    2023-05-25 23:33:51.87 spid167s    * msoledbsql                     00007FFCBFF60000  00007FFCC0202FFF  002a3000
    2023-05-25 23:33:51.87 spid167s    * COMDLG32                       00007FFCE28A0000  00007FFCE2982FFF  000e3000
    2023-05-25 23:33:51.87 spid167s    * shcore                         00007FFCE25F0000  00007FFCE26D9FFF  000ea000
    2023-05-25 23:33:51.87 spid167s    * COMCTL32                       00007FFCBF5F0000  00007FFCBF6A1FFF  000b2000
    2023-05-25 23:33:51.87 spid167s    * IPHLPAPI                       00007FFCDF6C0000  00007FFCDF6ECFFF  0002d000
    2023-05-25 23:33:51.87 spid167s    * imm32                          00007FFCE1B90000  00007FFCE1BC0FFF  00031000
    2023-05-25 23:33:51.87 spid167s    * MSOLEDBSQLR                    000002621D340000  000002621D350FFF  00011000
    2023-05-25 23:33:51.87 spid167s    * sqlncli11                      00007FFCBF290000  00007FFCBF5E3FFF  00354000
    2023-05-25 23:33:51.87 spid167s    * MSVCR100                       000000005CB60000  000000005CC31FFF  000d2000
    2023-05-25 23:33:51.87 spid167s    * SQLNCLIR11                     000002621D410000  000002621D437FFF  00028000
    2023-05-25 23:33:51.87 spid167s    * sqlnclirda11                   000000005C800000  000000005CB58FFF  00359000
    2023-05-25 23:33:51.87 spid167s    * SQLNCLIRDAR11                  000002621D530000  000002621D567FFF  00038000
    2023-05-25 23:33:51.87 spid167s    * windows.storage                00007FFCD21E0000  00007FFCD2A27FFF  00848000
    2023-05-25 23:33:51.87 spid167s    * clr                            00007FFCC9AC0000  00007FFCCA5F4FFF  00b35000
    2023-05-25 23:33:51.87 spid167s    * ucrtbase_clr0400               00007FFCC9920000  00007FFCC99DCFFF  000bd000
    2023-05-25 23:33:51.87 spid167s    * VCRUNTIME140_CLR0400           00007FFCC99E0000  00007FFCC99F5FFF  00016000
    2023-05-25 23:33:51.87 spid167s    * mscorlib.ni                    00007FFCC8170000  00007FFCC976FFFF  01600000
    2023-05-25 23:33:51.87 spid167s    * SqlAccess                      00007FFCBF210000  00007FFCBF285FFF  00076000
    2023-05-25 23:33:51.87 spid167s    * BatchParser                    00007FFCBEC70000  00007FFCBEC99FFF  0002a000
    2023-05-25 23:33:51.87 spid167s    * clrjit                         00007FFCC7240000  00007FFCC738EFFF  0014f000
    2023-05-25 23:33:51.87 spid167s    * profapi                        00007FFCE03B0000  00007FFCE03D0FFF  00021000
    2023-05-25 23:33:51.87 spid167s    * SRVCLI                         00007FFCDAFD0000  00007FFCDAFF7FFF  00028000
    2023-05-25 23:33:51.87 spid167s    * mskeyprotect                   00007FFCC2CB0000  00007FFCC2CC4FFF  00015000
    2023-05-25 23:33:51.87 spid167s    * mswsock                        00007FFCDF970000  00007FFCDF9D7FFF  00068000
    2023-05-25 23:33:51.87 spid167s    * ntdsapi                        00007FFCD4F40000  00007FFCD4F69FFF  0002a000
    2023-05-25 23:33:51.87 spid167s    * DSPARSE                        00007FFCD3ED0000  00007FFCD3EDBFFF  0000c000
    2023-05-25 23:33:51.87 spid167s    * ncryptsslp                     00007FFCC4630000  00007FFCC4655FFF  00026000
    2023-05-25 23:33:51.87 spid167s    * xpsqlbot                       00007FFCCA940000  00007FFCCA949FFF  0000a000
    2023-05-25 23:33:51.87 spid167s    * xpstar                         00007FFCBEB00000  00007FFCBEB71FFF  00072000
    2023-05-25 23:33:51.87 spid167s    * SQLSCM                         00007FFCBEC20000  00007FFCBEC33FFF  00014000
    2023-05-25 23:33:51.87 spid167s    * xpstar                         0000026221050000  0000026221058FFF  00009000
    2023-05-25 23:33:51.87 spid167s    * xplog70                        00007FFCBCE90000  00007FFCBCEA5FFF  00016000
    2023-05-25 23:33:51.87 spid167s    * xplog70                        00000262210A0000  00000262210A2FFF  00003000
    2023-05-25 23:33:51.87 spid167s    * sqlvdi                         00007FFCACA80000  00007FFCACABAFFF  0003b000
    2023-05-25 23:33:51.87 spid167s    * SAMLIB                         00007FFCD2C90000  00007FFCD2CB6FFF  00027000
    2023-05-25 23:33:51.87 spid167s    * xprepl                         00007FFCB3EB0000  00007FFCB3ECBFFF  0001c000
    2023-05-25 23:33:51.87 spid167s    *
    2023-05-25 23:33:51.87 spid167s    *     P1Home: 0000000000000040:  
    2023-05-25 23:33:51.87 spid167s    *     P2Home: 0000000001000000:  
    2023-05-25 23:33:51.87 spid167s    *     P3Home: 0000000000013855:  
    2023-05-25 23:33:51.87 spid167s    *     P4Home: 0000000000000007:  
    2023-05-25 23:33:51.87 spid167s    *     P5Home: 00000262275E4CF0:  0000000000000000  0000000000000000  000002359997EF98  000002359997EF98  000002445B380000  0000000000000000  
    2023-05-25 23:33:51.87 spid167s    *     P6Home: 000002445B3EA000:  0100A00000000101  008200010043E327  002200010043E331  1F37008500000491  000000010043E330  0001F99700004910  
    2023-05-25 23:33:51.87 spid167s    * ContextFlags: 000000000010005F:  
    2023-05-25 23:33:51.87 spid167s    *      MxCsr: 0000000000001FA8:  
    2023-05-25 23:33:51.87 spid167s    *      SegCs: 0000000000000033:  
    2023-05-25 23:33:51.87 spid167s    *      SegDs: 000000000000002B:  
    2023-05-25 23:33:51.87 spid167s    *      SegEs: 000000000000002B:  
    2023-05-25 23:33:51.87 spid167s    *      SegFs: 0000000000000053:  
    2023-05-25 23:33:51.87 spid167s    *      SegGs: 000000000000002B:  
    2023-05-25 23:33:51.87 spid167s    *      SegSs: 000000000000002B:  
    2023-05-25 23:33:51.87 spid167s    *     EFlags: 0000000000010206:  
    2023-05-25 23:33:51.87 spid167s    *        Rax: 00000000000270AA:  
    2023-05-25 23:33:51.87 spid167s    *        Rcx: 0000020528052E9A:  
    2023-05-25 23:33:51.87 spid167s    *        Rdx: 0000000000000000:  
    2023-05-25 23:33:51.87 spid167s    *        Rbx: 0000000000013855:  
    2023-05-25 23:33:51.87 spid167s    *        Rsp: 0000003C073FDC10:  FFFFFFFFFFFFFFFF  00007FFCCAFC2CC2  0000003C073FDF70  0000003C073FDF40  0000000000000000  0000000000000000  
    2023-05-25 23:33:51.87 spid167s    *        Rbp: 0000003C073FDD10:  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  0000000000000000  
    2023-05-25 23:33:51.87 spid167s    *        Rsi: 0000F8C38978D4B3:  
    2023-05-25 23:33:51.87 spid167s    *        Rdi: 0000003C073FDDB0:  FFFFFFFFFFFFFFFF  FFFFFFFFFFFFFFFF  FFFFFFFFFFFFFFFF  FFFFFFFFFFFFFFFF  FFFFFFFFFFFFFFFF  FFFFFFFFFFFFFFFF  
    2023-05-25 23:33:51.87 spid167s    *         R8: 00000000000000AF:  
    2023-05-25 23:33:51.87 spid167s    *         R9: 0000000000000002:  
    2023-05-25 23:33:51.87 spid167s    *        R10: 0000024999580040:  00007FFCCB188ED8  0000024999595FE0  0000024999360050  00000249995A0050  000000000000000F  0000024999034398  
    2023-05-25 23:33:51.87 spid167s    *        R11: 0000000000000000:  
    2023-05-25 23:33:51.87 spid167s    *        R12: 0000003C073FDFD0:  0000F8C38A1094DF  0000000000000000  0000000000000000  0000000000000000  0000000000000028  0000F8C38A1094DF  
    2023-05-25 23:33:51.87 spid167s    *        R13: 0000000000000001:  
    2023-05-25 23:33:51.87 spid167s    *        R14: 00007FFCCB218C20:  00007FFCCB219998  00007FFCCB2180A0  0000000000000001  00007FFCCB219998  0000000000000003  0000024999031E50  
    2023-05-25 23:33:51.87 spid167s    *        R15: 00007FFCCB2173C0:  0000000000000000  0000000000000000  00007FFCCB2199B8  00007FFCCB2166F0  0000000000000006  00007FFCCB2199B8  
    2023-05-25 23:33:51.87 spid167s    *        Rip: 00007FFCCAFC4AFB:  3024448948F3F749  000013E882B70F41  2A058D48800C8D48  10C854B70F0025BA  828B4908C87C8B48  48C72348000013F0  
    2023-05-25 23:33:51.87 spid167s    * *******************************************************************************
    2023-05-25 23:33:51.87 spid167s    * -------------------------------------------------------------------------------
    2023-05-25 23:33:51.87 spid167s    * Short Stack Dump
    2023-05-25 23:33:51.88 spid167s    00007FFCCAFC4AFB Module(sqldk+0000000000004AFB)
    2023-05-25 23:33:51.88 spid167s    00007FFCCAFC46A5 Module(sqldk+00000000000046A5)
    2023-05-25 23:33:51.88 spid167s    00007FFCCAFC2E49 Module(sqldk+0000000000002E49)
    2023-05-25 23:33:51.88 spid167s    00007FFCCAFC1E37 Module(sqldk+0000000000001E37)
    2023-05-25 23:33:51.88 spid167s    00007FFCCAFC221A Module(sqldk+000000000000221A)
    2023-05-25 23:33:51.89 spid167s    00007FFCCAFC3C44 Module(sqldk+0000000000003C44)
    2023-05-25 23:33:51.89 spid167s    00007FFCCF8AF987 Module(sqlmin+000000000146F987)
    2023-05-25 23:33:51.89 spid167s    00007FFCCF8B1BA4 Module(sqlmin+0000000001471BA4)
    2023-05-25 23:33:51.89 spid167s    00007FFCCAFC6523 Module(sqldk+0000000000006523)
    2023-05-25 23:33:51.89 spid167s    00007FFCCAFC6E6D Module(sqldk+0000000000006E6D)
    2023-05-25 23:33:51.89 spid167s    00007FFCCAFC6C75 Module(sqldk+0000000000006C75)
    2023-05-25 23:33:51.89 spid167s    00007FFCCAFEB160 Module(sqldk+000000000002B160)
    2023-05-25 23:33:51.89 spid167s    00007FFCCAFEAA5B Module(sqldk+000000000002AA5B)
    2023-05-25 23:33:51.90 spid167s    00007FFCCAFEAFA4 Module(sqldk+000000000002AFA4)
    2023-05-25 23:33:51.90 spid167s    00007FFCE1A04ED0 Module(KERNEL32+0000000000014ED0)
    2023-05-25 23:33:51.90 spid167s    00007FFCE2F2E39B Module(ntdll+000000000007E39B)
    2023-05-25 23:33:51.96 spid167s    Stack Signature for the dump is 0x000000008E9BFFF3
    2023-05-25 23:33:56.41 spid167s    External dump process return code 0x20000001.
    External dump process returned no errors.
    
    2023-05-25 23:36:43.41 Server      CImageHelper::Init () Version-specific dbghelp.dll is not used
    2023-05-25 23:36:43.44 Server      **Dump thread - spid = 0, EC = 0x0000000000000000
    2023-05-25 23:36:43.44 Server      ***Stack Dump being sent to C:\Program Files\Microsoft SQL Server\MSSQL15.BETASQLSERVER\MSSQL\LOG\SQLDump0002.txt
    2023-05-25 23:36:43.44 Server      * *******************************************************************************
    2023-05-25 23:36:43.44 Server      *
    2023-05-25 23:36:43.44 Server      * BEGIN STACK DUMP:
    2023-05-25 23:36:43.44 Server      *   05/25/23 23:36:43 spid 3592
    2023-05-25 23:36:43.44 Server      *
    2023-05-25 23:36:43.44 Server      * Stalled IOCP Listener
    2023-05-25 23:36:43.44 Server      *
    2023-05-25 23:36:43.44 Server      * *******************************************************************************
    2023-05-25 23:36:43.44 Server      * -------------------------------------------------------------------------------
    2023-05-25 23:36:43.44 Server      * Short Stack Dump
    2023-05-25 23:36:43.44 Server      Stack Signature for the dump is 0x00000000000002E0
                                

Wayne Chen (3 rep)

May 26, 2023, 05:19 AM • Last activity: May 26, 2023, 08:08 AM

1 votes

2 answers

301 views

After server is restarted, the database and the backups are destroyed by the system crash. What ACID properties are broken in this situation?

transaction crash acid

Let's assume that two transactions execute on a database, both reading and writing. At some point, the system crashes. After the server is restarted, the database and the backups have been destroyed by the crash and the database no longer exists. Which ACID properties would this break?

My first guess is that it breaks only Durability: we can assume the transactions did nothing to the database, which preserves atomicity, and the database is definitely consistent. I am not sure about isolation, but I don't see how this would break it.

ujoeja (11 rep)

Nov 24, 2021, 07:55 PM • Last activity: May 9, 2023, 12:33 PM

1 votes

0 answers

394 views

Trying to use the "Object Explorer" in pgAdmin crashes server when pg-strom is installed

postgresql crash postgresql-extensions

I'm trying to get to the bottom of some unusual behavior with pgAdmin4 when the GPU acceleration extension [pg-strom](https://heterodb.github.io/pg-strom/) is installed in PostgreSQL 14 and 15. Simple queries seem to work fine, but when trying to expand the "Schemas" section in pgAdmin4's "Object Explorer" window, it ***consistently*** crashes the whole database (trying to expand "Catalogues" or "Casts" has a similar effect).

I've tested it with a manual install on PostgreSQL 14 in a Debian 11 LXC container on Proxmox with an nVidia P1000, and on a native Debian 12 install using the unofficial [pg-strom/pg15 Docker image](https://github.com/murphye/pg-strom/tree/docker/docker) and an RTX 3060 Ti. Both had the same issue: my application's normal queries work fine, but trying to expand the "Schemas" section in pgAdmin4 crashes the whole database...



The other sections seem to open just fine so I'm not sure what's different with the "Schemas" section. Dropping the pg-strom extension ***and*** removing it from the "shared preload libraries" fixes the issue but disables GPU acceleration so it's not an ideal solution.

I've pulled the section from the logs where pgAdmin4 tries to execute whatever query it wants to when you try and expand the "Schemas" section:

    2023-04-22 05:42:14.696 UTC  LOG:  server process (PID 3831) was terminated by signal 6: Aborted
    2023-04-22 05:42:14.696 UTC  DETAIL:  Failed process was running: SELECT
                nsp.oid,
                nsp.nspname as name,
                pg_catalog.has_schema_privilege(nsp.oid, 'CREATE') as can_create,
                pg_catalog.has_schema_privilege(nsp.oid, 'USAGE') as has_usage
            FROM
                pg_catalog.pg_namespace nsp
            WHERE
                         nspname NOT LIKE E'pg\\_%' AND
                        NOT (
            (nsp.nspname = 'pg_catalog' AND EXISTS
                    (SELECT 1 FROM pg_catalog.pg_class WHERE relname = 'pg_class' AND
                        relnamespace = nsp.oid LIMIT 1)) OR
                (nsp.nspname = 'pgagent' AND EXISTS
                    (SELECT 1 FROM pg_catalog.pg_class WHERE relname = 'pga_job' AND
                        relnamespace = nsp.oid LIMIT 1)) OR
                (nsp.nspname = 'information_schema' AND EXISTS
                    (SELECT 1 FROM pg_catalog.pg_class WHERE relname = 'tables' AND
                        relnamespace = nsp.oid LIMIT 1))
                )
    
                
            ORDER BY nspname;
    2023-04-22 05:42:14.696 UTC  LOG:  terminating any other active server processes
    2023-04-22 05:42:14.698 UTC  FATAL:  the database system is in recovery mode
    2023-04-22 05:42:14.717 UTC  LOG:  all server processes terminated; reinitializing
    2023-04-22 05:42:15.215 UTC  LOG:  database system was interrupted; last known up at 2023-04-22 05:35:54 UTC
    2023-04-22 05:42:15.216 UTC  FATAL:  the database system is in recovery mode
    2023-04-22 05:42:15.233 UTC  LOG:  database system was not properly shut down; automatic recovery in progress
    2023-04-22 05:42:15.236 UTC  LOG:  redo starts at 2/54CCC818
    2023-04-22 05:42:15.236 UTC  LOG:  invalid record length at 2/54CCC850: wanted 24, got 0
    2023-04-22 05:42:15.236 UTC  LOG:  redo done at 2/54CCC818 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
    2023-04-22 05:42:15.239 UTC  LOG:  checkpoint starting: end-of-recovery immediate wait
    2023-04-22 05:42:15.264 UTC  LOG:  checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.014 s, sync=0.004 s, total=0.028 s; sync files=2, longest=0.003 s, average=0.002 s; distance=0 kB, estimate=0 kB
    2023-04-22 05:42:15.268 UTC  LOG:  CUDA Program Builder-1 with NVRTC version 12.0
    2023-04-22 05:42:15.268 UTC  LOG:  CUDA Program Builder-0 with NVRTC version 12.0
    2023-04-22 05:42:15.268 UTC  LOG:  database system is ready to accept connections

Trying to run "EXPLAIN" on the above query ***also*** causes the database to crash, which seems a bit odd to me (something with the query planner, maybe?).
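
When the same crash has to be spotted across many log files, the postmaster line from the excerpt above can be picked out mechanically. The following is a minimal Python sketch, assuming the message keeps the exact wording shown in that excerpt; the names `CRASH_RE`, `BackendCrash`, and `find_backend_crash` are hypothetical helpers, not part of PostgreSQL or pgAdmin4.

```python
import re
from typing import NamedTuple, Optional

# Matches the "server process ... was terminated by signal" message as it
# appears in the log excerpt above (wording assumed to be stable).
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal "
    r"(?P<signo>\d+): (?P<name>.+)$"
)


class BackendCrash(NamedTuple):
    pid: int     # PID of the backend that died
    signo: int   # signal number reported by the postmaster
    name: str    # textual description that follows the signal number


def find_backend_crash(line: str) -> Optional[BackendCrash]:
    """Return crash details if the line reports a terminated backend, else None."""
    m = CRASH_RE.search(line.rstrip())
    if m is None:
        return None
    return BackendCrash(int(m["pid"]), int(m["signo"]), m["name"])


# The line below is copied from the log excerpt in the question.
line = ("2023-04-22 05:42:14.696 UTC  LOG:  server process (PID 3831) "
        "was terminated by signal 6: Aborted")
crash = find_backend_crash(line)
print(crash)
```

Signal 6 is SIGABRT on Linux, which typically means the process called `abort()` itself (for example after a failed assertion) rather than being killed from outside.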

Also notable: DBeaver has a similar(ish) object explorer to pgAdmin4, and ***theirs*** works fine. However, I really like the pgAdmin4 UI and the graphical query plan viewer, so it'd be a shame to have to give them up.

## Does anyone know what would cause pgAdmin4 to crash the database with the above query only when pg-strom is installed? ##
                                

Sam (111 rep)

Apr 22, 2023, 06:34 AM

1 votes

1 answers

852 views

Is a Postgresql UNLOGGED table completely lost on process crash?

postgresql crash unlogged-tables

I use UNLOGGED tables for a few very large tables in a data warehouse style application.

Until recently, I understood UNLOGGED to mean “won’t write to the WAL”, which in turn means that *recent* changes may be lost on a process crash / unclean termination, and that there will be no replication.

Maybe I’m misunderstanding the language in the documentation (or maybe I’m not), but when I read the documentation recently, I understood it to mean the *entire* table will be truncated on an unclean exit. Is that right?

The question is, **_on an unclean exit, will the entire table be TRUNCATEd (per the meaning of TRUNCATE in PostgreSQL), or does it mean truncated in the everyday sense: the end is abruptly cut off and only the most recent rows are lost, due to there being no WAL?_**

Surely it must be the latter? Being completely TRUNCATEd makes no sense to me; if I have an UNLOGGED table with a year’s worth of data, and today the postgres process goes down, why would it destroy the entire contents of the table?

Or am I misunderstanding and it’s only recent changes (that would normally be restorable via WAL or similarly, by a spare) that would be lost?

The blurb from the official PostgreSQL documentation is as follows; the bit about truncation is what I’m referring to:

> If specified, the table is created as an unlogged table. Data written to unlogged tables is not written to the write-ahead log (see Chapter 30), which makes them considerably faster than ordinary tables. However, they are not crash-safe: an unlogged table is automatically truncated after a crash or unclean shutdown.

I’m not sure how I never noticed this language, but it has me somewhat alarmed. Losing the most recent few days’ worth of data isn’t a problem for my use, but having to restore the entire table (~60GB worth of data per table) from the backups I keep (non-WAL backups, obviously) due to a simple process crash is alarming.

mzpq (165 rep)

Apr 5, 2023, 11:52 AM • Last activity: Apr 6, 2023, 07:01 AM

4 votes

1 answers

5699 views

MongoDB crashes with out-of-memory or is being killed by oom-killer

mongodb sharding crash mongodb-3.6 google-cloud-platform

A two-shard MongoDB database regularly crashes with an out-of-memory error or is killed by the oom-killer. The system runs on **GCE Debian 9.4** with **MongoDB v3.6.5**, the **WiredTiger** storage engine, and no swap (as is the practice on GCE). The servers are n1-highmem-4 (**4 vCPUs**, **26 GB memory**). Only mongod runs on these servers; there are no other services. The **mongos** instances are on different servers.

The process usually exits or crashes about once a day. When the mongod process is killed by the **oom-killer**, this can be seen in the logs:

    Jun 15 14:45:17 server4 kernel: [1731430.432189] Out of memory: Kill process 13130 (mongod) score 980 or sacrifice child
    Jun 15 14:45:17 server4 kernel: [1731430.441717] Killed process 13130 (mongod) total-vm:28280536kB, anon-rss:26174876kB, file-rss:0kB, shmem-rss:0kB
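
To put the kernel's numbers next to the 26 GB of server memory, the kB figures in the "Killed process" line can be converted to GiB. This is a minimal Python sketch, assuming the kernel message format shown above; `KILLED_RE` and `kb_to_gib` are hypothetical helper names.

```python
import re

# Pattern for the "Killed process" line as it appears in the excerpt above.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<comm>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def kb_to_gib(kb: int) -> float:
    """Convert a size in kB (as printed by the kernel, 1 kB = 1024 bytes) to GiB."""
    return kb / (1024 * 1024)


# Line copied from the kernel log in the question.
line = ("Jun 15 14:45:17 server4 kernel: [1731430.441717] Killed process 13130 "
        "(mongod) total-vm:28280536kB, anon-rss:26174876kB, file-rss:0kB, "
        "shmem-rss:0kB")

m = KILLED_RE.search(line)
anon_rss_gib = kb_to_gib(int(m["anon_rss"]))   # resident anonymous memory
total_vm_gib = kb_to_gib(int(m["total_vm"]))   # total virtual address space

print(f"{m['comm']} (PID {m['pid']}): "
      f"anon-rss {anon_rss_gib:.2f} GiB, total-vm {total_vm_gib:.2f} GiB")
```

The resident anonymous memory comes out just under 25 GiB, so mongod alone was holding almost all of the machine's memory when the kernel killed it.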

Sometimes mongod exits, leaving this in mongod.log:

    2018-06-15T02:14:32.456+0200 F -        [rsSync] out of memory.
    
     0x55cbc8535751 0x55cbc8534d84 0x55cbc8623b4b 0x55cbc86c665c 0x55cbc70fccff 0x55cbc70f8b02 0x55cbc707b3f1 0x55cbc86449b0 0x7fbbf3507494 0x7fbbf3249acf
    ----- BEGIN BACKTRACE -----
    {"backtrace":[{"b":"55CBC6305000","o":"2230751","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55CBC6305000","o":"222FD84","s":"_ZN5mongo29reportOutOfMemoryErrorAndExitEv"},{"b":"55CBC6305000","o":"231EB4B"},{"b":"55CBC6305000","o":"23C165C","s":"_Znam"},{"b":"55CBC6305000","o":"DF7CFF","s":"_ZN5mongo4repl8SyncTail7OpQueueC1Ev"},{"b":"55CBC6305000","o":"DF3B02","s":"_ZN5mongo4repl8SyncTail16oplogApplicationEPNS0_22ReplicationCoordinatorE"},{"b":"55CBC6305000","o":"D763F1","s":"_ZN5mongo4repl10RSDataSync4_runEv"},{"b":"55CBC6305000","o":"233F9B0"},{"b":"7FBBF3500000","o":"7494"},{"b":"7FBBF3161000","o":"E8ACF","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.6.5", "gitVersion" : "a20ecd3e3a174162052ff99913bc2ca9a839d618", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.9.0-6-amd64", "version" : "#1 SMP Debian 4.9.88-1+deb9u1 (2018-05-07)", "machine" : "x86_64" }, "somap" : [ { "b" : "55CBC6305000", "elfType" : 3, "buildId" : "7D4592BDFAA6C15459D2319DEAB7F10E9EB4E7D7" }, { "b" : "7FFC48D98000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "A3207CC9FE1CAA3374AE7061AA5C3C5619B8A0E5" }, { "b" : "7FBBF4743000", "path" : "/lib/x86_64-linux-gnu/libresolv.so.2", "elfType" : 3, "buildId" : "713D47D5F599289C0A91ADE8F0122B2B4AA78B2E" }, { "b" : "7FBBF42B0000", "path" : "/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1", "elfType" : 3, "buildId" : "2CFE882A331D7857E9CE1B5DE3255E6DA76EF899" }, { "b" : "7FBBF4044000", "path" : "/usr/lib/x86_64-linux-gnu/libssl.so.1.1", "elfType" : 3, "buildId" : "E2AA3B39763D943F56B3BD05C8E36E639BA95E12" }, { "b" : "7FBBF3E40000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : "B895F0831F623C5F23603401D4069F9F94C24761" }, { "b" : "7FBBF3C38000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "5D83E0642E645026DBB11F89F7DF7106BD821495" }, { "b" : "7FBBF3934000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : 
"1B95E3A8B8788B07E4F59EE69B1877F9DEB42033" }, { "b" : "7FBBF371D000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "51AD5FD294CD6C813BED40717347A53434B80B7A" }, { "b" : "7FBBF3500000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "4285CD3158DDE596765C747AE210AB6CBD258B22" }, { "b" : "7FBBF3161000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "AA889E26A70F98FA8D230D088F7CC5BF43573163" }, { "b" : "7FBBF495A000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "263F909DBE11A66F7C6233E3FF0521148D9F8370" } ] }}
     mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x55cbc8535751]
     mongod(_ZN5mongo29reportOutOfMemoryErrorAndExitEv+0x84) [0x55cbc8534d84]
     mongod(+0x231EB4B) [0x55cbc8623b4b]
     mongod(_Znam+0x21C) [0x55cbc86c665c]
     mongod(_ZN5mongo4repl8SyncTail7OpQueueC1Ev+0x7F) [0x55cbc70fccff]
     mongod(_ZN5mongo4repl8SyncTail16oplogApplicationEPNS0_22ReplicationCoordinatorE+0x402) [0x55cbc70f8b02]
     mongod(_ZN5mongo4repl10RSDataSync4_runEv+0x111) [0x55cbc707b3f1]
     mongod(+0x233F9B0) [0x55cbc86449b0]
     libpthread.so.0(+0x7494) [0x7fbbf3507494]
     libc.so.6(clone+0x3F) [0x7fbbf3249acf]
    -----  END BACKTRACE  -----

We tried adjusting the cacheSizeGB parameter and reduced it to 10 GB:

      cacheSizeGB: 10.0

But the crash still happens.
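
For context on the 10 GB setting: the MongoDB documentation for 3.4 and later describes the default WiredTiger internal cache as the larger of 50% of (RAM - 1 GB) and 256 MB. The Python sketch below applies that rule to the 26 GB servers described here. The function name is hypothetical, and treating the machine's memory as exactly 26 GB is an assumption.

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache per the documented rule for MongoDB 3.4+:
    the larger of 50% of (RAM - 1 GB) and 256 MB."""
    return max(0.5 * (ram_gb - 1.0), 0.25)


ram_gb = 26.0          # n1-highmem-4 memory, as stated in the question
configured_gb = 10.0   # cacheSizeGB value from the config above

default_gb = default_wiredtiger_cache_gb(ram_gb)
outside_cache_gb = ram_gb - configured_gb

print(f"default cache: {default_gb} GB, configured cache: {configured_gb} GB")
print(f"memory not reserved for the cache: {outside_cache_gb} GB")
```

Under these assumptions the configured value is only 2.5 GB below the default. Note also that `cacheSizeGB` limits only the WiredTiger internal cache; memory that mongod allocates elsewhere (connections, in-flight operations, and so on) is not bounded by it.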

It is worth mentioning that a chunk migration is underway, and the mongod instances that crash are the ones in the shard that receives chunks.

What tuning options can be used to avoid such crashes?

UPDATE: added a small swap of 1 GB, but the out-of-memory crashes still happen.
                                

ssasa (261 rep)

Jun 15, 2018, 01:32 PM • Last activity: Mar 11, 2023, 07:05 AM

Showing page 1 of 20 total questions