Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 votes

1 answers

1707 views

How to identify file for "The system cannot find the file specified." during a pg_ctl -D start?

postgresql replication postgresql-9.2 troubleshooting

I'm getting one message during pg_ctl -D start of the standby server. Perhaps more of a general troubleshooting question. This is what I'm seeing when I do a `pg_ctl -D start`: D:\>pg_ctl -D d:\dev_data_standby start server starting D:\>2016-03-21 08:41:12 EDT 1016 LOG: loaded library "auto_explain"...

                                  I'm getting one message during pg_ctl -D start of the standby server. Perhaps more of a general troubleshooting question.

This is what I'm seeing when I do a pg_ctl -D start:  

    D:\>pg_ctl -D d:\dev_data_standby start
    server starting
    D:\>2016-03-21 08:41:12 EDT   1016 LOG:  loaded library "auto_explain"
            1 file(s) copied.
    The system cannot find the file specified.

The log file shows:

    2016-03-21 08:41:12 EDT   3564 LOG:  database system was interrupted; last known up at 2016-03-21 08:38:08 EDT
    2016-03-21 08:41:16 EDT   3564 LOG:  entering standby mode
    2016-03-21 08:41:16 EDT   3564 LOG:  restored log file "0000000100000002000000F2" from archive
    2016-03-21 08:41:16 EDT   3564 LOG:  redo starts at 2/F2000020
    2016-03-21 08:41:16 EDT   3564 LOG:  consistent recovery state reached at 2/F20000E0
    2016-03-21 08:41:16 EDT   3816 LOG:  streaming replication successfully connected to primary

How can I troubleshoot that error:  The system cannot find the file specified.?

                                

JC5577 (625 rep)

Mar 21, 2016, 01:23 PM • Last activity: Jun 20, 2025, 12:08 PM

1 votes

0 answers

605 views

Unable to create and install pgAgent service

postgresql jobs pgadmin scheduled-tasks troubleshooting

I'm trying to install pgAgent on my device. During the installation using the stackbuilder, the last step requires me to add the operating system username and password as below: [![last prompt window][1]][1] However, after entering the password of my postgres superuser account, i'm shown the below d...

                                  I'm trying to install pgAgent on my device. During the installation using the stackbuilder, the last step requires me to add the operating system username and password as below:

However, after entering the password of my postgres superuser account, i'm shown the below dialog box:

Every resource available online mentions that the username and password should be the superuser's. I've tried using the Windows account details as well as creating a new user within postgres and using its details. Any help would be greatly appreciated. I'm using PostgreQL v13, and pgAgent v4.2.1

Abhinav Shenoy (11 rep)

Jan 19, 2023, 09:52 AM • Last activity: Sep 25, 2024, 10:19 AM

0 votes

1 answers

65 views

Which parameters need to be changed to improve repairs?

cassandra troubleshooting

Could you please explain, what are the parameters need to be changed to improve repair in Casandra. ``` error: Repair job has failed with the error message: Repair command #1158 failed with error Repair session b836f570-700d-11ef-9447-377b26201515 for range [(9164824941183028599,-9164226683701868787...

Could you please explain, what are the parameters need to be changed to improve repair in Casandra.

error: Repair job has failed with the error message: 
Repair command #1158 failed with error Repair session 
b836f570-700d-11ef-9447-377b26201515 
for range [(9164824941183028599,-9164226683701868787], 
(6868832037534058657,6892120122899442963], 

(9091534174371255880,9164824941183028599]] failed with error 
[repair #b836f570-700d-11ef-9447-377b26201515 on test/tab_ext

[(9164824941183028599,-9164226683701868787], 

(6868832037534058657,6892120122899442963], (9091534174371255880,9164824941183028599]]] 
Validation failed in test-east-prod-cassandra-vm0.az.com/10.2.1.5.:7000. 

Check the logs on the repair

Balu (11 rep)

Sep 8, 2024, 01:26 PM • Last activity: Sep 12, 2024, 09:05 AM

0 votes

0 answers

274 views

SSRS - Microsoft.ReportingServices.Diagnostics.Utilities

sql-server ssrs ssrs-2016 troubleshooting

I am troubleshooting an error in SSRS we want to alter a subscription and set it to the next 5 min from now, but when it gets there it does not run. other than that, I see these error messages in the reporting services execution logs : > 0-27!4270!07/01/2024-12:54:47:: e ERROR: Throwing > Microsoft.ReportingServices.Diagnostics.Utilities.NotEnabledException: > , > Microsoft.ReportingServices.Diagnostics.Utilities.NotEnabledException: > The requested functionality is not currently enabled.; question: How to enable these diagnostics utilities? I can query the ReportServer database, I see the errors, but not the details:

select r.* 
from reportserver.dbo.ExecutionLog3 r
where status  'rsSuccess'
order by r.TimeStart DESC

when I open the file rsreportserver.config that in my environment lives at C:\Program Files\Microsoft SQL Server\MSRS13.MSSQLSERVER\Reporting Services\ReportServer I can only see the following line related to diagnostics:

Marcello Miorelli (17274 rep)

Jul 1, 2024, 12:24 PM • Last activity: Jul 1, 2024, 01:48 PM

23 votes

2 answers

24400 views

What can cause a mirroring session to timeout then failover?

sql-server sql-server-2005 mirroring troubleshooting

We have two production SQL Servers running SQL Server 2005 SP4 with cumulative update 3. Both servers run on physical machines that are identical. DELL PowerEdge R815 with 4 x 12 core CPUs and 512GB (yes GB) of ram, with 10GB iSCSI SAN connected drives for all SQL databases and logs. OS is Microsoft Windows Server 2008 R2 Enterprise edition with all SP's and windows updates. OS drive is a RAID 5 array of 3 x 72GB 2.5" 15k SAS drives. SAN is a Dell EqualLogic 6510 with 48 x 10K SAS 3.5" drives, configured in RAID 50, sliced into various LUNs for the 2 SQL Servers, and also shared with an Exchange machine and several VMWare servers. We have over 20 databases, 11 of which are mirrored with high availability using a witness server. The witness server is a lower powered machine running a SQL Server instance that is used for nothing other than providing witness services. The biggest mirrored database is 450GB and generates around 100-300 iops. Database Mirroring Monitor reports a current send rate around 100kb to 10mb per second, and a mirror commit overhead of (typically) 0 milliseconds. The mirror server has no problem keeping up with the principal. We are consistently experiencing mirroring failovers. Sometimes a single database will failover, other times almost all databases will failover simultaneously. For instance, last night, we had 10 of 11 databases failover, the remaining database stayed accessible until I manually failed it over. I have gone through several troubleshooting steps to attempt to identify the problem, but have so far not been able to resolve the issue: 1) The machine came with a Broadcom BCM5709C NetXtreme II 4 port Gigabit network adapter which we initially used as the primary network connection. We have since installed an Intel(R) PRO/1000 PT Dual Port Server Adapter on both machines to eliminate the NIC as the issue. 2) All databases have an automatic full backup nightly along with a log backup for databases involved in mirroring. Log file useage is monitored and rarely gets above 15% used. The log file for the main database is 125GB, consisting of 159 virtual log files that range in size from 511MB to 1GB. TempDB is on it's own LUN, and consists of 24 x 2GB files. 3) SQL Server log on the witness shows no errors other than: The mirroring connection to "TCP://SQL02.DOMAIN.INET:5022" has timed out for database "Data" after 30 seconds without a response. Check the service and network connections. SQL Server log on the primary and secondary servers show messages relating to mirroring: The mirroring connection to "TCP://SQL01.DOMAIN.INET:5022" has timed out for database "Data" after 30 seconds without a response. Check the service and network connections. The mirrored database "Data" is changing roles from "PRINCIPAL" to "MIRROR" due to Role Syncronization. (Synchronization is misspelled here on purpose since that is precisely how the actual message is display.) The mirrored database "Data" is changing roles from "PRINCIPAL" to "MIRROR" due to Failover. The mirrored database "Data" is changing roles from "MIRROR" to "PRINCIPAL" due to Failover from partner. The SQL Server services continue to run and network connections seem to stay up. We consistently have between 500 to 2500 sessions connected to each server (primarily robotic applications that connect to service broker queues on a single database). 4) TCP Chimney and RSS etc are disabled using NET SH syntax. 5) I have ran the SQL Server 2005 Best Practices Analyzer against both machines and find nothing other than the very occasional Application Event Log error 833, none of which are coincident with the failover events: SQL Server has encountered 1 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [F:\Data.MDF] in database [Data] (9). The OS file handle is 0x00000000000010A0. The offset of the latest long I/O is: 0x000007d4b10000). 6) Occasionally we see "The client was unable to reuse a session with SPID XXX, which had been reset for connection pooling. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message." generated by both servers. There appears to be no "earlier" messages that indicate any issue. 7) Occasionally database mail writes an error to the Application event log: Exception Type: Microsoft.SqlServer.Management.SqlIMail.Server.Common.BaseException Message: There was an error on the connection. Reason: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding., connection parameters: Server Name: MGSQL02, Database Name: msdb Data: System.Collections.ListDictionaryInternal TargetSite: Void OpenConnection(Microsoft.SqlServer.Management.Common.SqlConnectionInfo) HelpLink: NULL Source: DatabaseMailEngine StackTrace Information at Microsoft.SqlServer.Management.SqlIMail.Server.DataAccess.ConnectionManager.OpenConnection(SqlConnectionInfo ci) at Microsoft.SqlServer.Management.SqlIMail.Server.DataAccess.DataAccessAdapter.OpenConnection(String dbServerName, String dbName, String userName, String password) at Microsoft.SqlServer.Management.SqlIMail.IMailProcess.QueueItemProcesser.ProcessQueueItems(String dbName, String dbServerName, Int32 lifetimeMinimumSec, LogLevel loggingLevel) I believe the Timeouts are causing the failover; what could cause these timeouts? Obviously if there was an actual network issue such as a bad cable, or a bad switch, that might cause packet loss and therefore a timeout, however what other things could cause timeouts? Blocking? If MSDB, or some other system database had an I/O timeout could that cause the mirroring failover? Thanks for any advice! MSDN has the following to say about the timeout mechanism itself : >The Mirroring Time-Out Mechanism > >Because soft errors are not detectable directly by a server instance, a soft error could potentially cause a server instance to wait indefinitely. To prevent this, database mirroring implements its own time-out mechanism, based on each server instance in a mirroring session sending out a ping on each open connection at a fixed interval. > >To keep a connection open, a server instance must receive a ping on that connection in the time-out period defined, plus the time that is required to send one more ping. Receiving a ping during the time-out period indicates that the connection is still open and that the server instances are communicating over it. On receiving a ping, a server instance resets its time-out counter on that connection. > >If no ping is received on a connection during the time-out period, a server instance considers the connection to have timed out. The server instance closes the timed-out connection and handles the time-out event according to the state and operating mode of the session. netsh interface tcp show global shows: Receive-Side Scaling State : disabled Chimney Offload State : disabled NetDMA State : enabled Direct Cache Acess (DCA) : disabled Receive Window Auto-Tuning Level : disabled Add-On Congestion Control Provider : ctcp ECN Capability : disabled RFC 1323 Timestamps : disabled netsh interface ipv4 show dynamicportrange tcp Protocol tcp Dynamic Port Range Start Port : 1025 Number of Ports : 64510 SELECT name, value_in_use FROM sys.configurations

    Ad Hoc Distributed Queries               0         
    affinity I/O mask                        0         
    affinity mask                            0         
    affinity64 I/O mask                      0         
    affinity64 mask                          0         
    Agent XPs                                1         
    allow updates                            0         
    awe enabled                              0         
    blocked process threshold                5         
    c2 audit mode                            0         
    clr enabled                              1         
    common criteria compliance enabled       0         
    cost threshold for parallelism           4         
    cross db ownership chaining              0         
    cursor threshold                         -1        
    Database Mail XPs                        1         
    default full-text language               1033      
    default language                         0         
    default trace enabled                    1         
    disallow results from triggers           0         
    fill factor (%)                          0         
    ft crawl bandwidth (max)                 100       
    ft crawl bandwidth (min)                 0         
    ft notify bandwidth (max)                100       
    ft notify bandwidth (min)                0         
    index create memory (KB)                 0         
    in-doubt xact resolution                 0         
    lightweight pooling                      0         
    locks                                    0         
    max degree of parallelism                6         
    max full-text crawl range                4         
    max server memory (MB)                   393216    
    max text repl size (B)                   65536     
    max worker threads                       0         
    media retention                          0         
    min memory per query (KB)                2048      
    min server memory (MB)                   52427     
    nested triggers                          1         
    network packet size (B)                  1400      
    Ole Automation Procedures                1         
    open objects                             0         
    PH timeout (s)                           60        
    precompute rank                          0         
    priority boost                           0         
    query governor cost limit                0         
    query wait (s)                           -1        
    recovery interval (min)                  0         
    remote access                            1         
    remote admin connections                 0         
    remote login timeout (s)                 20        
    remote proc trans                        0         
    remote query timeout (s)                 600       
    Replication XPs                          0         
    scan for startup procs                   0         
    server trigger recursion                 1         
    set working set size                     0         
    show advanced options                    1         
    SMO and DMO XPs                          1         
    SQL Mail XPs                             0         
    transform noise words                    0         
    two digit year cutoff                    2049      
    user connections                         0         
    user options                             4216      
    Web Assistant Procedures                 0         
    xp_cmdshell                              1

A while ago, I manually modified the mirroring_connection_timeout value for all mirrored databases to 30 seconds to attempt to remediate the problem; this has simply increased the amount of time between failover events. With the mirroring_connection_timeout setting set at the default of 10 seconds, we see a *lot* more failovers. A comment had asked me to ensure IPSec is disabled, so I'm posting the contents of several netsh commands that display the IPSec configuration of the operating system:


C:\>netsh ipsec dynamic show all
No currently assigned Policy
Mainmode Policies not available.
Quickmode Policies not available.
Generic Mainmode Filters not available.
Specific Mainmode Filters not available.
Generic Quickmode Filters not available.
Specific Quickmode Filters not available.
IPsec MainMode Security Associations not available.
IPsec QuickMode Security Associations not available.

IPsec Configuration Parameters
------------------------------
StrongCRLCheck         : 1
IPsecexempt            : 3

IPsec Statistics
----------------
Active Assoc                : 0
Offload SAs                 : 0
Pending Key                 : 0
Key Adds                    : 0
Key Deletes                 : 0
ReKeys                      : 0
Active Tunnels              : 0
Bad SPI Pkts                : 0
Pkts not Decrypted          : 0
Pkts not Authenticated      : 0
Pkts with Replay Detection  : 0
Confidential Bytes Sent     : 0
Confidential Bytes Received : 0
Authenticated Bytes Sent    : 0
Authenticated Bytes Received: 0
Transport Bytes Sent        : 0
Transport Bytes Received    : 0
Bytes Sent In Tunnels       : 0
Bytes Received In Tunnels   : 0
Offloaded Bytes Sent        : 0
Offloaded Bytes Received    : 0

C:\>netsh ipsec static show all
ERR IPsec : No Policies in Policy Store

---------------------------------------------------- UPDATE: 2012-12-20 We've now moved our production systems onto SQL Server 2012. We've been running this since the morning of December 17th - so far no failovers. However, a couple of days is well within what we saw with the 2005-based systems. In an effort to document the performance of our new systems, I've been looking at sys.dm_os_wait_stats more carefully; and noticed DBMIRROR_DBM_EVENT, which is an undocumented wait type. Graham Kent at Microsoft has an interesting article regarding troubleshooting unexpected failovers and this wait type. I shall recap his findings here: >The customer was experiencing a huge blocking chain built on a high volume OLTP database where all the head blockers were waiting on DBMIRROR_DBM_EVENT. Here is the sequence of events I went through: 1. Review the blocking chain itself - ho help here as all we can see is that we're waiting on DBMIRROR_DBM_EVENT 2. Review the source for the undocumented wait type. Obviously you can't do this outside of MS, but I can say that at the time of writing this wait type represents the wait used when the principal is waiting for the mirror to harden an LSN, meaning that the transaction it's part of cannot commit. This immediately points quite specifically to the problem that the principal cannot commit transactions as it's waiting on the mirror. Now we need to investigate why the mirror is not committing transactions or why the principal doesn't know whether it is. 3. Review the msdb system tables >(a) Look at the [backupset] table to see if the size of the logs produced at the time of the problem are significantly higher then normal. If they were exceptionally large it may be that the mirror was flooded with transactions and could simply not keep up with the volume. This is why books online will tell you sometimes to disable mirroring if you need to do an exceptionally large logged operation such as an index rebuild. (reference for why this is at http://technet.microsoft.com/en-us/library/cc917681.aspx) . Here i used the following TSQL SELECT backup_set_id,backup_start_date,database_name,has_bulk_logged_data,backup_size / 1000 FROM [backupset] where backup_start_date between '2011-01-05 14:00:00' and '2011-01-05 19:30:00' go select round((AVG(backup_size)/1000),0) FROM [backupset] where database_name = 'mydatabase' >(b) secondly I looked at the data in the tables [dbm_monitor_data]. The key here is to locate the timeframe in which we had a problem and then see if we were significant experiencing changes in any of the following: log_flush_rate send_queue_size send_rate redo_queue_size redo_rate >These are all indicators similar to part (a) in that they might show a component or piece of architecture that wasn't responding. For example if the send_queue suddenly starts to grow but the re_do queue doesn't grow, then it would imply that the the principal cannot send the log records to the mirror so you'd want to look at connectivity maybe, or the service broker queues dealing with the actual transmissions. >In this particular scenario we noted that all the counters appeared to have strange values, in that there were log backups going on of normal sizes, but there were no status changes, 0 send queue, 0 redo queue, a flat send rate and a flat redo rate. This is very strange as it implies that the DBM Monitor could not record any values from anywhere over the problem period. >4. Review the SQL Server error logs. In this case there were no errors or information messages whatsoever, but in other scenarios such as this, it’s very common for errors in the 1400 range to be reported, examples of which you can find in other places in my other mirroring blogs, such as this Error 1413 example >5. Review the default trace files – in this scenario I was not provided the default traces, however they are fantastic sources of DBM problem information, as they record state change events on all the partners.This is documented here: >Database Mirroring State Change Event Class >This often gives you a great picture of scenarios such as when network connectivity failed between one or all of the partners and then what the state of the partnership became afterwards. >CONCLUSIONS: >In this particular scenario I’m currently missing 2 key points of data, but that apart I can still make a reasonable hypothesis on the above information. We certainly can say that the blocking was caused by the fact that DBM was enabled to the due the blockers all waiting on the DBMIRROR_DBM_EVENT wait type. Since we know we didn’t flood the mirror with a large logged operation and that this deployment normally runs happily in this mode, we can exclude unusual large operations. This means that we have 2 potential candidates at this stage: 1. Hardware problems on the connectivity between some or all of the partners. 2. CPU exhaustion on the mirror server – simply unable to keep up with redos – the CPU exhaustion could itself be from a process outside of SQL Server or outside of the this mirror partnership. 3. A problem with the mirroring code itself (we’d really need some memory dumps to confirm this though). >Based upon experience I’d suspect 1 or 2, but I always keep an open mind about 3 as well, we’re trying to collect some more data now to look at this problem in more detail.

Hannah Vernon (70988 rep)

Aug 13, 2012, 06:54 PM • Last activity: May 22, 2024, 01:56 PM

2 votes

1 answers

3758 views

Replicaton - The current transaction cannot be committed and cannot be rolled back to a savepoint

sql-server replication transactional-replication scripting troubleshooting

After creating a replication publication, I go ahead and create a subscription using the script below: -----------------BEGIN: Script to be run at Publisher 'MY_PUBLISHER_SERVER'-------------- use [MY_PUBLICATION_DB] exec sp_addsubscription @publication = N'MY_PUBLICATION', @subscriber = N'MY_SUBSCR...

                                  After creating a replication publication, I go ahead and create a subscription using the script below:

    -----------------BEGIN: Script to be run at Publisher 'MY_PUBLISHER_SERVER'--------------
    use [MY_PUBLICATION_DB]
    exec sp_addsubscription 
     @publication = N'MY_PUBLICATION', 
     @subscriber = N'MY_SUBSCRIBER_SERVER', 
     @destination_db = N'ORCA_Repl_Sub', 
     @subscription_type = N'Push', 
     @sync_type = N'automatic', 
     @article = N'all', 
     @update_mode = N'read only', 
     @subscriber_type = 0
    
    
    
    exec sp_addpushsubscription_agent 
     @publication = N'MY_PUBLICATION', 
     @subscriber = N'MY_SUBSCRIBER_SERVER', 
     @subscriber_db = N'ORCA_Repl_Sub', 
     @job_login = N'MY_COMPANY\sqlrpl', 
     @job_password = N'MYP455w0rd',
     @subscriber_security_mode = 1, 
     @frequency_type = 64, 
     @frequency_interval = 0, 
     @frequency_relative_interval = 0, 
     @frequency_recurrence_factor = 0, 
     @frequency_subday = 0, 
     @frequency_subday_interval = 0, 
     @active_start_time_of_day = 0, 
     @active_end_time_of_day = 235959, 
     @active_start_date = 20190423, 
     @active_end_date = 99991231, 
     @enabled_for_syncmgr = N'False', 
     @dts_package_location = N'Distributor'
    GO
    --------------END: Script to be run at Publisher 'MY_PUBLISHER_SERVER'-----------------

to my surprise I get all the following errors when running the second procedure:

    Msg 50000, Level 16, State 1, Procedure msdb.dbo.sp_add_jobstep_internal, Line 255 [Batch Start Line 14]
    The current transaction cannot be committed and cannot support operations that write to the log file. Roll back the transaction.
    Msg 3931, Level 16, State 1, Procedure sys.sp_MSadd_repl_job_unsafe, Line 376 [Batch Start Line 14]
    The current transaction cannot be committed and cannot be rolled back to a savepoint. Roll back the entire transaction.




Basically Replication Scripts Stopped Working.. 

How could be a solution for this?
where to find more information about these errors?

                                

Marcello Miorelli (17274 rep)

Apr 23, 2019, 10:58 PM • Last activity: May 15, 2024, 07:06 PM

0 votes

1 answers

99 views

what sql server is holding in the used memory?

sql-server sql-server-2016 memory dmv troubleshooting

I have a query that shows me how much memory sql server is using, the settings like max memory and min memory, cpu, etc. here is the query: ``` --============================================================ --checking current memory settings and usage --marcelo miorelli --16-feb-2014 --=============...

I have a query that shows me how much memory sql server is using, the settings like max memory and min memory, cpu, etc. here is the query:

--============================================================
--checking current memory settings and usage
--marcelo miorelli
--16-feb-2014
--============================================================
			   
SELECT R.[instance],
       R.[Logical CPU Count],
	   R. [Hyperthread Ratio],
	   R.[Physical CPU Count],
	   R.[Physical Memory (GB)],
	   k.[Max server memory (GB)],
	 FOO.Memory_usedby_Sqlserver_MB,
	 FOO.Locked_pages_used_Sqlserver_MB,
	   k.[Min server memory (GB)],
	   k.[optimize for ad hoc workloads],
	   R.affinity_type_desc,
	FOO.process_physical_memory_low,
	FOO.process_virtual_memory_low,
	   R.virtual_machine_type_desc,
	   R.sqlserver_start_time
FROM (

				SELECT [instance]=@@servername,
					   cpu_count AS [Logical CPU Count], 
					   hyperthread_ratio AS [Hyperthread Ratio],
				cpu_count/hyperthread_ratio AS [Physical CPU Count], 
				CONVERT(decimal(12,2),physical_memory_kb/1024.00/1024.00) AS [Physical Memory (GB)], 
				affinity_type_desc, 
				virtual_machine_type_desc, 
				sqlserver_start_time
				FROM sys.dm_os_sys_info WITH (NOLOCK) 
				

) R 
INNER JOIN (


				SELECT [instance] = @@servername,
				       [Max server memory (GB)]   = CONVERT(decimal(12,2),CAST(p.[max server memory (MB)] AS DECIMAL(12,2))/1024.00),
					   [Min server memory (GB)]   = CONVERT(decimal(12,2),CAST(P.[min server memory (MB)] AS DECIMAL(12,2))/1024.00),
					   [min memory per query (MB)]= CONVERT(decimal(12,2),CAST(P.[min memory per query (KB)] AS DECIMAL(12,2))/1024.00),
					   p.[optimize for ad hoc workloads]

				FROM ( SELECT name, [value_in_use] 
				         FROM sys.configurations) t 
						PIVOT (MAX([value_in_use]) 
							   FOR name IN (
											 [min server memory (MB)], 
											 [min memory per query (KB)], 
											 [max server memory (MB)], 
											 [optimize for ad hoc workloads]
											)) p


			) K
ON R.instance = K.instance

INNER JOIN (

							SELECT [instance] = @@servername,
							CONVERT(decimal(12,2),CAST(physical_memory_in_use_kb AS DECIMAL(12,2))/1024.00) AS Memory_usedby_Sqlserver_MB,
							CONVERT(decimal(12,2),CAST(locked_page_allocations_kb AS DECIMAL(12,2))/1024.00) AS Locked_pages_used_Sqlserver_MB,
							CONVERT(decimal(18,2),CAST(total_virtual_address_space_kb AS DECIMAL(18,2))/1024.00/1024.00) AS Total_VAS_in_GB,
							process_physical_memory_low,
							process_virtual_memory_low
							FROM sys.dm_os_process_memory t
		

           ) FOO
ON R.instance = foo.instance
OPTION (RECOMPILE);

this query gives me this - currently on a not busy dev server:

My question is related to physical_memory_in_use_kb and locked_page_allocations_kb from the dmv sys.dm_os_process_memory. How can I find what objects or whatever sql server is holding in that memory? I can see that in my case there is a minimum memory set to 2 Gb but the memory in use is over 5Gb. It might as well be that I actually do not need to know as there is probably no memory pressure at the moment, but still, is there a way to find out what sql server is holding in there memory?

Marcello Miorelli (17274 rep)

Feb 18, 2024, 07:35 PM • Last activity: Feb 19, 2024, 10:33 AM

2 votes

2 answers

4482 views

How do I find out what's causing processes to wait for table flush?

mysql mariadb mariadb-10.3 troubleshooting

I'm currently using MariaDB 10.3 and am troubleshooting this case where threads will start to build up until the MySQL service eventually has to be restarted. I was able to capture the process list before restarting MySQL, but I have a few questions that I hope will help me head in the right directi...

                                  I'm currently using MariaDB 10.3 and am troubleshooting this case where threads will start to build up until the MySQL service eventually has to be restarted. I was able to capture the process list before restarting MySQL, but I have a few questions that I hope will help me head in the right direction.

In the screenshot below, I've included a portion of the process list, which I sorted by time. 

The same thing has happened two different days, and on both days, the output is very similar. From looking at the processes, there are two things I've observed:

1. There is a longer running process in the statistics state.
2. All of the processes are waiting for the table flush after our database backup process, which seems to be waiting for a table flush itself. I need to confirm with the team what tool is being used for backups, but I believe it might be something from Percona.

The main issue I'm trying to solve here is figuring out what's causing all of these processes to wait for a table flush, but I'd also like to understand the output here. Does this output produce something that looks obvious in terms of what's causing it or is there a particular way you would go about figuring this out? 

Thanks in advance!

kenshin9 (119 rep)

Mar 11, 2022, 12:19 AM • Last activity: Jan 10, 2024, 11:02 PM

1 votes

4 answers

5405 views

Rows deleted in the subscriber - what to do? - SQL Server Transactional Replication

sql-server replication transactional-replication troubleshooting

Someone has delete by mistake a row of a table in the subscribed server/database. I am getting the following error message: > The row was not found at the Subscriber when applying the replicated > (null) command for Table '(null)' with Primary Key(s): (null) (Source: > MSSQLServer, Error number: 205...

                                  Someone has delete by mistake a row of a table in the subscribed server/database.

I am getting the following error message:

> The row was not found at the Subscriber when applying the replicated
> (null) command for Table '(null)' with Primary Key(s): (null) (Source:
> MSSQLServer, Error number: 20598)

Now I have restored the deleted row in the replicated server, but the distributor to the subscriber agent is taking so long to apply the changes (currently nearly 1.5 hours).

How can I make it do this update immediately?

Marcello Miorelli (17274 rep)

Dec 22, 2016, 06:57 PM • Last activity: Oct 13, 2023, 07:34 PM

6 votes

2 answers

4568 views

Increase memory limit when server is stopped

sql-server memory configuration troubleshooting

There was a local SQL Server 2012 instance. Once memory has decreased to critical 128 MB and then has stopped by me. I can not start it (idea: not enough memory). Is it possible to increase this limit in config file or in system registry?

                                  There was a local SQL Server 2012 instance. Once memory has decreased to critical 128 MB and then has stopped by me.

I can not start it (idea: not enough memory).

Is it possible to increase this limit in config file or in system registry?

Aleksandro M Granda (163 rep)

May 13, 2015, 02:42 PM • Last activity: Jun 12, 2023, 08:36 PM

10 votes

4 answers

22733 views

SSRS error "something went wrong please try again later" in web portal

sql-server sql-server-2019 ssrs reporting troubleshooting

I have a brand new installation of SQL Server Reporting Services 2019 on top of MSSQL 2019 Enterprise CU1. SSRS has been installed and configured using default settings, apart from the service account it should use. Both AD and MSSQL instance are configured in minimum permissions only (or close to t...

                                  I have a brand new installation of SQL Server Reporting Services 2019 on top of MSSQL 2019 Enterprise CU1. 

SSRS has been installed and configured using default settings, apart from the service account it should use. 

Both AD and MSSQL instance are configured in minimum permissions only (or close to this state). 

When I am accessing the Web Portal I receive an error saying "please try again later", which isn't much of help. 

I found many similar posts asking for help on the same issue, but wanted to see the answer here.

Bartosz X (503 rep)

Jan 31, 2020, 10:39 AM • Last activity: Jan 28, 2023, 03:59 PM

0 votes

1 answers

97 views

Average Density & cardinality estimation

sql-server cardinality-estimates troubleshooting variable

In a large web application we had a query that was running slow in code, that seemed to be working blazing fast in `SSMS`. After checking the basics like the same `SET` options, we found that there was one difference: `SSMS` used `declare @variable1` and the code used an inline parameter. I never th...

                                  In a large web application we had a query that was running slow in code, that seemed to be working blazing fast in SSMS. After checking the basics like the same SET options, we found that there was one difference: SSMS used declare @variable1 and the code used an inline parameter.

I never though that would make a difference, but the microsoft docs  gave the solution and even a short explanations.

> The following queries are different. The first query uses Average
> Density from the histogram for cardinality estimation, while the
> second query uses the histogram step for cardinality estimation:

SQL 

    declare @variable1 = 123
    select * from table where c1 = @variable1

SQL

    select * from table where c1 = 123

Great, problem solved. But I have no idea what the explanation actually means. I've read up on cardinality estimation  and understand that it's used to select that best query plan. But how does that change when using a variable or not?

- Is using a variable always better?
- If not, is there a better way to find out what to use other then "use one until it fails and check the other"?

This is the first time I really noticed this because it changed from <10ms to 10 seconds, but I'm wondering if other queries are affected by this, but just a lot less.

Hugo Delsing (113 rep)

Dec 20, 2022, 03:41 PM • Last activity: Dec 20, 2022, 07:41 PM

0 votes

1 answers

161 views

Operational stats mention the same table twice, what does this mean?

sql-server statistics troubleshooting

While investigating why my application mentions deadlocking, I fell upon [this question][1], whose answer mentions an SQL query, showing operational logs. However, the results of that query contain the same tablename twice, for some tables. How should I interpret that result? Hereby the mentioned SQ...

                                  While investigating why my application mentions deadlocking, I fell upon this question , whose answer mentions an SQL query, showing operational logs. However, the results of that query contain the same tablename twice, for some tables.

How should I interpret that result?

Hereby the mentioned SQL query:

    SELECT t.name AS [TableName],
       fi.page_count AS [Pages],
       fi.record_count AS [Rows],
       CAST(fi.avg_record_size_in_bytes AS int) AS [AverageRecordBytes],
       CAST(fi.avg_fragmentation_in_percent AS int) AS [AverageFragmentationPercent],
       SUM(iop.leaf_insert_count) AS [Inserts],
       SUM(iop.leaf_delete_count) AS [Deletes],
       SUM(iop.leaf_update_count) AS [Updates],
       SUM(iop.row_lock_count) AS [RowLocks],
       SUM(iop.page_lock_count) AS [PageLocks]
    FROM sys.dm_db_index_operational_stats(DB_ID(),NULL,NULL,NULL) AS iop
    JOIN sys.indexes AS i ON iop.index_id = i.index_id AND
                           iop.object_id = i.object_id
    JOIN sys.tables AS t ON i.object_id = t.object_id AND
                          i.type_desc IN ('CLUSTERED', 'HEAP')
    JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS fi ON fi.object_id=CAST(t.object_id AS int) AND
                                                                                     fi.index_id=CAST(i.index_id AS int)
    GROUP BY t.name, fi.page_count, fi.record_count, fi.avg_record_size_in_bytes, fi.avg_fragmentation_in_percent
    ORDER BY [RowLocks] desc
                                

Dominique (609 rep)

Dec 5, 2022, 11:07 AM • Last activity: Dec 5, 2022, 01:27 PM

1 votes

3 answers

540 views

Find which IP added a specific record to a table in SQL Server

sql-server azure-sql-database connections logging troubleshooting

Let's say I have an SQL Server running on Azure and multiple applications add records to it. The problem is that some records come from an unknown application. Is there a nice way of finding which IP added *a specific record*?

                                  Let's say I have an SQL Server running on Azure and multiple applications add records to it. 
The problem is that some records come from an unknown application.

Is there a nice way of finding which IP added *a specific record*?

Dimitrios Arampatzis (23 rep)

Aug 10, 2022, 08:32 AM • Last activity: Aug 16, 2022, 03:00 PM

3 votes

2 answers

3692 views

SQL SERVER FileTable/FILESTREAM share suddenly becomes unavailable

sql-server sql-server-2014 troubleshooting filetable

We had difficulty setting up a FileTable share on a SQL 2014 instance and after giving up on trying to name it after our instance name, we left the name of the share as mssqlserver, but it finally came online, online meaning if you right click in SSMS on the FileTable table you can "Explore FileTabl...

                                  We had difficulty setting up a FileTable share on a SQL 2014 instance and after giving up on trying to name it after our instance name, we left the name of the share as mssqlserver, but it finally came online, online meaning if you right click in SSMS on the FileTable table you can "Explore FileTable Directory", however it seems to randomly go offline again after some time and you receive the usual canned error message:
![](https://raresql.files.wordpress.com/2014/02/file-location-cannot-be-opened-1-1.png) 

So, after magically disabling/enabling FILESTREAM at the server level, database level via t-sql/ssms/configuration manager it some how comes back, but I have to mess with the setting at all of those spots for it to come back! and there seems to be no magical sequence of steps that makes it come back.

Has anyone out there experienced this? Will a restart of the service/server help this? If not, will a repair of sql server help?

Eric Swiggum (694 rep)

Mar 3, 2017, 02:36 PM • Last activity: Apr 25, 2022, 04:29 PM

2 votes

1 answers

10626 views

Activity Monitor paused

sql-server sql-server-2016 ssms troubleshooting activity-monitor

Some of our end-users still like to use the Activity Monitor in SQL Server. When setting up our SQL-server 2016 servers the activity monitor always ends up in a paused state when the users are not part of the "Administrators" group in Windows (on the server). When the users are part of the Administr...

                                  Some of our end-users still like to use the Activity Monitor in SQL Server.
When setting up our SQL-server 2016 servers the activity monitor always ends up in a paused state when the users are not part of the "Administrators" group in Windows (on the server).

When the users are part of the Administrator-group all data gets through (so probably not a Firewall-problem).
We also tried:

- Adding the users to the "Performance Monitor Users" and "Distributed COM Users"
- In COM: Changing the limits of "My Computer" and the permissions on "Windows Management and Instrumentation".
- The users also have the "View server state" permission in SQL Server.

Do you have any other suggestions? 

Kind Regards,

Tim

chittybang (151 rep)

Apr 2, 2020, 02:39 PM • Last activity: Apr 8, 2022, 08:45 AM

1 votes

0 answers

565 views

Potential causes for unusually long queries (postgres)

postgresql query-performance postgresql-performance performance-tuning troubleshooting

Recently we've noticed some `SELECT` queries taking an unusually long time to complete and are a bit stumped on what the reason could be. Looking at DB (and server) CPU and IO during the time the queries are ran we don't see any spikes; so it doesn't look like it has to do with a heavy DB load. Furt...

Recently we've noticed some SELECT queries taking an unusually long time to complete and are a bit stumped on what the reason could be. Looking at DB (and server) CPU and IO during the time the queries are ran we don't see any spikes; so it doesn't look like it has to do with a heavy DB load. Further, while we don't believe the rows are locked during these queries, if they were they'd be locked for updates, which shouldn't be affecting these SELECT queries. Unusually long in this case means we're seeing what should be very light, simple queries taking around 45s. Is there anything else besides a heavy DB load and lock (both of which seem to not apply to our case here) that could cause a query to lag like this? **Update** with a bit more context that may be helpful: These aren't just some one-off queries. They are run thousands of times a day, for literally the exact same record, and normally complete perfectly fine. Then, seemingly out of nowhere, with no extra load on the DB or anything else obvious that may affect its performance, the query lags. We have appropriate indexes in place on the columns that are in the WHERE clause and running EXPLAIN ANALYZE returns nothing odd. And when we say "simple" we mean something as straightforward as this:

SELECT * FROM abcd WHERE example = 'ex' LIMIT 1;

tmns (11 rep)

Mar 15, 2022, 09:19 AM • Last activity: Mar 16, 2022, 08:07 AM

1 votes

0 answers

445 views

Why do some SQL Server databases go into a "Recovery Pending" state after shutting down / restarting on Linux?

sql-server linux ubuntu sql-server-2019 troubleshooting

I built a secondary PC from old parts as a server for SQL Server databases and I went with Linux/Ubuntu 20.04 because it's free. I knew it would not be easy to set this up (I'm rather a Windows user) and I had quite some issues/challenges along the way but I got it working. My only headache now is t...

                                  I built a secondary PC from old parts as a server for SQL Server databases and I went with Linux/Ubuntu 20.04 because it's free. I knew it would not be easy to set this up (I'm rather a Windows user) and I had quite some issues/challenges along the way but I got it working.

My only headache now is that basically every time I shutdown or reboot the PC, some of the databases are getting the "Recovery pending" state and then I have to run the usual script to fix that. (which for some of my big DBs are taking quite a long time to finish)

I thought I need to manually stop the SQL Server services every time I shutdown the PC so I tried this before that:

    sudo systemctl stop mssql-server

Yet I still had one of the DBs in this pending state after booting. Now maybe the issue is that DB is on a different drive? The DBs not affected by this state are on the main SSD where Ubuntu is installed, if that makes any difference. I store the DBs on internal storage devices (1 SSD + 2 HDD).
Someone hinted on stackoverflow that maybe the HDDs were mounted AFTER the SQL Server services has been started? How could I fix that?

Thanks for any helpful feedback.

EDIT:
I checked the logs, the database files cannot be found:

This might prove that the SQL Server services start to early, I guess.

Ha'gel (19 rep)

Dec 7, 2021, 07:36 PM • Last activity: Dec 8, 2021, 08:50 AM

0 votes

2 answers

1213 views

When queries start executing on standby server the replay lag starts to increase

postgresql replication high-availability master-slave-replication troubleshooting

I have a two servers with following specs: * 8 vCPU, 32768 MB RAM, 640 GB SSD The master Postgres 13.3 database (db1) is installed on first server (Ubuntu 16.04.7) with the following config: ``` shared_buffers = 16GB work_mem = 128MB maintenance_work_mem = 8GB effective_cache_size = 16GB effective_i...

I have a two servers with following specs: * 8 vCPU, 32768 MB RAM, 640 GB SSD The master Postgres 13.3 database (db1) is installed on first server (Ubuntu 16.04.7) with the following config:

shared_buffers = 16GB 
work_mem = 128MB
maintenance_work_mem = 8GB
effective_cache_size = 16GB

effective_io_concurrency = 400
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8

wal_level = logical
synchronous_commit = on
max_wal_size = 4GB
min_wal_size = 32MB
wal_keep_size = 16384
wal_sender_timeout = 60s
checkpoint_completion_target = 0.7

synchronous_standby_names = 'FIRST 1 (db2_slave)'
max_standby_archive_delay = 1800s
max_standby_streaming_delay = 1800s

The standby is a Postgres 13.4 database (db2) installed on second server (Ubuntu 20.04.3) with the following config:

shared_buffers = 24GB
work_mem = 128MB
maintenance_work_mem = 16GB
effective_cache_size = 24GB

effective_io_concurrency = 400
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8

wal_level = logical
synchronous_commit = on
max_wal_size = 4GB
min_wal_size = 32MB
checkpoint_completion_target = 0.7

primary_conninfo = 'host=... port=5432 user=repluser passfile=''...'' application_name=db2_slave'
primary_slot_name = 'db2'
hot_standby = on
max_standby_archive_delay = 1800s
max_standby_streaming_delay = 1800s

If I run iotop -u postgresql on the standby, I see two processes:

2229172 postgres: 13/main: walreceiver streaming DDFD/8E9FE9E0
2229138 postgres: 13/main: startup recovering 000000010000DDFD0000008E

After I run **read request which takes a few seconds** on the standby (SELECT COUNT(*) FROM big_table;), the walreceiver streaming continues to work, but the replica stops syncing:

2229138 postgres: 13/main: startup recovering 000000010000DE0400000017 waiting

I ran this query on master:

SELECT client_addr                                                       as client,
       usename                                                           as user,
       application_name                                                  as name,
       state,
       sync_state                                                        as mode,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn))   as pending,
       pg_size_pretty(pg_wal_lsn_diff(sent_lsn, write_lsn))              as write,
       pg_size_pretty(pg_wal_lsn_diff(write_lsn, flush_lsn))             as flush,
       pg_size_pretty(pg_wal_lsn_diff(flush_lsn, replay_lsn))            as replay,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)) as total_lag
FROM pg_stat_replication;

And the output was:

client    |   user   |   name    |   state   | mode | pending |  write  |  flush  | replay | total_lag 
-------------+----------+-----------+-----------+------+---------+---------+---------+--------+-----------
 ...         | repluser | db2_slave | streaming | sync | 0 bytes | 0 bytes | 0 bytes | 21 MB  | 21 MB
(1 row)

If I execute this request several times, the replay and total lag increases all the time during execution this query (SELECT COUNT(*) FROM big_table). Therefore, I want to know the answers to the questions: 1) Why does the replay lag keep increasing during the execution of an analytical query for replica? 2) Why is the recovery process in the "waiting" state as soon as I start a request to the standby?

Andrei (111 rep)

Oct 30, 2021, 10:12 PM • Last activity: Nov 2, 2021, 05:21 AM

1 votes

2 answers

1969 views

DBA - Checklist Question- TempDB Is Full

sql-server troubleshooting

I'm trying to create some checklists for troubleshooting various issues, in this case, when TempDb is full. I'd like a prioritized repeatable process where possible to help me out in a hurry. When TempDB is full, what are the steps you take to troubleshoot that and fix it? Here is what I have so far...

                                  I'm trying to create some checklists for troubleshooting various issues, in this case, when TempDb is full.   I'd like a prioritized repeatable process where possible to help me out in a hurry.

When TempDB is full, what are the steps you take to troubleshoot that and fix it?  Here is what I have so far, but I know it's far from being adequate as I don't yet know the best steps to take to troubleshoot it.

To me, there are a couple of checklists needed for this.  When the server will let you connect, and when it won't.

**When the server will let you connect:**

 1. Run Sp_Who2 - check for any blocks 
 2. Run Sp_WhoIsActive to try to see what is filling up TempDB 
 3. Kill any SPID's that could be filling up TempDb 
 4. Restart the SQL Server service.

**When the server won't let you connect:**

 1. Restart the SQL Server service.

What other steps do you take, in order, to help you troubleshoot and fix full TempDB?

Thanks for the help!

JM1 (253 rep)

Oct 12, 2021, 12:59 PM • Last activity: Oct 13, 2021, 12:59 PM

Showing page 1 of 20 total questions