
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

3 votes
2 answers
1607 views
DBCC consistency error in SQL Server
I ran DBCC CHECKDB on our production SQL Server 2014 instance. It reported back one consistency error:

Failed: (-1073548784) Executing the query "DBCC CHECKDB(N'MYDB') WITH NO_INF..." failed with the following error: "Table error: Object ID 629577392, index ID 1, partition ID 72057594387039319, alloc unit ID 72057594045596759 (type LOB data). The off-row data node at page (1:6009), slot 24, text ID 15754068079 is not referenced. CHECKDB found 0 allocation errors and 1 consistency errors in table 'MYTABLE' (object ID 629577281). CHECKDB found 0 allocation errors and 1 consistency errors in database 'MYDB'. repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (MYDB).". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly.

Since we do not have a good backup for that database, we had to resort to the scenario below.

What I did to troubleshoot:

1. I restored the most recent backup file from production to UAT and ran DBCC CHECKDB again to reproduce the error. The same consistency error came back.
2. Ran DBCC CHECKTABLE (MYTABLE); the same consistency error shows:
Table error: Object ID 629577392, index ID 1, partition ID 72057594387039319, alloc unit ID 72057594045596759 (type LOB data). 

The off-row data node at page (1:6009), slot 24, text ID 15754068079 is not referenced.
DBCC results for 'MYTABLE'.
There are 53635 rows in 2705 pages for object "MYTABLE".
CHECKTABLE found 0 allocation errors and 1 consistency errors in table 'MYTABLE' (object ID 629577281).
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKTABLE
3. Then implemented the following steps:

-- step 1
ALTER DATABASE [MYDB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
-- step 2
DBCC CHECKDB(N'[MYDB]', REPAIR_ALLOW_DATA_LOSS);
Msg 8964, Level 16, State 1, Line 3
Table error: Object ID 629577392, index ID 1, partition ID 72057594387039319, alloc unit ID 72057594045596759 (type LOB data). The off-row data node at page (1:6009), slot 24, text ID 15754068079 is not referenced.
        The error has been repaired.
There are 53635 rows in 2705 pages for object "MYTABLE".
-- step 3
DBCC CHECKDB ('[MYDB]');   -- re-run to check for any additional errors = no errors reported
-- step 4
ALTER DATABASE [MYDB] SET MULTI_USER;   -- back to multi-user mode = users can access the DB

My questions/concerns:

1. How do I know which data has been lost? From my understanding it doesn't seem like any data was lost, because the table had 53635 rows before the repair and still has 53635 rows after it.
2. Do we need to schedule downtime/an outage for that site when setting the DB to single-user mode?
3. After doing the repair in production, what are some best practices to be aware of?
sqllover2020 (73 rep)
Aug 17, 2021, 03:25 PM • Last activity: Jan 6, 2025, 09:04 PM
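For the DBCC repair question above, one way to confirm that rows (as opposed to orphaned off-row LOB space) were not removed is to compare the repaired table against the pre-repair UAT restore. A minimal sketch, assuming a linked server named [UAT] pointing at the restored copy and a key column named Id; both names are assumptions, adjust them to the real schema:

SELECT Id FROM [UAT].[MYDB].[dbo].[MYTABLE]
EXCEPT
SELECT Id FROM [MYDB].[dbo].[MYTABLE];   -- keys present before the repair but missing after it

Because error 8964 reports an *unreferenced* off-row LOB node, the repair typically just deallocates that orphaned LOB page rather than deleting rows, which is consistent with the unchanged 53635 row count; the comparison above is only a sanity check.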
19 votes
2 answers
30237 views
How do I store phone numbers in PostgreSQL?
Let's assume I want to store phone numbers in a database. I may accept phone numbers from outside of the United States. How would I go about storing these phone numbers?
Evan Carroll (65502 rep)
Feb 18, 2017, 03:58 PM • Last activity: Jul 14, 2024, 05:10 PM
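A minimal sketch of one common approach to the question above: normalize numbers to E.164 on input and store them as text guarded by a CHECK constraint, here wrapped in a domain (the domain and table names are illustrative, not from the question):

CREATE DOMAIN phone_e164 AS text
    CHECK (VALUE ~ '^\+[1-9][0-9]{1,14}$');   -- leading +, then 2 to 15 digits, no punctuation

CREATE TABLE contact (
    id    bigserial PRIMARY KEY,
    phone phone_e164 NOT NULL
);

Display formatting (spaces, dashes, parentheses) then stays an application concern, which keeps comparisons and uniqueness checks on the stored value simple.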
0 votes
0 answers
861 views
Upon PC restart, MySQL service on local computer "started and then stopped"
I'm working in MySQL on a local (Windows) computer. I have a somewhat recurring issue with managing the server, and I cannot replicate it because I do not know what causes it.

Sometimes I'll be doing data manipulation on the server, and then I'll shut down the PC after I am done. Upon PC restart, I get the infamous "the MySQL service on local computer started and stopped" error when trying to start the service via services.msc (or the command line). This error occurs even without editing the my.ini file or any other under-the-hood components of MySQL. It occurs even when there are no mysqld or mysql-esque tasks in the Task Manager.

Most times, I just have to wait an indeterminate amount of time for this error to go away so that I can restart the service successfully, but it is quite annoying. Virtually all of the advice toward a workaround involves re-initializing the server, which would wipe the database. That is NOT an option. Are there any resources out there for successfully starting the service when this error shows up randomly? I think I've noticed that this error occurs more frequently when I force-stop a query in Workbench that involves large tables.

**To anyone who comes across this problem in the future: MySQL apparently continues to complete a process even after you force-stop a query (e.g. in Workbench). If you shut off your PC before this process completes, you'll get the "started and then stopped" error when trying to start the MySQL service in services.msc. Do not re-initialize your server; just wait a few hours, days, etc. for the process to complete. Then you'll be able to restart the service (you'll just have to keep trying to "Start" the service in services.msc until successful).**
ZZ Top (1 rep)
May 24, 2024, 07:29 PM • Last activity: May 30, 2024, 10:33 AM
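A hedged aside on the workaround described above: the long waits are consistent with InnoDB still rolling back the force-stopped statement, and then again as crash recovery after an unclean shutdown. Before powering the PC off, a quick check shows whether a rollback is still in flight (a standard INFORMATION_SCHEMA view in any recent MySQL):

SELECT trx_id, trx_state, trx_started, trx_rows_modified
FROM information_schema.innodb_trx
WHERE trx_state = 'ROLLING BACK';

If rows show up here, letting MySQL finish (or at least stopping the service cleanly and waiting) avoids the "started and then stopped" loop on the next boot.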
0 votes
0 answers
49 views
Data loss and data redundancy when forcing failover to the secondary replica in the DR data center in SQL Server 2016
We are using SQL Server 2016 Enterprise edition and planning to have two datacenters: the primary and a remote disaster recovery (DR) datacenter. We have two replicas (primary replica and secondary replica) in the primary datacenter, running under synchronous-commit mode with automatic failover to provide high availability. The third replica (a secondary replica) is in the disaster recovery datacenter and runs under asynchronous-commit mode to provide datacenter-level disaster recovery.

I have the questions below with respect to data integrity in the case of a forced manual failover when the whole primary datacenter is completely down.

Scenario 1: Say we have a table named Table1 with 4 records, Records 1, 2, 3 and 4, where Record4 was committed on Table1 on the primary replica and the whole server went down before it was committed to the secondary datacenter. How do we sync this data once the primary datacenter is up again? What process do we need to follow, and is there any automated way to do this?

Scenario 2: Say I make DR the primary replica. As there is no Record4 in the DR replica, it will start inserting a Record4 with different data. Even though the Id is the same for Record4, the data for Record4 in the primary datacenter and the secondary datacenter is different. How do we sync it, as a lot of other tables depend on Record4? I am thinking of reseeding DR to some number, say 100, and then changing DR to the primary one. Is there any automated or better alternative way to do this?

What is your experience with the above scenarios? Can you please provide some suggestions?
Manoj Jeganathan (1 rep)
Apr 1, 2024, 09:09 AM • Last activity: May 3, 2024, 11:21 AM
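For the failover question above, a minimal sketch of the usual sequence (the availability group and database names are placeholders):

-- on the DR replica: make it primary, explicitly accepting possible data loss
ALTER AVAILABILITY GROUP [MyAG] FORCE_FAILOVER_ALLOW_DATA_LOSS;

-- when the old primary data center returns, its replicas rejoin as secondaries with data
-- movement suspended; transactions that never reached DR (the divergent Record4) are
-- rolled back there once synchronization is resumed
ALTER DATABASE [MYDB] SET HADR RESUME;

There is no automatic merge of diverged rows: anything on the old primary that must survive has to be copied out and reconciled manually (or with a comparison tool) before resuming data movement.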
4 votes
1 answers
4290 views
Complex constraint across all data in a table
We have a table to record processing that occurs on a system and we need to ensure that only a single row has an 'in process' status. I want to ensure that the result of this is always either zero or one:

select count(id) from jobs where status in ('P', 'G');

We're using explicit transactions so ideally this check would happen on commit and abort the transaction if the invariant doesn't hold. Handling any racy processing that throws the occasional error is much easier for us to deal with than suddenly ending up with more than one job that is 'in process'.

The solution only needs to work with Postgres so we're quite happy to take a non-standard solution to this. We're currently on 8.4, but we'll be upgrading at some point to 9.x if that makes any difference.
KayEss (145 rep)
Aug 9, 2013, 02:53 AM • Last activity: Jan 22, 2024, 11:35 PM
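For the question above, a commit-time check is not strictly needed: a partial unique index enforces "at most one in-process row" at write time, and partial indexes already exist in 8.4. A minimal sketch (index name assumed):

CREATE UNIQUE INDEX only_one_in_process
    ON jobs ((status IN ('P', 'G')))
    WHERE status IN ('P', 'G');

Inside the partial index the expression is always true, so a second session that tries to move a job into 'P' or 'G' while one is already there gets a unique-violation error, which matches the "occasional error is fine" requirement.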
12 votes
5 answers
32142 views
When is it safe to disable InnoDB doublewrite buffering?
MySQL InnoDB allows us to disable doublewrite buffering by setting innodb_doublewrite = 0. Other databases don't seem to allow this setting to be tweaked. How can InnoDB still maintain **data integrity** and ACID if we disable doublewrite buffering? In what situations is it **safe** to turn off the InnoDB doublewrite buffer?
Pacerier (491 rep)
Dec 19, 2014, 06:32 PM • Last activity: Oct 23, 2023, 08:12 PM
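A hedged sketch for the question above: the doublewrite buffer only protects against torn (partially written) pages; crash recovery itself still relies on the redo log either way. Disabling it is therefore generally considered safe only when the storage stack guarantees atomic writes at least as large as innodb_page_size (copy-on-write filesystems such as ZFS are the usual example). The settings below are an illustration, not a recommendation:

SHOW VARIABLES LIKE 'innodb_page_size';
SHOW VARIABLES LIKE 'innodb_doublewrite';

-- my.cnf, only after verifying atomic page-sized writes end to end:
-- [mysqld]
-- innodb_doublewrite = 0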
4 votes
2 answers
696 views
MySQL: Overlay modifications on original data while indexing on a field that itself has modifications
I have a MySQL table of original_data that is important for historical/auditing purposes and should not ever be changed. I want to be able to mark up these data with modifications, e.g. to correct errors in the original data entry, and view the overall composite dataset (changes overlaid on original data), without making any modifications to the table of original_data. My fields are a mix of int, varchar, and datetime; nothing bizarre.

The system currently accomplishes this with the following:

> * A table *original_data* (which is never modified). This is indexed on a datetime field.
> * A table *modifications*, which includes all fields that original_data has. Each row in modifications references the primary key of the row in original_data that is modified.
> * A view *mods_overlay* that joins original_data to modifications on a key, displaying modified data (where it exists) in place of original data. I'm using CASE WHEN statements to return fields from modifications where they exist for a particular row, else return the field from original_data.

So far, so good! The problem is, I have millions of rows, and one of the fields I need to modify is the datetime field on which original_data is indexed. As soon as my mods_overlay view overlays this datetime field, I can no longer efficiently select my data for a specific period of time; runtime for select statements increases from a few seconds to 30+ minutes. MySQL does not support indexing for views, nor does it support materialized views, either of which could help here.

One workaround: I could implement mods_overlay as a table that is pre-computed overnight, but then new modifications cannot be seen until the next day, and I'm not sure if this will be acceptable. Are there other ways to solve this problem without making significant tooling changes? Stated differently: is there a best practice in MySQL for providing a way to overlay a set of modifications on a set of original data (in a table that is never modified), while indexing on a field that can itself have modifications?
Chris Martin (141 rep)
Oct 26, 2015, 02:15 AM • Last activity: Feb 17, 2023, 06:04 PM
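One hedged sketch for the question above: since MySQL has no materialized views, the pre-computed table can be kept current incrementally by triggers on modifications instead of an overnight rebuild. The column names here (event_dt, payload, original_id) are invented for illustration:

CREATE TABLE mods_overlay (
    id       INT PRIMARY KEY,
    event_dt DATETIME NOT NULL,
    payload  VARCHAR(255),
    KEY idx_overlay_dt (event_dt)   -- the index the view could never give you
);

CREATE TRIGGER modifications_ai AFTER INSERT ON modifications
FOR EACH ROW
    REPLACE INTO mods_overlay (id, event_dt, payload)
    SELECT o.id,
           COALESCE(NEW.event_dt, o.event_dt),
           COALESCE(NEW.payload,  o.payload)
    FROM original_data AS o
    WHERE o.id = NEW.original_id;

After a one-time seed of mods_overlay from original_data (which never changes), matching AFTER UPDATE and AFTER DELETE triggers keep edited or withdrawn modifications in sync, and original_data itself stays untouched.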
2 votes
0 answers
456 views
Setting up Database Integrity Checks
I'm relatively new to SQL and, unfortunately, am definitely not at the level of a DBA; however, it's become my job to look into some SQL issues.

There's a database integrity check set to run for one of our customers every evening from 21:30. This usually runs for approximately 2-5 hours (give or take), but in recent times its duration has increased, reaching up to 8-9 hours. I'm not too clear on causation here, but I believe the databases are quite large and this could potentially just come down to how large those databases are.

Reading through https://social.technet.microsoft.com/wiki/contents/articles/54312.sql-server-database-integrity-check-best-practices.aspx suggests separating the integrity checks into phases. My question is this: **Would I be setting up separate scheduled jobs at separate times/days to space these checks out?** Presumably running the below would be no different from simply running CHECKDB, correct?

EXECUTE dbo.DatabaseIntegrityCheck
@Databases = 'ALL_DATABASES',
@CheckCommands = 'CHECKALLOC,CHECKTABLE,CHECKCATALOG'

So, in theory, I would guess that I want three separate scheduled jobs at different times, like so:

EXECUTE dbo.DatabaseIntegrityCheck
@Databases = 'ALL_DATABASES',
@CheckCommands = 'CHECKALLOC'

EXECUTE dbo.DatabaseIntegrityCheck
@Databases = 'ALL_DATABASES',
@CheckCommands = 'CHECKTABLE'

EXECUTE dbo.DatabaseIntegrityCheck
@Databases = 'ALL_DATABASES',
@CheckCommands = 'CHECKCATALOG'

Would we say that this is accurate? Thanks for any time spent on answering this query.

1) Microsoft SQL Server 2012, exact version 11.0.5058.0 (for SSMS)
2) There's one single step in the job, named "Integrity Check", of type "T-SQL", and within it is the script I list above (dbo.DatabaseIntegrityCheck across ALL_DATABASES)
3) Total size of all DBs, from what I can see, would be 1.1 TB
Yeldur (21 rep)
Dec 14, 2022, 11:37 AM • Last activity: Jan 6, 2023, 06:59 AM
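On the question above: yes, CHECKALLOC + CHECKTABLE + CHECKCATALOG together cover roughly what CHECKDB does (CHECKDB additionally validates Service Broker data and, where relevant, indexed views and FILESTREAM links), so splitting them into separate jobs on different nights is a reasonable way to shorten any single window. Another split that is often used with the same stored procedure, sketched here as an option rather than a prescription, is physical-only checks on weeknights and a full check at the weekend:

-- weeknights
EXECUTE dbo.DatabaseIntegrityCheck
    @Databases = 'ALL_DATABASES',
    @CheckCommands = 'CHECKDB',
    @PhysicalOnly = 'Y';

-- weekend
EXECUTE dbo.DatabaseIntegrityCheck
    @Databases = 'ALL_DATABASES',
    @CheckCommands = 'CHECKDB';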
0 votes
1 answers
181 views
Best practices for tracking ingested data sources
I am in the process of creating an ingestion pipeline wherein there is a step that periodically reads new .csv files and stores them in a Postgres database. This step is functioning; however, it is currently impractical/time-consuming to verify whether the data in any given file has been fully and correctly ingested into the database. I am essentially operating on blind trust that the database is a point of truth, but I would like to be a little more certain.

The first step I was planning to take is to store runtime metadata during ingestion jobs (e.g. filename, time of ingestion, job result) in its own table in the database. While this won't speak to the _integrity_ of the data, it would at least allow some insight into what has been processed.

Any guidance on best practices and what else I can do re: data validation for a setup like this would be greatly appreciated!
SIRCAVEC (101 rep)
Oct 31, 2022, 07:31 PM • Last activity: Nov 1, 2022, 02:11 AM
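A minimal sketch of the run-log table described above, with a content hash added so that re-delivered or renamed files are caught as well (all names are placeholders):

CREATE TABLE ingestion_log (
    id           bigserial   PRIMARY KEY,
    filename     text        NOT NULL,
    file_sha256  text        NOT NULL UNIQUE,        -- same content is never loaded twice
    rows_in_file integer     NOT NULL,               -- counted while reading the .csv
    rows_loaded  integer     NOT NULL,               -- counted after the INSERT/COPY
    started_at   timestamptz NOT NULL DEFAULT now(),
    finished_at  timestamptz,
    result       text        NOT NULL DEFAULT 'running'   -- 'ok' / 'failed'
);

Comparing rows_in_file with rows_loaded for each run then gives a cheap completeness check without re-reading the source files.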
2 votes
1 answers
1984 views
How to verify an SSAS backup and how to perform integrity checks in SSAS?
I have a procedure to back up SSAS databases that is working successfully. I also delete the SSAS backup files that are older than 2 days. What I am looking to implement is:

1) How do I verify my SSAS backups? (restore verify only)
2) How do I run a DBCC CHECKDB in SSAS? (database integrity)

But I have not found any substantial documentation for these. Any suggestions? I know about right-click / Validate database in Visual Studio, but I would like an automated process, not manual clicking.
Marcello Miorelli (17274 rep)
Dec 1, 2014, 12:51 PM • Last activity: Jul 27, 2022, 02:00 PM
1 votes
0 answers
42 views
Is creating a table entry from another table entry and copying its values good practice?
Let's say I have an inventory_receives table which contains many inventory_receive_entry records, and I also have an inventory table which contains one inventory_receive_entry.

In my application, an inventory_receives instance has a method that loops over each inventory_receive_entry it has and creates an inventory record by copying the attribute values from inventory_receive_entry to inventory. Is this good practice?

I'm confused about whether I should copy the values from inventory_receive_entry to inventory or just refer to the inventory_receive_entry from inventory. My reason for copying the values instead of referring to them is practicality, since I don't need to write complex queries; but there are people who talk about a single source of truth and the like.

My schema from Ruby on Rails:
create_table "inventory_receives", force: :cascade do |t|
    t.bigint "id"
    t.bigint "supplier_id"
    t.bigint "branch_id"
  end

create_table "inventory_receives_entries", force: :cascade do |t|
    t.bigint "id"
    t.bigint "inventory_receive_id", null: false
    t.bigint "product_id", null: false
    t.integer "quantity", default: 0, null: false
    t.datetime "expiry_date"
    t.string "barcode", default: "", null: false
    t.decimal "price", default: "0.0", null: false
    t.string "batch_code", default: "", null: false
    t.decimal "vat", default: "10.0", null: false
    t.decimal "discount", default: "0.0", null: false
  end

  create_table "inventories", force: :cascade do |t|
    t.bigint "id"
    t.bigint "inventory_receive_entry_id", null: false
    t.integer "quantity", default: 0, null: false
    t.decimal "selling_price", default: "0.0", null: false
  end
joyoy (13 rep)
Feb 7, 2022, 01:50 PM • Last activity: Apr 25, 2022, 10:15 AM
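A hedged sketch of the "reference, don't copy" alternative for the question above, expressed as plain SQL rather than Rails: inventories keeps only its own facts plus the foreign key, and the combined row the application needs comes from a view, so there is a single source of truth. Column names mirror the schema above; the view name is invented:

CREATE VIEW inventory_details AS
SELECT i.id,
       i.quantity,
       i.selling_price,
       e.product_id,
       e.price       AS purchase_price,
       e.expiry_date,
       e.batch_code
FROM inventories i
JOIN inventory_receives_entries e
  ON e.id = i.inventory_receive_entry_id;

Copying the values instead is deliberate denormalization; it is defensible when the receive entry must stay frozen even if it is later corrected, but then the copies should be treated as immutable history rather than working data.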
10 votes
2 answers
2306 views
What is the difference between integrity and consistency?
Whether I read about CAP or ACID, I see that consistency is said to ensure the DB integrity constraints. So I do not understand: why are two terms used to refer to the same thing, or is there a difference between integrity and consistency?

I read that:

> Anyhow, Atomic, Consistent, Isolated, Durable are properties of the transactions. It is true that atomicity + isolation is enough for you to roll-your-own consistency. But we can also roll our own atomicity, roll our own isolation, and roll our own persistence (durability). When we roll our own, we must pay for features with our own blood, sweat, and parentheses. We don't say the properties were given to us by a transaction system.

which suggests that consistency is what the user application can provide itself on top of database integrity constraints. It is not a property provided by the database, as the A, I and D properties are. Why give C the same title as the other, system-provided AID properties?
Little Alien (299 rep)
Dec 2, 2016, 08:57 PM • Last activity: Sep 15, 2021, 03:31 PM
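A small hedged illustration of how the distinction is usually drawn: integrity constraints are the rules the DBMS can enforce declaratively, while ACID consistency is the broader promise that every invariant, including ones only the application knows about, holds at commit. The table and values are invented:

CREATE TABLE account (
    id      int     PRIMARY KEY,
    balance numeric NOT NULL CHECK (balance >= 0)   -- integrity: declared, enforced by the DB
);

BEGIN;
UPDATE account SET balance = balance - 100 WHERE id = 1;
UPDATE account SET balance = balance + 100 WHERE id = 2;
COMMIT;   -- consistency: "total money is unchanged" is an application invariant;
          -- atomicity and isolation are what let the application preserve it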
2 votes
2 answers
1452 views
How do you handle "undo" in databases in general, and PostgreSQL specifically?
I have a complex PostgreSQL-based system which I made myself from the ground up. It contains tons of invaluable (to me) data which I don't want to lose or get corrupted. I'm always paranoid about potential bugs which may, for example, UPDATE every single record in a table instead of a specific one, or other things like that. Any mistake which I would notice shortly after making it, basically. (Not to mention the ones I wouldn't notice until much later... but that's a different question.) Luckily, that has not *yet* really happened, but that's the keyword: *luckily*. It seems like it could happen at any time, by making a small mistake, just once.

If it *did* happen, I would first panic. Then I'd shut down the system ASAP so it's no longer "live" and messing with the PostgreSQL database, while I think through what to do next. Next, I'd probably try to restore a recent backup. I do keep those. But, inevitably, I would lose at least about a day's updates to the database. And it would be very scary and tedious to restore the database, something which I want to do as infrequently as possible.

I've spent countless hours of my life trying to achieve a more "robust" backup mechanism, where it continuously makes small updates to the backup, but every time I try to achieve that, I'm overwhelmed and confused by the documentation and online guides to the point where I shamefully go back to my "full dump about once a day" method, which I at least am familiar with and which works.

Frequently, I wish there were some way to have an "undo" button which could actually restore the database to how it was just a minute ago, after I make a mistake. Perhaps even on a per-table basis, if that makes logical sense. You're going to say something about "transactions", but, well... I don't see how you can notice the error before the change has been "committed". Therein lies the main issue. And PostgreSQL (the only database I use and know anything about) doesn't appear to have a built-in "undo" feature or the ability to revert back to a recent state without fiddling with backups. Is there something important which I've missed?
user16508174 (287 rep)
Aug 22, 2021, 05:49 AM • Last activity: Aug 23, 2021, 02:37 PM
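For the question above, the feature usually reached for is continuous WAL archiving plus point-in-time recovery: a periodic base backup plus archived WAL lets you restore to one minute before the mistake instead of losing a day. A minimal sketch, assuming a reasonably recent PostgreSQL and with the archive path as a placeholder:

ALTER SYSTEM SET wal_level = 'replica';
ALTER SYSTEM SET archive_mode = 'on';
ALTER SYSTEM SET archive_command = 'test ! -f /wal_archive/%f && cp %p /wal_archive/%f';
-- restart, then take a base backup with pg_basebackup;
-- to undo a mistake, restore that backup elsewhere and replay WAL with
-- recovery_target_time set just before the bad statement, then copy the good data back.

It is not a per-table undo button, but it removes the "lose a day" part, and tools such as Barman or pgBackRest wrap exactly this workflow.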
22 votes
1 answers
82455 views
Use of log_bin_trust_function_creators in MySQL
What is the use of log_bin_trust_function_creators? Also, is the value of the parameter dynamic? Will it change if I restart my MySQL service?
tesla747 (1910 rep)
Jul 28, 2015, 11:47 AM • Last activity: Apr 1, 2021, 07:18 AM
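A short hedged sketch for the question above: the variable controls whether users without the SUPER privilege may create stored functions and triggers that are not declared DETERMINISTIC or NO SQL / READS SQL DATA while binary logging is enabled. It is a dynamic GLOBAL variable, so it can be changed at runtime, but a value set that way does not survive a restart unless it is also put in the configuration file:

SHOW GLOBAL VARIABLES LIKE 'log_bin_trust_function_creators';
SET GLOBAL log_bin_trust_function_creators = 1;

-- my.cnf / my.ini, to persist the setting across service restarts:
-- [mysqld]
-- log_bin_trust_function_creators = 1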
0 votes
2 answers
1537 views
Check for spaces or zeros in all columns in all tables
I have multiple tables in an Oracle schema that have columns with incorrect values:

- Number columns where the value is zero, instead of null.
- Text columns where the value is a space, instead of null.

I'm in the process of cleaning up the data. I will eventually create constraints on the columns to prevent bad data being entered in the first place. But before I do that, I need to find the columns that have existing problems and investigate them. To start, I would like to produce a list of columns that have incorrect values:

- For each table in a schema, provide a list of columns that have zeros or spaces as values.

> table_name   column_name   datatype   value     count
> roads        width         number     0         500
> sidewalks    description   varchar2   [space]   10000
> sidewalks    const_year    number     0         2000

Question: What would it take to produce a list like this in Oracle?

*(Note: I'm a public works technician by day, not a database administrator. Layman's terms would be appreciated.)*
User1974 (1527 rep)
Oct 31, 2020, 02:49 PM • Last activity: Nov 4, 2020, 12:15 AM
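A hedged sketch of one way to produce that list in Oracle: loop over the data dictionary and count offending values per column with dynamic SQL. Run it as the schema owner; it prints one line per column that contains zeros (for number columns) or single spaces (for text columns):

SET SERVEROUTPUT ON
DECLARE
    v_cnt NUMBER;
BEGIN
    FOR c IN (SELECT table_name, column_name, data_type
              FROM   user_tab_columns
              WHERE  data_type IN ('NUMBER', 'VARCHAR2', 'CHAR')) LOOP
        EXECUTE IMMEDIATE
            'SELECT COUNT(*) FROM "' || c.table_name || '" WHERE "' || c.column_name ||
            CASE WHEN c.data_type = 'NUMBER' THEN '" = 0' ELSE '" = '' ''' END
            INTO v_cnt;
        IF v_cnt > 0 THEN
            DBMS_OUTPUT.PUT_LINE(c.table_name || '  ' || c.column_name || '  ' ||
                                 LOWER(c.data_type) || '  ' || v_cnt);
        END IF;
    END LOOP;
END;
/

The output matches the shape of the wished-for table; columns worth constraining can then be investigated one by one.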
0 votes
0 answers
194 views
Ensure data integrity in relational DB?
Let's imagine I want to store the following in a relational database:

- A list of quizzes
- A list of questions associated with quizzes
- A list of users
- The answers to questions given by users

I could have a model like this: [image: proposed data model]

However, this doesn't ensure any integrity in the following cases:

- A question has been modified, so the corresponding answer is now wrong
- New questions are added to an existing quiz, so students' scores will be different

With MySQL I don't know of any way to prevent this kind of data corruption other than putting the database in read-only mode. The only alternative I can think of is to use a revision column for both the quizzes and the questions, and then to use a composite primary key that identifies a question by its revision and its id. My current idea is to have a JSON column in the activities table that freezes the state of a quiz at a certain moment in time. Then the hash of this activity can be stored somewhere in a blockchain.

What would be the alternatives?
nowox (123 rep)
Sep 17, 2020, 01:20 PM • Last activity: Sep 17, 2020, 04:05 PM
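A minimal sketch of the revision idea mentioned above (all names invented): answers reference the exact question revision they were shown, so editing a question means inserting a new revision instead of rewriting history.

CREATE TABLE question_revision (
    question_id INT  NOT NULL,
    revision    INT  NOT NULL,
    body        TEXT NOT NULL,
    PRIMARY KEY (question_id, revision)
);

CREATE TABLE answer (
    user_id     INT  NOT NULL,
    question_id INT  NOT NULL,
    revision    INT  NOT NULL,
    answer_text TEXT NOT NULL,
    PRIMARY KEY (user_id, question_id, revision),
    FOREIGN KEY (question_id, revision)
        REFERENCES question_revision (question_id, revision)
);

Scores are then computed against the revisions that were actually answered, and the JSON snapshot (with or without an external hash) becomes an optional extra rather than the only line of defense.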
1 votes
2 answers
1820 views
How to ensure that two or more simultaneous requests to a MySQL table don't interfere
Suppose I have about 1000 simultaneous requests to a table in my DB, each consisting of these statements (in order):

SELECT
INSERT or UPDATE (could be multiple)
DELETE

and I want to make sure that none of the requests interfere with one another and the table is in an updated state for the next incoming request. For example:

Request #1 comes in.
Request #2 comes in.
Request #1 starts to get processed.
Request #2 should start only when Request #1 is processed, so that the data view is consistent and updated for the next request.

I searched for this situation around the web and found two things that could be of help:

1. Transactions
2. Table locking

But when I read about transactions, I found out that while a row is being processed by a transaction, other transactions can still come along and modify that row, which defeats my purpose. Also, in the case of locking, a table can only be locked in one mode, i.e. either READ or WRITE mode, which again fails to keep the data consistent, as while reading a row it may get updated and the data which was just read would become an older version of what the table actually holds right now.

Is there a solid way to achieve this which can make sure that no request interferes with another and the table is in an updated state for the next incoming request? I am using MySQL with PHP if this helps.
FunnelScr (13 rep)
Jul 6, 2020, 03:14 AM • Last activity: Jul 6, 2020, 04:33 PM
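A hedged sketch for the question above (InnoDB; the table and id are invented): a transaction alone is not enough, but SELECT ... FOR UPDATE inside it takes a row lock, so request #2 blocks on that row until request #1 commits and then reads the updated state.

START TRANSACTION;
SELECT * FROM jobs WHERE id = 42 FOR UPDATE;   -- row stays locked until COMMIT
UPDATE jobs SET status = 'processed' WHERE id = 42;
DELETE FROM job_queue WHERE job_id = 42;
COMMIT;

This serializes only the requests that touch the same rows; unrelated requests still run concurrently, which usually scales far better than LOCK TABLES.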
6 votes
3 answers
12373 views
How to verify that postgres 9.3 db has no corruption?
I have a Postgres 9.3 DB running on an Ubuntu server. About a month ago we had some problems with the hardware on the server, which were resolved by our VPS hosting company. The problem was resolved fairly quickly and everything seemed to be working fine. We have backups running using Barman on a different server, and backups and restores were working (I checked).

The first sign of data corruption was a few days ago: I decided to do a full pg_dump of our DB as I do every once in a while, and it failed (invalid page header in block...). The data seemed to have been corrupted a long time ago, around the time of the hardware problem (that was the date on the corrupted record). I resorted to finding the corrupted record, deleted it, and restored it manually. After that I was able to do a full pg_dump.

In order to check for additional corruption, I have set up a different DB server from a backup and run pg_repack on all the tables to verify that I'm able to rebuild all the indexes and tables.

**My questions are:**
**1. How can I be certain that I don't have any additional corruption in my DB?**
**2. What can I do to periodically check my data integrity?**
**3. What else can I do to verify the integrity of our DB besides dumping the whole DB and re-indexing it (which I already did)?**

P.S. I don't have block checksums enabled.
Rubinsh (171 rep)
May 13, 2017, 06:52 PM • Last activity: Jun 9, 2020, 11:12 AM
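For the question above: on 9.3 (no data checksums, and no amcheck extension yet) the practical periodic test is to force every page to be read and watch for "invalid page" errors, which is exactly what a scheduled dump to nowhere does. A hedged sketch, with the table name as a placeholder:

-- cron on the server (or against a restored copy), discarding the output:
-- pg_dump mydb > /dev/null
-- per-table spot checks from inside the database:
SELECT count(*) FROM sometable;   -- forces a sequential read of the table's heap pages
REINDEX TABLE sometable;          -- rebuilds, and thereby re-reads, its indexes

Running these against a restore of the latest backup (as you already do) has the added benefit of proving the backups themselves are readable.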
0 votes
1 answers
42 views
Database system checksumming to prevent network corruption not caught by TCP checksumming
Since TCP checksumming is not sufficient to detect/prevent all corruption, do modern database systems (e.g. SQL Server, MariaDB/MySQL, Oracle, etc.) use any additional checksumming of network traffic as part of their client/server protocol? I did some searching but didn't find anything, which made me wonder if perhaps they don't and just roll the dice.

I got a good answer on Oracle. I'd like to hear about SQL Server and MariaDB/MySQL as well.
g491 (121 rep)
Mar 14, 2020, 04:25 AM • Last activity: Mar 15, 2020, 07:00 PM
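One hedged, vendor-neutral note for the question above: whatever integrity checking a given wire protocol adds on its own, forcing TLS on the connection provides cryptographic integrity (the record-layer MAC/AEAD) well beyond TCP's 16-bit checksum. A MySQL-flavored example of requiring it server-side:

SET GLOBAL require_secure_transport = ON;
-- clients then connect with, e.g.: mysql --ssl-mode=REQUIRED ...

SQL Server and Oracle have analogous options for forcing encrypted (and therefore integrity-protected) client connections.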
0 votes
2 answers
113 views
Is unique index combined with a "not null" constraint equivalent to a PK constraint?
I have these legacy table definitions (from here):

CREATE TABLE [dbo].[JobItems] (
    [ItemId] UNIQUEIDENTIFIER NOT NULL,
    -- lots of other columns
    CONSTRAINT [PrimaryKey_GUID_HERE] PRIMARY KEY NONCLUSTERED ([ItemId] ASC)
);
CREATE UNIQUE CLUSTERED INDEX [JobItemsIndex] ON [dbo].[JobItems]([ItemId] ASC);

which effectively yields two identical indexes. I want to get rid of one of them to reduce insertion time and free up disk space. The table is stored in a production SQL Azure database, so I cannot rebuild the table and I cannot drop the clustered index (SQL Azure requires a clustered index for every table).

My requirement is that I keep all the data integrity guarantees provided by the original definition and get rid of the non-clustered index. It looks like having a NOT NULL constraint and a unique index gets me the same guarantees as a PK constraint, so the PK constraint is redundant and I can drop it along with its underlying index.

Can I just drop the PK constraint here without expecting anything to break?
sharptooth (1331 rep)
Aug 10, 2015, 01:58 PM • Last activity: Mar 13, 2020, 10:12 PM
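A hedged sketch for the question above: with the NOT NULL column and the unique clustered index in place, the non-clustered PK constraint adds no extra row-level guarantee, so the usual step is to check for dependencies and then drop just the constraint (which also removes its underlying index):

-- foreign keys bound to the PK would have to be dropped and recreated first
SELECT name
FROM sys.foreign_keys
WHERE referenced_object_id = OBJECT_ID(N'dbo.JobItems');

ALTER TABLE [dbo].[JobItems] DROP CONSTRAINT [PrimaryKey_GUID_HERE];

One caveat to keep in mind: some tooling and replication features look specifically for a PRIMARY KEY rather than just any unique index, so "nothing breaks" also depends on what else consumes this table.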