
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
1 answer
144 views
Trying to load a file into a database on a virtual machine
I have set up a Postgres db on a Linux VM and have had no issues using a GUI to connect to it. However, I am trying to load a large (32 GB) file onto it, and I am skeptical of the old way I was doing it, as it takes a lot of bandwidth. I set up a Dropbox folder to sync on the VM, which it has, and tried to COPY the file, but got this error: `ERROR: could not open file "~/Dropbox/0ptimus-Jaspin/nation/VoterMapping--NH--03-17-2014-HEADERS.tab" for reading: No such file or directory`. I used the following to try to do the copy: `COPY nh FROM '/Dropbox/0ptimus-Jaspin/VoterMapping--NH--03-17-2014-HEADERS.tab';` Thanks!
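A minimal sketch of one likely fix, assuming the Dropbox folder really is synced on the VM: a server-side `COPY` runs as the postgres server process, so the path must be absolute and readable by that process, and `~` is not expanded. The home-directory prefix below is a placeholder.

```
-- Server-side COPY: absolute path, readable by the postgres OS user
-- (the /home/someuser prefix is a placeholder).
COPY nh FROM '/home/someuser/Dropbox/0ptimus-Jaspin/nation/VoterMapping--NH--03-17-2014-HEADERS.tab';
```

If the file lived on the client instead, psql's `\copy nh FROM ...` would read it client-side and stream it over the connection, at the cost of the bandwidth the question is trying to avoid.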
Ron (1 rep)
Sep 20, 2014, 05:57 PM • Last activity: Jul 20, 2025, 03:09 AM
0 votes
0 answers
30 views
Copying millions of rows to a new table
I have a MySQL table that has 100+ million rows and we need to add a new column to this table. `ALTER TABLE` will lock the table and I can't have that, so my next idea was to make a new table that has the new field and then just move the contents over using a script of some kind. However, I wanted to see if I had other (possibly better/faster) options available to me. The MySQL database is an RDS on AWS (so it's an Aurora database that's MySQL-flavored), so if someone knows of some AWS magic, I'm all ears. There aren't many indexes or relationships set up (this is a legacy system that never had a DBA design things). I have considered using mysqldump to extract the data, but wasn't sure if this is a worthwhile route to go. Any suggestions/ideas would be appreciated.
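Two hedged sketches of common alternatives; the table, column, and `id` primary key below are placeholders, not taken from the question. Depending on the exact MySQL/Aurora version, an in-place `ALTER` may already avoid a long lock; otherwise a PK-ranged batched copy into a pre-created table keeps each statement short.

```
-- Option 1: ask for a non-blocking ALTER; MySQL errors out if the version cannot do it.
ALTER TABLE big_table ADD COLUMN new_col VARCHAR(64) NULL, ALGORITHM=INPLACE, LOCK=NONE;

-- Option 2: copy into a new table in PK ranges so no single statement holds locks for long.
CREATE TABLE big_table_new LIKE big_table;
ALTER TABLE big_table_new ADD COLUMN new_col VARCHAR(64) NULL;

INSERT INTO big_table_new
SELECT t.*, NULL                      -- NULL fills the new trailing column
FROM big_table t
WHERE t.id > 0 AND t.id <= 100000;    -- advance the id range batch by batch
```

With option 2, rows changed in the source after their batch was copied still need a final catch-up pass (or a short cutover window) before the tables are swapped.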
eman86 (1 rep)
Jun 30, 2025, 11:07 PM • Last activity: Jun 30, 2025, 11:24 PM
0 votes
1 answer
280 views
PostgreSQL: Non-continuous replication of new data with COPY
I have a database table with millions of measurements. New data is coming in every day. For analysis, I want to replicate the data to my laptop (local Postgres db). No need for automatic replication; it's OK to start a script for this. I feel like the standard replication solutions are inadequate and over-engineered for my use because

- I need asynchronous replication,
- some solutions are inefficient because they replicate row-wise (yet data on the server is bulk-inserted),
- and they need too much configuration.

I would like to use `COPY TO STDOUT | COPY FROM STDIN`, but here I have to select only the new data. How can I do that? The table has this form:

```
   Column   |           Type
------------+--------------------------
 devicename | character varying(30)
 id         | integer
 timestamp  | timestamp with time zone
 value      | numeric
 variable   | text
```

The PK would be (devicename, id). Note that id alone is not unique because the data is coming from multiple devices. How can I select only new data for COPY? Any other approaches for these replication requirements?
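A minimal sketch of one way to select only the new rows, assuming rows are only ever inserted (never updated) and a per-device watermark is kept on the laptop; the table name and the literal values are placeholders.

```
-- Export only rows newer than the last id already copied for a given device
-- (repeat per device, or drive it from a script).
COPY (
    SELECT devicename, id, "timestamp", value, variable
    FROM measurements
    WHERE devicename = 'device-a'
      AND id > 123456           -- MAX(id) for this device on the laptop
    ORDER BY id
) TO STDOUT (FORMAT csv);
```

The output can be piped straight into a `COPY measurements FROM STDIN (FORMAT csv)` on the local database.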
schoettl (101 rep)
Jul 10, 2018, 11:26 AM • Last activity: May 11, 2025, 04:04 AM
0 votes
0 answers
66 views
SqlBulkCopy - How to force it to use a renamed bcp file
I'm trying to read data from XEL files and input the data into a SQL Server table.

```
$bcp = New-Object System.Data.SqlClient.SqlBulkCopy($connectionString)
$bcp.DestinationTableName = "dbo.InputXELData"
$bcp.WriteToServer($dt);
$bcp.Dispose()
```

Our servers have both the ASE and MSSQL bcp utilities installed, so we had to rename the SQL Server bcp to "mssqlbcp". Now here is the problem: SqlBulkCopy is not working, it is using the ASE bcp, and no data is being read out of the extended event files. Tested on a server which does not have the ASE bcp, and it worked fine. My question is: how can we force SqlBulkCopy to use "mssqlbcp" rather than the ASE "bcp"? I tried looking here but had no luck: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlbulkcopy?view=netframework-4.8.1 Thanks for your help.
Kris (452 rep)
Oct 18, 2024, 08:20 PM • Last activity: Oct 30, 2024, 02:29 PM
2 votes
1 answer
402 views
Should I delete index before bulk-loading data with COPY?
My pet project has one table which is totally rewritten once a day. This project uses a PostgreSQL database. I decided to use `COPY` to insert data from 4 CSV files (about 3 million rows). At the same time there may be some queries against that table from clients. This table has a composite index, and I run `COPY` in a transaction.

- Should the index be deleted before `COPY` and then recreated?
- Or is the index maintained automatically as the new data is inserted?
- Maybe there is a better solution than the one I have chosen?

Queries need the index. I don't understand whether the index will be updated automatically or whether it will need to be rebuilt manually. If manually, then maybe delete it first and then create it after the copy. But I see that in my case I can't delete the index because there are concurrent queries.
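For the second bullet: `COPY` inserts rows through the normal path, so PostgreSQL maintains the index automatically as the data arrives; nothing has to be rebuilt afterwards. Since the whole table is rewritten daily, another common pattern (a sketch with placeholder names, not the only option) is to load a fresh table, index it, and swap it in, so concurrent readers always see a fully indexed table:

```
BEGIN;
CREATE TABLE my_table_new (LIKE my_table INCLUDING DEFAULTS);

COPY my_table_new FROM '/path/to/file1.csv' (FORMAT csv);
-- ... repeat for the other CSV files ...

CREATE INDEX my_table_new_idx ON my_table_new (col_a, col_b);

ALTER TABLE my_table RENAME TO my_table_old;
ALTER TABLE my_table_new RENAME TO my_table;
DROP TABLE my_table_old;
COMMIT;
```

The renames take a brief exclusive lock, so concurrent queries wait a moment at the swap rather than ever running against an unindexed table.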
Rash-77 (21 rep)
May 10, 2024, 10:35 AM • Last activity: May 11, 2024, 09:14 AM
9 votes
5 answers
9074 views
Copying a table (and all of its data) from one server to another?
I have a massive table, let's say 500,000 rows. I want to copy it (schema and data) from one server to another. This is not an upsert or any kind of update; it's a one-off straight copy and paste. What are the idiomatic approaches to this? I've tried:

- Restoring a backup from one server on to another. This is impractical, because SQL Server notoriously cannot restore tables from a backup; it can only restore databases. And my database is huge!
- Using SSMS to script the table's data as a sequence of INSERT statements. This is impractical, because the inserts have to be done row by agonising row. I suspect that this also does awful things to the transaction log, but nobody has attacked me for this yet (I'm running such a script right now, and it's going to take hours).
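One hedged sketch, assuming a linked server to the source already exists (all server, database, and table names below are placeholders): `SELECT ... INTO` creates the destination table and copies the data in one statement, though it brings only the columns, not indexes, constraints, or triggers.

```
SELECT *
INTO dbo.BigTable                                    -- created on the destination
FROM [SourceServer].[SourceDb].[dbo].[BigTable];     -- four-part linked-server name
```

The other idiomatic routes are `bcp` out/in and the SSMS Import/Export Wizard; like the sketch above, they copy data only, so any indexes and constraints have to be scripted separately.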
J. Mini (1237 rep)
Nov 23, 2023, 07:00 PM • Last activity: Apr 9, 2024, 07:49 PM
1 vote
2 answers
6721 views
Where to download bcp command line package for SQL Server 2019?
I have a Windows 10 machine where I would like to run bcp from PowerShell to perform bulk copy operations against SQL Server 2019. Which package do I install to get bcp support on Windows 10? I would rather not install full-blown SQL Server if possible.
user2368632 (1133 rep)
Oct 24, 2022, 10:09 PM • Last activity: May 18, 2023, 04:59 PM
11 votes
4 answers
115339 views
ORA-01502: index or partition of such index is in unusable state problem
I have a table in my Oracle database, where

```
select pkcol, count(*) from myTable group by pkcol having count(*) > 1;
```

yields

```
PKCOL    COUNT(*)
------- ----------
      1          2
      2          2
```

Trying to remove the duplicate rows with

```
delete myTable where pkcol = 1;
```

yields:

> ORA-01502: index 'MYTABLE.PK_MT' or partition of such index is in unusable state.

I'm using Oracle.DataAccess.Client.OracleBulkCopy to fill the table. As far as I understand the documentation from Oracle, PRIMARY KEY constraints have to be checked. Obviously they are not checked, as I found by doing the same bulk copy two times in succession, which ended in duplicates in all rows. Now I'm only using it after deleting all rows, and I'm using a table with a similar primary key as the source. As a result I expect no problems. But embedded deep inside my MSBuild scripts, I end up with just 2 duplicates out of 2210 rows. I guess that ignoring the primary key in the first place is a clear bug. No bulk copy should be allowed to ignore primary key constraints.

**Edit:** Meanwhile I found that the 2 conflicting rows were normally inserted by some script before the bulk copy was called. The problem reduces to my known problem, that the bulk copy doesn't check primary keys here.
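A hedged sketch of the usual cleanup sequence, assuming the unusable index backs a primary-key constraint of the same name (PK_MT); the ROWID trick is a generic dedup pattern, not something specific to OracleBulkCopy. If foreign keys reference the primary key, or the index was created independently of the constraint, the steps need adjusting.

```
-- Check whether the index really is unusable.
SELECT index_name, status FROM user_indexes WHERE index_name = 'PK_MT';

-- Take the index out of the way by disabling the constraint (drops the index by default).
ALTER TABLE myTable DISABLE CONSTRAINT PK_MT;

-- Remove the duplicates, keeping one row per pkcol value.
DELETE FROM myTable a
 WHERE a.rowid > (SELECT MIN(b.rowid) FROM myTable b WHERE b.pkcol = a.pkcol);

-- Re-enabling validates uniqueness and rebuilds the index.
ALTER TABLE myTable ENABLE CONSTRAINT PK_MT;
```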
bernd_k (12389 rep)
Jul 10, 2011, 08:56 PM • Last activity: Nov 2, 2022, 12:18 PM
1 vote
1 answer
1098 views
Inspecting the status of a long-running COPY statement
I need to migrate a table from one PostgreSQL database to another. There was a chance I would need to fix some data, so I exported to a CSV. Then I imported the CSV into the second database with a `COPY` statement. This process has been running for 5 days now. The only way I found to inspect its progress was to compare the sizes on disk. The original table was 95 GB (from psql's `\dt+`), and the CSV was 40 GB, so I thought I could compare the new table size with those numbers. I thought that the new table would stop at 95 GB, or even before. Instead, it's now at 103 GB and who knows when it will stop. Of course, `select count(*)` does not work because the copy happens in its own transaction, so the rows are shielded until it's done. But I know that the table has about 1,500 million rows. So if I could somehow get an estimate of the number of rows currently in the new table, I could compare.
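Two read-only checks that can run from another session (the table name is a placeholder). `pg_stat_progress_copy` exists only on PostgreSQL 14 and later; on older versions, watching the relation size is about as good as it gets.

```
-- PostgreSQL 14+: a running COPY reports bytes and tuples processed.
SELECT relid::regclass, command, bytes_processed, tuples_processed
FROM pg_stat_progress_copy;

-- Any version: watch how fast the new relation is growing on disk.
SELECT pg_size_pretty(pg_relation_size('new_table'));
```

Dividing the current relation size by an average row width (for example, derived from the source table's size and known row count) gives a rough estimate of the rows loaded so far.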
rubik (535 rep)
Jul 8, 2019, 03:15 PM • Last activity: Jul 1, 2022, 05:45 AM
0 votes
1 answer
3332 views
Most efficient method to import bulk JSON data from different sources in postgresql?
I need to import data from thousands of URLs. Here is an example of the data:

> [{"date":"20201006T120000Z","uri":"secret","val":"1765.756"},{"date":"20201006T120500Z","uri":"secret","val":"2015.09258"},{"date":"20201006T121000Z","uri":"secret","val":"2283.0885"}]

Since COPY doesn't support JSON format, I've been using this to import the data from some of the URLs:

```
CREATE TEMP TABLE stage(x jsonb);
COPY stage FROM PROGRAM 'curl https:// .....';
insert into test_table select f.* from stage, jsonb_populate_recordset(null::test_table, x) f;
```

But it is inefficient since it creates a table for every import and it imports a single URL at a time. I would like to know if it is possible (through a tool, script or command) to read a file with all the URLs and copy their data into the database.
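One hedged sketch: keep the URLs in their own table, create the staging table once, and let a `DO` block run one `COPY ... FROM PROGRAM` per URL. The `urls` table and its column are assumptions, and the same server-side privileges as the current approach (superuser or `pg_execute_server_program`) are required.

```
CREATE TABLE IF NOT EXISTS urls (url text PRIMARY KEY);
CREATE TABLE IF NOT EXISTS stage (x jsonb);

DO $$
DECLARE
    u text;
BEGIN
    FOR u IN SELECT url FROM urls LOOP
        EXECUTE format('COPY stage FROM PROGRAM %L', 'curl -s ' || u);
    END LOOP;
END $$;

-- Flatten everything that was staged in a single pass.
INSERT INTO test_table
SELECT f.*
FROM stage, jsonb_populate_recordset(null::test_table, x) f;

TRUNCATE stage;
```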
Lautaro Aguilera (1 rep)
Apr 6, 2021, 03:56 AM • Last activity: Jun 19, 2022, 04:10 AM
2 votes
2 answers
8402 views
Connecting remote MySQL database to local MySQL database?
I want to write PHP code to be embedded in a Drupal 7 module. I want to call a procedure which can copy newly generated data in the local MySQL database to a remote MySQL database. When data is inserted in table `A` of my local database, it should be copied to a specific table `B` on the remote MySQL database.

Table `A` is on the local host. Table `B` is on the remote server.

INSERT data on `A` -> copied to `B`

Is this possible?
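It can be done without any PHP at all if the local MySQL server can reach the remote one. A hedged sketch using the FEDERATED storage engine (which must be enabled on the local server) plus a trigger; every table/column name and the connection credentials are placeholders.

```
-- Local proxy table that writes through to table B on the remote server.
CREATE TABLE B_remote (
  id INT PRIMARY KEY,
  payload VARCHAR(255)
) ENGINE=FEDERATED
  CONNECTION='mysql://user:password@remote-host:3306/remote_db/B';

-- Copy every new row of A to the remote table as it is inserted.
CREATE TRIGGER copy_a_to_b AFTER INSERT ON A
FOR EACH ROW
  INSERT INTO B_remote (id, payload) VALUES (NEW.id, NEW.payload);
```

Note that if the remote server is unreachable, the trigger makes the local insert fail too, so copying from the Drupal module (or via a queue) may be the more robust design.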
Shashank (23 rep)
Jun 28, 2012, 10:41 AM • Last activity: Jun 7, 2022, 09:53 AM
3 votes
2 answers
9425 views
Copy millions of rows to another table in batches (MySQL)
`Table A` is always getting updated (records being inserted or updated) and contains millions of records. I'd like to copy some of these records to a new table, `Table B`. `Table A` and `Table B` have exactly the same schema. How can I copy records from `Table A` to `Table B`? I don't want to consider the data which keeps getting updated in `Table A`; I only want to copy the data which is there when I first queried `Table A`.

I'm trying to copy data in batches. So every time I query for a batch of 500 records from `Table A` and copy them to `Table B`. The next time, I query `Table A` to get the next 500 records using an offset. There is no guarantee that the new set of records is exactly the next batch of 500 records, since `Table A` is always getting updated. The task is to fetch the batches sequentially and guarantee that each batch is exactly the next 500 records.

```
INSERT INTO TableB SELECT * FROM TableA WHERE ...
```

doesn't work, because as I mentioned `Table A` has a lot of data and running this query times out. It needs to be carried out in batches. Creating a temporary table would also require copying it in batches. I tried to use MySQL views, but they have the same problem: the view fetches data from the underlying table, so if the underlying table gets updated, the view fetches the updated data.
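A hedged sketch of the usual fix for the "exactly the next 500" problem, assuming an ever-increasing primary key (called `id` here as a placeholder): capture the current maximum once, then walk up to it with keyset pagination instead of OFFSET, so rows inserted after the start are simply never visited.

```
-- Run once, before the first batch, and remember the value.
SELECT MAX(id) FROM TableA;

-- Each batch: copy the next 500 rows after the last id already copied,
-- never going past the maximum captured above.
INSERT INTO TableB
SELECT *
FROM TableA
WHERE id > 0               -- last id copied in the previous batch
  AND id <= 123456789      -- MAX(id) captured before the first batch
ORDER BY id
LIMIT 500;
```

Rows *updated* after the start will still be copied in their updated form; only a single transaction with a consistent snapshot gives true point-in-time semantics.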
BountyHunter (33 rep)
Nov 10, 2021, 09:29 AM • Last activity: Nov 11, 2021, 12:20 AM
0 votes
1 answer
284 views
PostgreSQL - Best way to incrementally export 1000+ tables every 5 mins
I have 6 PostgreSQL database servers (v11) hosted by a third-party vendor. I don't have access to set up `pg_logical`. Each server has 1 database but 1000+ tables. I want to get the data from these 6 servers into my central PostgreSQL database. The source tables have a PK and a last_updated_timestamp column. I'm just trying to find the best approach to get the data from these tables to my central database server every 5 or 15 minutes. My goal is to sync the whole database to the report database server at a 5-minute interval.
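A hedged sketch of a pull-based approach: generate one incremental `COPY` per table from the catalog on each source server and drive it with a scheduler. The watermark literal, the schema filter, and how the output is applied centrally (for example staged and upserted on the PK) are all left as assumptions.

```
-- Generate the per-table export statements from the catalog.
SELECT format(
    'COPY (SELECT * FROM %I.%I WHERE last_updated_timestamp > %L) TO STDOUT (FORMAT csv);',
    table_schema, table_name, '2020-06-24 13:00:00+00'
)
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE';
```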
TheDataGuy (1986 rep)
Jun 24, 2020, 01:16 PM • Last activity: Jun 25, 2020, 05:40 AM
0 votes
1 answer
384 views
Best way to copy MS SQL (2000) database files to an external drive
We have a system running on Windows 2003, with an MS SQL Server 2000 (Standard) instance as the data source. As we have sold the product, we are in the process of moving everything to the purchasing party. We have cloned the virtual machines, and when they set them up, they are stating that the databases are corrupt and would like copies of the MDF/LDF files directly. What is the cleanest way to do this? There are approximately 90 separate databases configured, and I don't really want to go in and stop/detach/copy/attach/restart each one, especially as I've heard that reattaching doesn't always go well. I also don't know if every database is stored in the same place. I am looking for the least involved method of confirming the storage locations and copying the files of 90+ databases to a connected external hard drive.
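For confirming where the files live, a small sketch that works on SQL Server 2000 (where `sys.master_files` does not yet exist); it only lists locations, and the files still need the instance stopped, the databases detached, or backups taken before they can be copied consistently.

```
-- Physical data/log file paths for every database on a SQL Server 2000 instance.
SELECT DB_NAME(dbid) AS database_name, name AS logical_name, filename AS physical_path
FROM master.dbo.sysaltfiles
ORDER BY dbid, fileid;
```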
JohnP (159 rep)
Jan 31, 2020, 05:20 PM • Last activity: Feb 3, 2020, 02:00 PM
1 vote
2 answers
373 views
SQL Server bulk transfer help needed
We've got this very badly designed logging table that we want to add functionality to. The problem is that it's already a scalability nightmare and we want to fix the design before adding to it, but we only have the nightly upgrade window to do it in. I've seen a lot of articles about the various bulk copy options with SQL Server, claiming "We could move 80M rows in 10 minutes!", but so far my testing doesn't get anywhere near that, and I'd like suggestions on how to improve on what I'm seeing.

Before the upgrade, there's always a full backup. I'm only interested in the end result and don't want a huge transaction log. I also don't want it to take too long, and I don't want to blow out the disk space with transaction logs or temp files. The table's been out there a while, so in our bigger customer dbs it's already over 50 million rows. Each row is about 350-400 bytes. The columns are something like this:

```
IdentityColID int, [type] int, [subtype] int, created datetime, author nvarchar(100), Message nvarchar(max)
```

The problems with the design are:

- The primary clustered key is (type, subtype, created, IdentityColID), so it's an insert nightmare. Page splits all over the place. Even doing a SELECT COUNT(*) takes like 8 minutes.
- There aren't good indexes to support the types of queries desired.

I wanted to make a new table where the primary clustered index is the IdentityColID, add the indexes to support the necessary queries, copy the existing data over, and drop the old table. So far, I tried using BCP to get the data out and importing with

- BCP
- BULK INSERT
- INSERT INTO ... FROM OPENROWSET

The bcp export took about 25 minutes and the imports all took about 1.3 to 1.5 hours. With recovery model Simple, the transaction log didn't grow, but CPU consumption was in the 60-65% range most of the time.

I tried just using T-SQL INSERT INTO NewTable SELECT * FROM OldTable, but even with recovery model Simple, the transaction log gets to 100 gig.

I tried using SSIS data import packages with the from/to model, and the net time was about an hour and 20 minutes. With recovery model Simple, the transaction log stayed small.

Then I tried an SSIS Execute SQL Task package to effectively run the INSERT INTO NewTable... line within SSIS. That got the execution time down to about 1:15, but no matter what the recovery model, the transaction log ended up around 100 gig, though CPU consumption stays modest.

I'd like the end result to be one new table, so the suggestion from some of the articles I've read to parallelize into multiple result tables doesn't seem a profitable path. But so far, I just can't seem to approach those stats from the articles I've read. Anyone have any suggestions on how I can goose this a bit?
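A hedged sketch of the pattern that usually keeps the log small for this kind of one-shot rebuild, assuming the database can sit in the SIMPLE or BULK_LOGGED recovery model during the window (the new table and index names are placeholders): do the copy as a minimally logged load into a heap, then build the clustered key and secondary indexes afterwards.

```
-- Minimally logged copy into a new heap (SIMPLE or BULK_LOGGED recovery assumed).
SELECT IdentityColID, [type], [subtype], created, author, [Message]
INTO dbo.LogNew
FROM dbo.LogOld;

-- Build the new clustered key and supporting indexes once the data is in place.
ALTER TABLE dbo.LogNew
  ADD CONSTRAINT PK_LogNew PRIMARY KEY CLUSTERED (IdentityColID);

CREATE INDEX IX_LogNew_type_subtype_created
  ON dbo.LogNew ([type], [subtype], created);
```

The index builds need sort space (tempdb or the user database), so that trade-off against the 100 GB log is worth measuring on a copy first.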
user1664043 (379 rep)
Dec 21, 2018, 09:29 PM • Last activity: Dec 17, 2019, 06:02 PM
-1 votes
1 answer
548 views
Move a data chunk from a table with 200+ million rows to a new table (MySQL)
Our DB is hosted in AWS MySQL RDS. The data is such that each company that signs up gets its own set of tables, which then receive an enormous amount of data. Currently we want to move a chunk of rows from an existing company table (having 200+ million rows) to a new company table. We need to be careful of the auto-increment ID in the new table and make sure all the data is moved; otherwise a rollback will be required.
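A hedged sketch of the batched move with the auto-increment handling made explicit; the table names and id ranges are placeholders.

```
-- Copy a PK range in batches, keeping the original ids.
INSERT INTO new_company_table
SELECT * FROM old_company_table
WHERE id > 0 AND id <= 100000;      -- advance the range batch by batch

-- Verify each batch before moving on (or before deleting from the source).
SELECT COUNT(*) FROM new_company_table WHERE id > 0 AND id <= 100000;

-- After the last batch, make sure the new table's counter starts above the copied ids.
ALTER TABLE new_company_table AUTO_INCREMENT = 100001;
```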
Asad M (1 rep)
Apr 22, 2017, 05:27 PM • Last activity: Nov 2, 2019, 05:01 PM
0 votes
1 answer
101 views
Copy result of a query to a new PostgreSQL database
I have a Postgres database A with 2 tables, and I need to copy part of these tables (recent records, from the last month) to a new database with the same structure. How can I achieve this? As far as I know, cross-database references are not implemented, so I can't select records from the old database to copy into the new one. I can code, so any ideas are welcome.
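One hedged sketch using the `dblink` extension from inside the new database; the connection string, column list, and the one-month filter column are placeholders, and `postgres_fdw` or a filtered `COPY` piped between the two servers would work just as well.

```
CREATE EXTENSION IF NOT EXISTS dblink;

-- Pull only the recent rows of one table across; repeat for the second table.
INSERT INTO measurements
SELECT *
FROM dblink('host=old-host dbname=a user=me password=secret',
            'SELECT * FROM measurements
              WHERE created_at >= now() - interval ''1 month''')
     AS t(id integer, created_at timestamptz, value numeric);
```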
Tiana987642 (131 rep)
Jun 5, 2019, 11:58 AM • Last activity: Jun 5, 2019, 12:30 PM
1 vote
1 answer
2969 views
SELECT vs COPY (SELECT) TO STDOUT
I need to read more rows than fit in memory, reasonably fast. I see two options:

* `SELECT ...`, where my bindings use a cursor to stream the rows;
* `COPY (SELECT ...) TO STDOUT (FORMAT binary)`, where my bindings decode the binary format into rows.

What are the theoretical differences in performance and behaviour between the above approaches, or are they effectively identical?
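For reference, this is roughly what the first option looks like at the SQL level when a fetch size is configured (the query and fetch size are placeholders); some client libraries issue exactly this for server-side cursors, others use protocol-level portals with the same effect.

```
BEGIN;
DECLARE big_cur NO SCROLL CURSOR FOR SELECT * FROM big_table;
FETCH FORWARD 10000 FROM big_cur;   -- repeat until it returns no rows
CLOSE big_cur;
COMMIT;
```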
Max (119 rep)
Feb 13, 2019, 11:58 AM • Last activity: Feb 14, 2019, 09:48 AM
-1 votes
1 answer
2445 views
Copy in SQL with index vs without index
Recently I received a data dump in CSV format that I am trying to import into my PostgreSQL database. The dump consists of 8 CSV files, each 7 GB in size. I have found that using the COPY command on one file is incredibly slow if the indexes are defined on the table; I have 2 indexes, one on 3 fields and one on 2, whereas COPY into the table with no indexes takes roughly 2 minutes. I was wondering what the best practice is for copying and indexing big data files from CSV. For now, copying and then reindexing seems like my best option. Is there a best practice for that? Why is COPY so slow when the indexes are in place?
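The load-then-index pattern sketched out, with placeholder table, index, and column names; dropping the indexes first and recreating them after all 8 files are loaded is usually much faster than maintaining them row by row during the COPY.

```
DROP INDEX IF EXISTS idx_big_threecol;
DROP INDEX IF EXISTS idx_big_twocol;

COPY big_table FROM '/path/to/dump1.csv' (FORMAT csv);
-- ... repeat for the remaining 7 files ...

CREATE INDEX idx_big_threecol ON big_table (a, b, c);
CREATE INDEX idx_big_twocol   ON big_table (a, b);
```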
user163436 (1 rep)
Oct 22, 2018, 08:05 PM • Last activity: Oct 23, 2018, 04:00 AM
1 vote
1 answer
319 views
What data is duplicated when MySQL/MariaDB BLOB columns are copied?
Let `table_1` be created as follows:

```
CREATE TABLE table_1 (
  id INT AUTO_INCREMENT PRIMARY KEY,
  some_blob BLOB
);
```

Let `table_2` be created as follows:

```
CREATE TABLE table_2 (
  id INT AUTO_INCREMENT PRIMARY KEY,
  some_blob BLOB
);
```

What I want to know is, after I run this table-copying query

```
INSERT INTO table_2 (id, some_blob)
SELECT id, some_blob FROM table_1;
```

will the actual text within each some_blob field of the table_1 table be duplicated and stored on disk, or will the DB have duplicated only pointers to the disk locations containing the BLOB data?

One argument for why BLOB copying must involve the duplication of actual content reasons as follows:

> Duplication of BLOB content is necessary because changes to BLOB data in table_1 should not also take place in table_2. If only the disk pointers were duplicated then content changes in one table would be reflected in the other table, which violates the properties of a correct copy operation.

Now I present an alternative method that the DB could implement to satisfy this copy operation. This alternative shows **the above argument is not necessarily true**. The DB could duplicate only the disk pointers during the execution of the given INSERT statement; then, whenever an UPDATE occurs which seeks to modify the BLOB data in one of the tables, the DB would only then allocate more space on disk to store the new data which is part of the UPDATE query. A BLOB data segment would then be deleted only when there no longer exist any disk pointers to it, and a particular BLOB data segment could potentially have many disk pointers pointing to it.

So which of these strategies does MySQL/MariaDB use when executing the given INSERT statement, or does it use a different strategy?

### Why I am asking this question

Currently I am running a couple of UPDATE queries which are copying large amounts of BLOB data from one table to another in the same database (over 10 million rows of BLOB data). The queries have been running for a while. I am curious about whether the performance is so slow because some of the columns I am comparing are poorly indexed, because these queries are literally copying over the content instead of disk pointers, or perhaps because of both of these reasons. I use an INSERT in the question's example because this simplifies the concept of database internals that I am trying to understand.
AjaxLeung (111 rep)
Jul 25, 2018, 09:40 PM • Last activity: Jul 25, 2018, 10:40 PM
Showing page 1 of 20 total questions