
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

0 votes
2 answers
126 views
Options for syncing from a MASTER table to a COPY table?
I have a MASTER table and a COPY table in separate databases (both are Oracle 12c). I want to sync the MASTER table to the COPY table on a weekly basis.

I can think of a few options for syncing from MASTER to COPY:

- **Sync All:** Delete all rows from COPY and insert all rows from MASTER.
- **Period-based:** Replace (or add) rows in COPY where sysdate - editdate < 7 in MASTER.
- **Period-based with redundancy:** Replace (or add) rows in COPY where sysdate - editdate < 31 in MASTER.
  - This covers situations where rows in previous syncs have failed. The failed rows in MASTER will have been corrected, but we still need to sync them to COPY in future syncs.
- **Manual flag:** Users manually flag rows in MASTER that need to be synced to COPY. The sync process resets the flag in MASTER once the edits have been synced to COPY.
- **Date comparison:** If the date in MASTER is newer than the date in COPY, then sync those rows.

Are there any other options or details that I've missed?
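For illustration, a minimal sketch of the date-comparison option as an Oracle MERGE over a database link; copy_table, master_table@master_link, and the column names are assumptions, not the actual schema:

    -- upsert rows whose MASTER edit date is newer than the COPY's row
    merge into copy_table c
    using (
        select id, payload_col, editdate
        from   master_table@master_link
    ) m
    on (c.id = m.id)
    when matched then
        update set c.payload_col = m.payload_col,
                   c.editdate    = m.editdate
        where c.editdate < m.editdate          -- only touch rows that actually changed
    when not matched then
        insert (id, payload_col, editdate)
        values (m.id, m.payload_col, m.editdate);

A DELETE arm (or a separate anti-join delete) would still be needed if rows can be removed from MASTER.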
User1974 (1527 rep)
Oct 11, 2019, 03:08 PM • Last activity: Apr 11, 2023, 07:38 PM
1 votes
1 answers
2158 views
SQL Server Integration Services (SSIS) Package Failing with no error
I have some SSIS packages that are executed as steps in a SQL Server Agent job. The steps often fail with no error, and I have had to add retries. Now the jobs complete, but I get the warning exclamation mark telling me there has been a failure, and it's these steps that nearly always require a retry. When looking at the failed step in the job history, I can see the package execution log, and then it just randomly cuts off and the step begins a retry. There is no consistency in where it stops, and I know there isn't a problem with the package itself, as it eventually succeeds. Does this sound like a network-related issue? Is there any way to find out what is causing these failures?
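If the packages are deployed to the SSIS catalog (an assumption; file-system or MSDB deployments won't have this), the SSISDB logging tables often hold more detail than the Agent job history. A sketch of the kind of query that can surface it:

    -- most recent messages from failed catalog executions (status 4 = failed)
    select top (50)
           ex.execution_id,
           ex.package_name,
           em.message_time,
           em.message
    from   SSISDB.catalog.executions     as ex
    join   SSISDB.catalog.event_messages as em
           on em.operation_id = ex.execution_id
    where  ex.status = 4
    order by em.message_time desc;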
Michael (11 rep)
Sep 19, 2019, 08:32 AM • Last activity: Jan 31, 2022, 02:07 PM
0 votes
1 answers
58 views
SSIS Deploy Error
I made a package in SSIS that converts data from a DBF file to SQL Server. In Visual Studio it works perfectly, but after I deployed my project to SQL Server it doesn't work. See the image: https://i.sstatic.net/8VaI1.png Does anyone know what is wrong? Thanks for the attention.
rlfaria (1 rep)
Feb 15, 2021, 11:30 AM • Last activity: Feb 23, 2021, 11:46 PM
7 votes
5 answers
19738 views
Does MySQL have a version of Change Data Capture?
We're in the process of phasing out an old system and migrating onto a new one. The last time we phased out an old system, we ran both systems in parallel and integrated data between them until everything in the field was fully migrated. During that process, I was able to build an integration between our legacy system and our new system that leveraged SQL Server's Change Data Capture to track changes and integrate them incrementally. For this next migration, the legacy system we will be phasing out is based on MySQL v5.1.69 instead of SQL Server. I am not familiar with MySQL: are there any technologies similar to CDC that can be leveraged on MySQL, either in our current version or in a newer version that would be worth migrating to?
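MySQL's closest built-in analogue is the binary log: with row-based logging enabled, most CDC-style tools read change events from the binlog. A quick sketch of how to check what's available (the log file name shown is hypothetical):

    -- is binary logging on, and in which format? CDC-style tools generally want ROW
    SHOW VARIABLES LIKE 'log_bin';
    SHOW VARIABLES LIKE 'binlog_format';

    -- list the binlogs and peek at the change events in one of them
    SHOW BINARY LOGS;
    SHOW BINLOG EVENTS IN 'mysql-bin.000001' LIMIT 10;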
njkroes (655 rep)
Apr 22, 2014, 08:30 PM • Last activity: Feb 17, 2021, 12:20 PM
0 votes
1 answers
414 views
Materialized view logs — with condition?
I've created a materialized view log on a WORKORDER table in an Oracle 19c database:

    create materialized view log on my_workorder_system.workorder with rowid;

The plan is to integrate the WORKORDER records into a separate GIS system (Oracle 18c) via a materialized view.

The catch: the WORKORDER table has both spatial and non-spatial workorder records in it (ISGIS = 0 or 1). I only need to sync the records to the GIS database WHERE ISGIS = 1.

If I understand correctly, my materialized view in GIS will tell the MV log in the workorder system to purge the log records after they're synced. However, the sync will only ever happen for workorders where ISGIS = 1. The workorders where ISGIS = 0 will **never get synced** — and therefore are not needed in the MV log.

Is there a way to only generate MV log records for rows that meet a condition? For example:

    create materialized view log on my_workorder_system.workorder with rowid where ISGIS = 1;
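For reference, a WHERE clause isn't part of the CREATE MATERIALIZED VIEW LOG syntax; the filter normally lives in the materialized view itself on the GIS side, and log rows are still purged once every dependent MV has refreshed. A sketch under those assumptions (the dblink name workorder_link and the column list are hypothetical):

    -- on the GIS (18c) side: only ISGIS = 1 rows materialize here
    create materialized view workorder_gis_mv
      refresh fast on demand with rowid
    as
    select workorderid, description, isgis
    from   my_workorder_system.workorder@workorder_link
    where  isgis = 1;

    -- confirm the query is actually fast-refreshable before relying on it
    -- exec dbms_mview.explain_mview('WORKORDER_GIS_MV');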
User1974 (1527 rep)
Nov 30, 2020, 07:30 PM • Last activity: Dec 10, 2020, 11:00 PM
1 votes
4 answers
446 views
Integration: Keep two systems in sync
I have a GIS system with 40 tables, ranging from 1,000 to 60,000 rows per table. The tables are the system of record for **assets** in a municipality.

The GIS assets in the tables get integrated to a Workorder Management System (WMS) on a weekly basis. The integration is based on web services that serve up the GIS tables to the WMS.

**Constraint #1:** The integration to the Workorder Management System is **multi-purpose**.

1. There is a single asset table in the WMS that gets updated, via **cron tasks**, with any edits that have been made to the GIS assets (new assets, changed assets, and decommissioned assets). Only assets that have been edited are updated in the WMS.
2. The integration is also used to dynamically serve up the assets to a **web map** in the WMS (all of the GIS assets are used in the map--not just the assets that have been edited). The map in the WMS connects directly to the GIS web services--it does not use the records in the asset table or the cron tasks.

**Constraint #2:** The WMS cron tasks are **notoriously slow**. Given my organization's infrastructure, my vendor says that the WMS cron tasks will only be able to sync **150 records per minute**.

- Testing is ongoing, but we have been told to only sync the records that actually need to be synced (edits) due to the significant performance concerns. In other words, we can't just integrate or copy *all* records, *all the time*.
- *To give you an idea, this is what the cron task process looks like: REST GIS web service >> JSON object >> Parse the JSON into individual records >> Generate XML for each record >> Process the XML records with Java classes >> Insert the records into the database.*

**Constraint #3:** GIS data is **notoriously messy**. In constraint #2, I mentioned that the records get processed with Java classes. The Java classes check for errors (parent/child, field rules, etc.) and flag any records that fail.

- These records do not get integrated into the WMS.
- It is up to the GIS teams to correct the errors in the GIS tables; then we'll try again in the next integration instance (next week) to sync the GIS records to the WMS.

**Question:** Given the constraints above, I think I need to figure out a way to integrate all the GIS assets to the WMS (constraint 1.2), but also flag the records that need to be synced due to edits (constraint 1.1).

- For edited assets that failed to sync, I need to **retry** them in future syncs until they are successful.
- And I need to avoid syncing records unnecessarily, due to the performance concerns.

**How can I do this?**
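One pattern that fits these constraints is a small sync-status table on the GIS side that the cron task reads from and writes back to; edited or previously failed rows stay flagged until a sync succeeds. A sketch with hypothetical names (gis_sync_status, gis_sidewalks, last_edited_date):

    -- one row per asset per source table; only the cron task consults this
    create table gis_sync_status (
        asset_id     number(10)    not null,
        asset_table  varchar2(30)  not null,
        last_synced  date,
        fail_count   number(3)     default 0 not null,
        constraint gis_sync_status_pk primary key (asset_id, asset_table)
    );

    -- candidates for the next weekly run: never synced, edited since the last
    -- successful sync, or failed previously (fail_count keeps them in scope)
    select g.id
    from   gis_sidewalks g
    left join gis_sync_status s
           on s.asset_id = g.id
          and s.asset_table = 'GIS_SIDEWALKS'
    where  s.asset_id is null
       or  s.last_synced is null
       or  g.last_edited_date > s.last_synced
       or  s.fail_count > 0;

The web map keeps reading the full web services as it does today; only the cron task filters on this table.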
User1974 (1527 rep)
Sep 22, 2019, 11:08 PM • Last activity: Nov 9, 2020, 04:36 AM
2 votes
1 answers
386 views
Scheduled snapshots of views (without using materialized views or Oracle Golden Gate)?
I have 40 views in an Oracle 18c GIS database that are used in a map in a workorder management system (WMS).

- The views are served up to the WMS map via a web service.
- There are an average of 10,000 rows per view.

The views have joins to dblink tables in a separate Oracle database and, as a result, are not fast enough for use in the WMS map (3-second map refresh delay). Furthermore, it seems like a bad idea to compute the views each time a user refreshes the map, since the map does not need to be up-to-date in real time.

As an alternative, I would like to take snapshots of the views on a weekly basis. The snapshots would be static tables that would perform well in the WMS map.

**The Catch:** Unfortunately, due to office politics, using technology like materialized views or Oracle GoldenGate to solve this problem is not an option.

What are my options for taking scheduled snapshots of Oracle views (without using materialized views or GoldenGate)? For example, I could make a .SQL script that truncates static tables and inserts the rows from the views into the tables (although, as a novice, I don't know how efficient or risky that option would be, or whether there are better alternatives).
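The truncate-and-reload idea can be scheduled entirely inside the database with DBMS_SCHEDULER. A minimal sketch for one view, assuming hypothetical names gis_roads_vw and gis_roads_snapshot:

    begin
      dbms_scheduler.create_job(
        job_name        => 'REFRESH_GIS_SNAPSHOTS',
        job_type        => 'PLSQL_BLOCK',
        job_action      => q'[
          begin
            execute immediate 'truncate table gis_roads_snapshot';
            insert into gis_roads_snapshot select * from gis_roads_vw;
            commit;
          end;]',
        repeat_interval => 'FREQ=WEEKLY; BYDAY=SUN; BYHOUR=2',
        enabled         => true);
    end;
    /

Note the map would see an empty table between the truncate and the insert; loading into a staging table and swapping (rename or partition exchange) avoids that window.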
User1974 (1527 rep)
Dec 29, 2019, 12:49 AM • Last activity: Sep 20, 2020, 03:42 PM
0 votes
1 answers
262 views
Alternative methods for client data integration (Azure SQL Database)
I have an application that ingests data from clients on a daily/weekly basis (two different data sets, one daily and the other weekly) into a SQL Azure Database. The clients' data source depends on what software they use, so it can vary from client to client. I currently have two methods of integration, depending on the client:

1. Using Azure Data Factory and the Self-Hosted Integration Runtime. In this method, the client is required to provide (within their network) a VM in which I set up the Integration Runtime, and a SQL Server database with just two tables where they dump the two datasets as required. In ADF, I create pipelines to pull the data directly from their SQL Server into my Azure SQL Database, then run the necessary import procedures.
2. Using Azure Data Factory and BLOB Storage. In this method, I provide the client with a set of PowerShell scripts to be run on a schedule (Windows Task Scheduler) that help them copy their exported files (.CSV) to our BLOB storage. Then, the ADF pipelines copy from the BLOB storage to the Azure SQL DB, then run the necessary import procedures.

The first method is much simpler, but in terms of infrastructure at the client end, it seems like a bit of overkill to set up a mostly blank Windows VM and a database with just a couple of data dump tables. Obviously, this can be costly if the clients themselves are cloud-hosted - firing up a new VM is not cheap, so it could make them think twice about using our product.

The second method requires me to set up a storage container for each client, which I feel could make administration difficult as we scale up. Also, providing scripts to run with Windows Task Scheduler doesn't feel overly elegant.

Does anybody have any alternative solutions to this scenario? Or am I on the right track? Any insights would be greatly appreciated. Thanks.
brad (11 rep)
Feb 24, 2020, 12:15 AM • Last activity: Aug 31, 2020, 05:20 AM
0 votes
0 answers
2129 views
Creating blank/deleting rows from Excel using SSIS
I have an .xlsx file I import data from nightly. Once the import is complete, I delete the tables (sheets/tabs) in the Excel file using an Execute SQL Task so that the file is blank for the user but contains the correct header rows:

    DROP TABLE `sheet1$`
    GO
    DROP TABLE `sheet2$`
    GO

I then re-create the tables (sheets/tabs) using another Execute SQL Task on completion of the last:

    CREATE TABLE `sheet1$` (
        `Column 1` LongText,
        `Column 2` LongText
    )
    GO
    CREATE TABLE `sheet2$` (
        `Column 1` LongText,
        `Column 2` LongText,
        `Column 3` LongText,
        `Column 4` LongText,
        `Column 5` LongText,
        `Column 6` LongText
    )
    GO

The problem I have is that, say, 10 rows existed on day 1. 10 records are imported. The tables (sheets/tabs) are deleted. The tables (sheets/tabs) are re-created, but for some reason the number of rows is still part of the sheet. So, for instance, if on day 2 no new records were added to the file, then the import imports 10 blank rows. It's like they are being cleared but Excel is holding some sort of reference to the data rows. If the file has not been added to for a few days, then it just continues to import 10 blank rows each night, filling up the table with nonsense rows.

Obviously I can't use the following, otherwise I wouldn't have to drop and re-create:

    DELETE * FROM `sheet1$`

What am I missing/doing wrong? There are many workarounds I can think of, such as a SQL script that deletes NULL rows from the SQL table they are imported to, but this messes up the auto-increment ID field, which I need to be the same as the record count. So I could:

- Import to an interim temp table and only import rows that contain data to the final table, or
- After import, copy the data to another table, delete NULL rows, truncate the original table, and import back (retains the ID increment), etc.

I am sure the workarounds are not needed and are complicating the solution. Thanks.
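A sketch of the first workaround (filtering through a staging table) in T-SQL; dbo.Staging_Sheet1 and dbo.Final_Sheet1 are hypothetical names:

    -- land the sheet into a staging table first, then keep only non-empty rows
    insert into dbo.Final_Sheet1 (Column1, Column2)
    select Column1, Column2
    from   dbo.Staging_Sheet1
    where  Column1 is not null
        or Column2 is not null;

    -- clear the staging table for the next nightly run
    truncate table dbo.Staging_Sheet1;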
Round (123 rep)
May 11, 2020, 09:16 AM
1 votes
3 answers
376 views
View: Ignore a left join if it is not used?
I have a web service that references a **view** called gis_sidewalks_vw.

    create table gis_sidewalks (
      id               number(10,0),
      last_edited_date date
    );

    insert into gis_sidewalks (id, last_edited_date) values (1, TO_DATE('2019/01/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    insert into gis_sidewalks (id, last_edited_date) values (2, TO_DATE('2019/02/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    insert into gis_sidewalks (id, last_edited_date) values (3, TO_DATE('2019/03/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    insert into gis_sidewalks (id, last_edited_date) values (4, TO_DATE('2019/04/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    commit;

    create table maximo_assets (
      id           number(10,0),
      lastsyncdate date
    );

    insert into maximo_assets (id, lastsyncdate) values (1, TO_DATE('2019/04/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    insert into maximo_assets (id, lastsyncdate) values (2, TO_DATE('2019/03/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    insert into maximo_assets (id, lastsyncdate) values (3, TO_DATE('2019/02/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    insert into maximo_assets (id, lastsyncdate) values (4, TO_DATE('2019/01/01 00:00:00', 'yyyy/mm/dd hh24:mi:ss'));
    commit;

    create or replace view gis_sidewalks_vw as (
      select s.id,
             s.last_edited_date as gis_last_edited_date,
             a.lastsyncdate     as maximo_lastsyncdate,
             case when s.last_edited_date > a.lastsyncdate then 1 end sync_needed
      from   gis_sidewalks s
      left join maximo_assets a
             on s.id = a.id
    );

    select * from gis_sidewalks_vw;

            ID GIS_LAST_EDITED_DATE MAXIMO_LASTSYNCDATE SYNC_NEEDED
    ---------- -------------------- ------------------- -----------
             1 01-JAN-19            01-APR-19
             2 01-FEB-19            01-MAR-19
             3 01-MAR-19            01-FEB-19                     1
             4 01-APR-19            01-JAN-19                     1

The view has a **left join** and a **calculated column**:

    case when s.last_edited_date > a.lastsyncdate then 1 end sync_needed
    ...
    left join maximo_assets a

**Scenario:**

The view & the web service are **multi-purpose**.

Purpose #1: Serve up **only** the rows where sync_needed = 1 to a cron task (synced weekly to a separate database).

Purpose #2: Serve up **all the rows** in the view to a web map (the map is in constant use).

**Problem:**

In purpose #1, it makes sense to join to the maximo_assets table and generate the calculated column. However, in purpose #2, it does **not** make sense to join to the maximo_assets table and generate the calculated column. Unsurprisingly, with purpose #2, I am experiencing **performance issues** in the web map due to the unnecessary join.

**Question:**

Is there a way to design the view so that it **ignores** the join to the maximo_assets table if the join is not being used? For example:

    select id,
           gis_last_edited_date
           --maximo_lastsyncdate
           --sync_needed
    from   gis_sidewalks_vw;
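One approach worth testing (a sketch, not guaranteed on every Oracle release): give maximo_assets a primary or unique key on id. With that declared, the optimizer can apply join elimination and drop the outer join whenever a query against the view selects only the gis_sidewalks columns:

    -- a unique key lets the optimizer prove the left join cannot change row counts
    alter table maximo_assets add constraint maximo_assets_pk primary key (id);

    -- the web-map query: only sidewalks columns, so the join can be eliminated
    select id, gis_last_edited_date
    from   gis_sidewalks_vw;

Comparing the execution plan before and after adding the constraint would confirm whether the join disappears; otherwise, a second, join-free view just for the map is the fallback.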
User1974 (1527 rep)
Sep 28, 2019, 10:16 PM • Last activity: Dec 19, 2019, 06:22 PM
0 votes
1 answers
112 views
Get notification about invalid views
I have 40 views that are integrated/synced to a work order management system on a weekly basis. The views are based on GIS tables, which are *notoriously messy*. Over time, the views can end up becoming **invalid**.

Example:

- A user deletes or renames a column in an underlying table but fails to notify I.T. of the change, so the view is not updated accordingly.
- As a result, the view would become invalid: ORA-04063: view "ROADS_VW" has errors or ORA-00904: "FIELD1": invalid identifier.

I would like to catch and fix invalid views (or fix the underlying data) before the views are synced to the work order management system each week.

**Is there a way to get Oracle to notify me about invalid views?** For instance, if the integrations occur on Fridays, then get an email on Thursdays if any of the views are invalid.
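The data dictionary already tracks this, so a scheduled job could run a check like the following on Thursdays and mail the result; the GIS owner name is an assumption:

    -- views invalidated by underlying table changes
    select owner, object_name
    from   dba_objects
    where  object_type = 'VIEW'
    and    status = 'INVALID'
    and    owner = 'GIS';

    -- optionally try a recompile first, so only genuinely broken views remain
    -- exec dbms_utility.compile_schema(schema => 'GIS', compile_all => false);

Wrapping this in a DBMS_SCHEDULER job that sends mail (for example via UTL_MAIL, which needs SMTP configured) would cover the notification part.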
User1974 (1527 rep)
Oct 13, 2019, 06:05 PM • Last activity: Oct 14, 2019, 08:27 AM
1 votes
1 answers
84 views
How to fix data discrepancies between columns from two tables
The problem: I am working for a retail site that sells various products for cars and trucks. We want to integrate the eBay and Amazon APIs to list our products on those stores. The problem arises when I try to send vehicle compatibility data.

Example from our DB:

    Product ID | Other IDs | Make  | Model
    *number*   | *number*  | Ford  | F150 Crew Cab
    *number*   | *number*  | Chevy | Silverado 2500/3500
    *number*   | *number*  | Dodge | Ram Pickup 2500-5500 HD

Example from the eBay Master Vehicle List DB:

    Make      | Model
    Ford      | F-150
    Chevrolet | Silverado 2500
    Dodge     | Ram 2500

As seen from the examples, eBay/Amazon doesn't recognize our Models. There is no easy solution like using a "split" function to split Models into Model and Sub-model, for example, because every case is different. Sometimes Model fields have a Submodel as well, sometimes they have a range of Submodels (2500-5500), sometimes the Model is not correct (F150 instead of F-150), etc.

**Solution:** Changing every Model in the database to a correct Model format is not an option, because our system already uses these incorrect Models for business logic. Writing a million exceptions in the code to handle incorrect Models is not very appealing either. The only solution I see is to create a new table and fill it manually with correct model names and their corresponding IDs for our products. This would be a very time-consuming task, though.

Could anyone suggest a good solution for this problem?
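A sketch of the mapping-table idea from the question, with hypothetical names; it keeps the internal Model values untouched and resolves them per marketplace at listing time:

    -- one row per internal model per marketplace
    create table model_mapping (
        internal_model  varchar(100) not null,
        marketplace     varchar(20)  not null,   -- 'EBAY' or 'AMAZON'
        external_make   varchar(50)  not null,
        external_model  varchar(50)  not null,
        primary key (internal_model, marketplace)
    );

    insert into model_mapping values ('F150 Crew Cab',           'EBAY', 'Ford',      'F-150');
    insert into model_mapping values ('Silverado 2500/3500',     'EBAY', 'Chevrolet', 'Silverado 2500');
    insert into model_mapping values ('Ram Pickup 2500-5500 HD', 'EBAY', 'Dodge',     'Ram 2500');

Where one internal Model spans several marketplace models (e.g. 2500/3500), the mapping would need to allow multiple rows per internal model, so the primary key above is itself an assumption.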
OutFall (167 rep)
Mar 6, 2017, 07:47 PM • Last activity: Oct 31, 2018, 06:02 PM
2 votes
1 answers
263 views
Data from different tables into one table
I have a requirement where I need to store information from different sources in my system and then process that data. When I say "from different sources", it will be a completely different table structure from each source, and these sources will be effectively unlimited. If I have to maintain the same table structure as each source, I'll have to create multiple tables per source, and with thousands of sources my database will be bloated with a lot of tables.

All this information is accessed within my system, so for each source I need to maintain mapping information so that I can query a particular table to get that source's data. Hence I came up with a single table structure to store that data, and it looks like the below (tables with their column information):

    TableMetaData
        Id, TableName, SourceId

    TableFieldMetaData
        FieldId, FieldName, TableMetaDataId

    TableFieldValues
        Id, MetaDataFieldId, Value

Example source table:

    Customer
        CustId, Name, Location
        1, ABC, London

The above table will be stored as below in my three tables:

    TableMetaData
        1, Customer, XYZSource

    TableFieldMetaData
        1, CustId, 1
        2, Name, 1
        3, Location, 1

    TableFieldValues
        1, 1, 1
        2, 2, ABC
        3, 3, London

I want to know whether this is the best approach to store the data or not. I know this single table will have millions of records, and retrieving the data will be very tedious too. Is there any better approach for storing this kind of data? Or should I go with the multiple-tables approach I mentioned initially, where every source has a replica of its tables in my system? What is the industry standard for this?
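For context on the query side of this design (often called entity-attribute-value), a sketch of how one source's data would be read back using the columns described above:

    -- all stored values for the Customer table from XYZSource
    select f.FieldName, v.Value
    from   TableMetaData      t
    join   TableFieldMetaData f on f.TableMetaDataId = t.Id
    join   TableFieldValues   v on v.MetaDataFieldId = f.FieldId
    where  t.TableName = 'Customer'
    and    t.SourceId  = 'XYZSource';

Note that, as described, TableFieldValues has no column identifying which source row a value belongs to, so reassembling multi-row tables would need an extra row-identifier column.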
NewbieDev (21 rep)
Aug 26, 2016, 05:09 PM • Last activity: Jan 18, 2018, 06:27 PM
2 votes
1 answers
4504 views
Register dll to use in SSIS Script Component
I've been asked to register an existing DLL file so that it can be referenced in an SSIS script component. I have a dim and distant memory of doing this using gacutil.exe, so that was my initial go-to. However, gacutil.exe is not present on the integration server, presumably because the full version of Visual Studio is not installed there, only the Data Tools shell. I have also tried to copy the file into C:/windows/assembly, as I read this would work, but nothing happens when I drop the file in; it just cancels out. How can I do this without gacutil.exe? Is there a way, or will I have to install the full VS SDK to do it?
Molenpad (1814 rep)
Nov 16, 2017, 05:53 PM • Last activity: Nov 16, 2017, 06:03 PM
10 votes
3 answers
304 views
Looking for advice on how to integrate data from 100+ client DB's into a centralized reporting database
I am a SQL Developer (not DBA or Architect) for a small (~50 employees) SaaS company. I am tasked with figuring out how to:

1. Offload operational reporting from our 100+ OLTP databases
2. Allow those reports to run against data from multiple client databases
3. Position our company to provide more analytics-based solutions in the future

I have read a number of articles on various technologies like transactional replication (specifically the many-to-one/central subscriber model), SQL Service Broker, log shipping, Change Tracking (CT), and Change Data Capture (CDC, my understanding is this is Enterprise-only), and I am not sure what path is best to pursue. I am hoping some of you with integration expertise may have encountered a setup similar to ours and be able to point me down a successful path or direct me to some resources that would be helpful.

Due to cost constraints, our solution must work within SQL Server Standard Edition. Also, the solution must be reasonable to support/maintain within our small organization.

**Basic configuration:** We currently have 100+ individual client databases, most deployed on SQL servers at our data center, but some deployed on client servers within their data center that we can remote into. These are all SQL Server 2008 R2 databases, but we are planning to upgrade to SQL 2016 soon.

We use database projects and dacpacs to ensure the schema is the same across all client databases that would be integrated. However, since we do not force all clients to upgrade to new versions at the same time, some schema differences are possible between upgrades. The solution must be flexible enough not to break if client A is on software version 1.0 and client B is on version 1.1.

Operational reports are currently run directly from each client's OLTP database. We are concerned about the impact this will have on the application's performance if we do not offload it.

**High-Level Requirements:** Our clients are hospital sterile processing departments (SPD's) who want up-to-the-moment reports on what they've processed so far, where inventory is, etc. SPD's process inventory around the clock, including weekends and holidays. Since one of the main purposes of this effort is to better support operational reporting, we would like the data to be as close to real-time as possible to continue meeting clients' needs.

Currently we have some SPD's in separate databases that are actually part of the same hospital system. These clients want the ability to report against all the SPD's in their system. Strategically speaking, we would like the ability to easily aggregate data across all our clients to support our internal analytics initiatives. Our expectation is that we would be able to use the collected operational data as a source for data marts/warehouses.

**Thoughts so far:**

Transactional replication seems like it would provide the most "real-time" solution. I found this response to be especially helpful, but I am concerned that, with the potential for schema differences, it will not work for us: https://dba.stackexchange.com/questions/43931/sql-server-many-to-one-replication/43995

Log shipping doesn't sound ideal, given that the log cannot restore while queries are active. I either have to kick everyone out so the log can restore, or the data will become stale. I am unclear as to whether this method could be used to centralize data from multiple databases, since each shipped log would only be for the individual database it came from.

Using SQL Service Broker, latency may be unpredictable if a queue were unable to keep up with the number of messages to process.

CT only identifies a version for each table row. Latency would depend on how quickly we could process something like an SSIS package against each database to retrieve the data and insert it into a central repository.

Do we need to consider replicating each database individually and then perhaps use some sort of data virtualization technique to combine data from the various replicated sources?

Any advice or direction you are willing to provide would be greatly appreciated.
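Since Change Tracking is on the short list and is available in Standard Edition, a sketch of the incremental-pull pattern it enables (ClientDb, dbo.Orders, and OrderId are placeholder names):

    -- enable CT on a client database and on each table to be pulled
    use ClientDb;

    alter database ClientDb set change_tracking = on
        (change_retention = 7 days, auto_cleanup = on);

    alter table dbo.Orders enable change_tracking with (track_columns_updated = off);

    -- the central collector persists the version it last pulled per client;
    -- 0 here just means "everything", for illustration
    declare @last_sync_version bigint = 0;

    select ct.SYS_CHANGE_VERSION,
           ct.SYS_CHANGE_OPERATION,      -- I / U / D
           ct.OrderId,
           o.*
    from   changetable(changes dbo.Orders, @last_sync_version) as ct
    left join dbo.Orders as o
           on o.OrderId = ct.OrderId;    -- deleted rows have no match

An SSIS package (or similar) running this per client and merging the results into a central repository is the usual shape of a CT-based collector.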
bperry (101 rep)
Jun 28, 2017, 09:20 PM • Last activity: Sep 1, 2017, 10:17 PM
1 votes
1 answers
60 views
Data integration: name of best practices to avoid corruption caused by systems becoming out of sync?
I have identified an issue in the design of a system, and I am trying to find the right language to describe its cause so I can report it to managers and get the issue fixed. The problem: we have a web service syncing data by copying changes (based on a last-edit datetime column) in a student enrollments table from one system to another. But because the source system deletes rows when a student drops a subject, drops are not being synced. To be able to detect the drops, the system needs to maintain state about what it has copied to the second system so it can recognize when rows have been dropped, or it needs to validate that the tables are in sync. However, I am not sure how to describe this and reference a specific best practice or methodology to back up my argument. Does anyone know the name of the best practices or data integrity rules that would prevent this type of logical error in the design?
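For reference, a sketch of the reconciliation check that detects the missing deletes; the table and column names are placeholders:

    -- rows still present in the target copy that no longer exist in the source:
    -- these are the dropped enrollments the timestamp-based sync never sees
    select t.student_id, t.subject_id
    from   target_enrollments t
    left join source_enrollments s
           on s.student_id = t.student_id
          and s.subject_id = t.subject_id
    where  s.student_id is null;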
user802599 (463 rep)
May 28, 2017, 11:46 PM • Last activity: Jun 5, 2017, 04:09 AM
5 votes
3 answers
1863 views
Can PostgreSQL support integration test with some kind of throwaway overlay?
It's a common problem to write integration tests that include a database. If a test changes the database, it could affect other tests or the next run of itself. I know that I could wrap my test in a transaction and roll the transaction back after the test run. But it would be very nice if PostgreSQL could provide some kind of global snapshotting or throwaway overlay. Ideally, such a feature would cover all state of the database, including schemas and stored procedures.
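One built-in mechanism that comes close is template databases: build the fixture database once, clone it cheaply per test run, and drop the clone afterwards. A sketch with hypothetical database names:

    -- build the pristine state once (schema, stored procedures, fixture data)
    CREATE DATABASE app_test_template;

    -- per test run: clone it (no other sessions may be connected to the template)
    CREATE DATABASE app_test_run_42 TEMPLATE app_test_template;

    -- after the run: throw the overlay away
    DROP DATABASE app_test_run_42;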
Thomas Koch (151 rep)
Nov 12, 2013, 06:13 PM • Last activity: Dec 9, 2016, 10:45 AM
1 votes
0 answers
60 views
Efficient methods/tools for managing a data warehouse pulling data from multiple database applications
Until now I have used an on-premises SQL Server 2012 instance and performed manual ETL operations to create/manage a data warehouse that contains data from multiple database applications from different organizations. I get the source data (several dozen CSV/TXT files, some of them 50 MB in size) every month using two methods:

1. Executing SQL queries on the source databases (for the client-server applications)
2. Using a web-based reporting tool (for the web-based applications)

The destinations are databases in my SQL Server-based data warehouse, which I manage using SSMS/SSIS. This manual process easily takes several hours each month to update the data warehouse. I now want to update the data on a more frequent basis (possibly daily) for better analytics/reporting. I suppose it would involve gaining direct access to the source database applications? I need some high-level information on the methods/tools out there to accomplish this and make the process efficient/automated.
Sid (11 rep)
Oct 29, 2016, 11:01 PM • Last activity: Oct 30, 2016, 11:43 AM
17 votes
4 answers
77951 views
Not able to create SSISDB catalog
Getting the error below while trying to create a catalog in SQL Server 2014 Integration Services. Any idea what I missed in the installation or anywhere else?

> The catalog backup file 'C:\Program Files\Microsoft SQL Server\120\DTS\Binn\SSISDBBackup.bak' could not be accessed. Make sure the database file exist, and the SQL Server Service account is able to access it (Microsoft.SqlServer.IntegrationServices.Common.ObjectModel)
Radhi (323 rep)
Sep 22, 2014, 10:23 AM • Last activity: Jun 15, 2016, 07:54 AM
1 votes
1 answers
926 views
decrypt data in column before load into destination
I have a table named city which contains two columns:

- city_id (int)
- city_name (varbinary) -- encrypted column

I want to **extract** data from this table, **transform** it (decrypt the city_name), and **load** it (decrypted city_name) into a new table (the destination). I used the following query in an OLE DB source in SSIS, but it returns the city_name column as NULL:

    select city_id,
           CONVERT(nvarchar(50), decryptbykeyautocert(cert_id('Usercert'), NULL, city_name)) as city_name
    from   city

The above query works fine in the SQL Server Management Studio query editor. Please advise how to decrypt the data before inserting it into the destination.

Regards,
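A quick diagnostic sketch for this symptom: run the same lookup under the security context the SSIS connection manager actually uses, since a certificate that isn't visible there (wrong database or missing permissions) makes DECRYPTBYKEYAUTOCERT return NULL silently; 'ssis_user' is a placeholder:

    -- in the database the OLE DB connection manager points at
    execute as user = 'ssis_user';
    select cert_id('Usercert') as cert_id_seen;   -- NULL means the certificate is not accessible here
    revert;

    -- if the certificate is the issue, granting access to it is one possible fix
    -- grant control on certificate::Usercert to ssis_user;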
user1543848 (75 rep)
Mar 3, 2016, 09:21 AM • Last activity: Mar 3, 2016, 04:13 PM