Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0
votes
2
answers
95
views
Cannot create MATERIALIZED VIEW with REFRESH ON COMMIT on query with multiple joins
In Oracle, I have the following tables (ugly pseudo-SQL below):

```
CLIENT     (CLIENT_ID NUMBER NOT NULL /*PK*/, GROUP_ID NUMBER NOT NULL)
GROUP      (GROUP_ID NUMBER /*PK*/, GROUP_NAME VARCHAR2)
GROUP_DATA (GROUP_ID NUMBER /*PK*/, COUNTRY_ID VARCHAR2)
```

and I am trying to create the following materialized view:

```
CREATE MATERIALIZED VIEW MV_CLIENT
REFRESH COMPLETE ON COMMIT
AS SELECT CLIENT.CLIENT_ID,
          GROUP.GROUP_NAME,
          GROUP_DATA.COUNTRY_ID
   FROM CLIENT
   INNER JOIN GROUP ON GROUP.GROUP_ID = CLIENT.GROUP_ID
   INNER JOIN GROUP_DATA ON GROUP_DATA.GROUP_ID = CLIENT.GROUP_ID
```
I am getting the following error:

```
ORA-12054: cannot set the ON COMMIT refresh attribute for the materialized view
```
I do not understand the issue as GROUP_ID is guaranteed to exist only once in each of the GROUP and GROUP_DATA tables. I feel like tracking the updates (especially for refresh COMPLETE) should not be a problem.
Please note that creating the materialized view on CLIENT+GROUP or CLIENT+GROUP_DATA works but CLIENT+GROUP+GROUP_DATA does not. I have tried to create the MV logs and changed the COMPLETE to FAST but the error stays the same.
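For reference, a minimal sketch of what that fast-refresh attempt looks like (assuming, as Oracle requires for fast-refresh join MVs, WITH ROWID logs on every table and each table's ROWID in the select list):

```
-- Sketch of the FAST ON COMMIT variant; table names are the pseudo-SQL
-- ones above (a real table could not be named GROUP without quoting).
CREATE MATERIALIZED VIEW LOG ON CLIENT WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON GROUP WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON GROUP_DATA WITH ROWID;

CREATE MATERIALIZED VIEW MV_CLIENT
REFRESH FAST ON COMMIT
AS SELECT CLIENT.ROWID     AS client_rid,
          GROUP.ROWID      AS group_rid,
          GROUP_DATA.ROWID AS group_data_rid,
          CLIENT.CLIENT_ID,
          GROUP.GROUP_NAME,
          GROUP_DATA.COUNTRY_ID
   FROM CLIENT
   INNER JOIN GROUP ON GROUP.GROUP_ID = CLIENT.GROUP_ID
   INNER JOIN GROUP_DATA ON GROUP_DATA.GROUP_ID = CLIENT.GROUP_ID;
```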
Now I know that a solution would be to merge GROUP and GROUP_DATA into a single table, but that is not the solution I need. So my question is: is this an Oracle limitation, or am I missing something? And if it is a limitation, does anyone know the rationale for it?
Thanks!
vpi
(21 rep)
May 5, 2025, 11:56 AM
• Last activity: Aug 1, 2025, 06:28 PM
1
votes
2
answers
155
views
Efficient approach for filtering, paging and sorting across multiple tables with multiple key columns
I have a query that retrieves data from a complex data structure involving multiple tables. The data can be filtered using a tree component that allows selection of different keys from different tables, sometimes filtering across two or three tables at once (for example filtering on OrganisationID and the LocationID within this organisation).
In addition to filtering, the data must be paged and sorted. I'm using a table view in the UI that allows sorting by a single selected column (only one ORDER BY column at a time).
The filtering targets specific columns from different tables. For example:
- OrganisationID (from the Organisation table)
- LocationID (from the Location table)
- FolderID (from the Folder table)
Sometimes the filter must apply across multiple of these keys simultaneously, for example filtering all related OrganisationIDs and LocationIDs and retrieving a paged result set.
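To make this concrete, a sketch of one such filter/sort/page combination (hypothetical table, column, and parameter names), using SQL Server's OFFSET/FETCH paging:

```
-- Sketch: one of many filter/sort/page permutations (hypothetical schema).
SELECT d.DocumentID, d.Title, d.CreatedAt
FROM dbo.Document AS d
JOIN dbo.Folder   AS f ON f.FolderID   = d.FolderID
JOIN dbo.Location AS l ON l.LocationID = f.LocationID
WHERE l.OrganisationID = @OrganisationID
  AND l.LocationID IN (SELECT LocationID FROM @SelectedLocations)
ORDER BY d.Title                         -- any one of ~26 sortable columns
OFFSET @PageSize * (@PageNumber - 1) ROWS
FETCH NEXT @PageSize ROWS ONLY;
```

Without an index whose key order matches both the filter and the chosen sort column, the ORDER BY ... OFFSET forces a sort of the entire filtered set before a page can be returned.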
**Current approach:**
I considered creating multiple indexed views based on these key columns, each with many indexes to support the different sorting options. This would lead to about 6 views, each with around 26 indexes. Since indexes are pre-sorted, this does provide the ability to filter and sort quickly. But having such a large number of indexes on single tables, let alone on an indexed view spanning multiple tables, seems very contrary to index best practice.
**Question:**
What are efficient strategies to implement filtering, paging, and sorting across multiple fixed keys from different tables at once?
Here is a dbfiddle which illustrates what I am trying to describe.
It is a bit simplified, but it catches enough of the gist of what I am trying to accomplish.
**Additional Information:**
Filtering, sorting, and paging are all currently implemented. However, due to the nature of filtering and sorting across different tables, any sort operation that lacks an appropriate index severely impacts paging. Without an index, SQL Server must scan the entire filtered set to apply sorting and paging. For some filter combinations, this results in scanning over 500,000 rows.
Although most individual queries run reasonably fast, the system executes them frequently and concurrently for many users. This leads to significant memory grants, resulting in high memory pressure and a drastic drop in the Page Life Expectancy (PLE), affecting overall performance.
For the worst-case scenario I can find a query costing 166,233,968 logical reads.
Sotem
(19 rep)
Jun 2, 2025, 11:53 AM
• Last activity: Jul 16, 2025, 07:51 AM
0
votes
1
answers
233
views
Sharding a Materialized View
I have a Postgres 15.3 AWS RDS instance and a database that centers around one table, called `posts`, that has 20 other relations to build the full object. The full query takes about a second altogether, and there are indexes on all the join keys. So in order to make reads fast we have a materialized view, `mv_posts`, that represents the full table with ~20 joins and subselects and everything the front end needs (see below).
Here is the math that scares me: I have about 5000 rows and building the materialized view takes 2050 ms. This is not scalable. I am considering the following implementation.
Create a new materialized view `mv_new_posts` that keeps new posts created that day, join it 1-1 to the `mv_posts` materialized view in a normal view, and use that to query the posts object. I would then have to truncate `mv_new_posts` and regenerate `mv_posts` at midnight every night. This feels ugly but would scale for everything except when a `posts` row is updated, which would require regenerating the base view in some form.
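A minimal sketch of that combining view (hypothetical name; I am reading the 1-1 combination as a UNION ALL of two MVs with identical column lists):

```
-- Sketch: nightly full MV plus a small same-day MV, unioned so readers
-- see one relation. Assumes a post appears in exactly one of the two.
CREATE VIEW v_posts_combined AS
SELECT * FROM mv_posts
UNION ALL
SELECT * FROM mv_new_posts;
```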
I have considered the following other solutions:
- Partition Tables: The issue is that I don't have the row volume for this. I have not benchmarked it, but I do not believe that with my small number of rows it would work.
- `pg_ivm`, but the restrictions on outer (i.e. left) joins and aggregation functions make this impossible.
- Rolling my own, i.e. creating a table and triggers so that a single insert/update on a dependent table results in a single row update in the denormalized table that would replace the materialized view. This I have done on a smaller scale but do not feel would be worth the work here. In particular it would be a maintenance nightmare.
The materialized view definition is below, in case there is an obvious optimization I have missed.
```
CREATE MATERIALIZED VIEW mv_posts AS(
SELECT
p.*,
(
setweight(to_tsvector('english', COALESCE(p.title, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(p.description, '')), 'B') ||
setweight(to_tsvector('english', COALESCE(pc.email, '')), 'A') ||
setweight(to_tsvector('english', COALESCE(d.name, '')), 'B')
) AS search_document,
CASE WHEN pt.slug = 'ping' THEN True ELSE False END AS ping,
ARRAY_REMOVE(vis.all_user_ids, NULL) AS all_visibility_user_ids,
ARRAY_REMOVE(vis.user_ids, NULL) AS visibility_user_ids,
ARRAY_REMOVE(vis.group_ids, NULL) AS visibility_group_ids,
ARRAY_REMOVE(vis.org_ids, NULL) AS visibility_org_ids,
CASE WHEN hm.id IS NULL THEN NULL ELSE ROW_TO_JSON(hm.*) END AS headline_file,
COALESCE(JSON_AGG(DISTINCT JSON_BUILD_OBJECT(
'id', pm.id,
'file_name', pm.file_name,
'content_type', pm.content_type,
'name', pm.name,
'orphan', pm.orphan,
'folder', pm.folder,
'extension', pm.extension,
'ordinal', pmj.ordinal)::JSONB
) FILTER (
WHERE NOT (pm.id::TEXT = 'null') ), '[]'::JSON
) AS file,
COALESCE(JSON_AGG(DISTINCT pmj.file_id) FILTER ( WHERE NOT (pmj.file_id IS NULL) ), '[]'::JSON) AS file_ids,
COALESCE(JSON_AGG(DISTINCT pu) FILTER ( WHERE NOT (pu.id::TEXT = 'null') ), '[]'::JSON) AS users,
COALESCE(JSON_AGG(DISTINCT pu.email) FILTER ( WHERE NOT (pu.id::TEXT = 'null') ), '[]'::JSON) AS user_emails,
pt.id AS post_type_id,
row_to_json(pt.*) AS post_type,
row_to_json(pit.*) AS incident_type,
row_to_json(pc.*) AS creator,
d.id AS org_id,
JSON_BUILD_OBJECT(
'id', d.id,
'name', d.name,
'admin_id', d.admin_id,'group_id', d.group_id,
'creator_id', d.creator_id,
'law_enforcement_partner', d.law_enforcement_partner,
'slug', d.slug,
'state', d.state,
'location', JSON_BUILD_OBJECT(
'id', dl.id,
'address', dl.address,
'city', dl.city,
'state_province', dl.state_province,
'postal_code', dl.postal_code,
'geojson', ST_AsGeoJSON(dl.geom)::JSON
)
) AS org,
ARRAY_REMOVE(ARRAY_AGG(pp.user_id),NULL) AS users_saved,
COALESCE(
JSON_AGG(
DISTINCT
JSON_BUILD_OBJECT(
'id', v.id,
'make', v.make,
'year', v.year,
'state', v.state,
'model', v.model,
'color', v.color,
'license_plate', v.license_plate,
'vin', v.vin,
'description', v.description,
'file', (
SELECT COALESCE(JSON_AGG(vm.*) FILTER ( WHERE NOT (vm.id::TEXT = 'null') ), '[]'::JSON)
FROM file vm
JOIN car_file_join vmj ON vmj.file_id = vm.id
WHERE vmj.car_id=v.id
)
)::JSONB
) FILTER ( WHERE NOT (v.id::TEXT = 'null') ), '[]'::JSON)
AS cars,COALESCE(
JSON_AGG(
DISTINCT
JSON_BUILD_OBJECT(
'id', psn.id,
'first_name', psn.first_name,
'last_name', psn.last_name,
'aliases', psn.aliases,
'description', psn.description,
'hair', psn.hair,
'build', psn.build,
'sex_id', psn.sex_id,
'age', psn.age,
'eye_color', psn.eye_color,
'height', psn.height,
'weight', psn.weight,
'race', psn.race,
'identifying_characteristics', psn.identifying_characteristics,
'country', psn.country,
'guy_type', row_to_json(psnt.*),
'location', row_to_json(psnl.*),
'file', (
SELECT COALESCE(JSON_AGG(psnm.*) FILTER ( WHERE NOT (psnm.id::TEXT = 'null') ), '[]'::JSON)
FROM file psnm
JOIN guy_file_join psnmj ON psnmj.file_id = psnm.id
WHERE psnmj.guy_id=psn.id
)
)::JSONB
) FILTER ( WHERE NOT (psn.id::TEXT = 'null') ), '[]'::JSON)
AS guys,COALESCE(
JSON_AGG(
DISTINCT
JSON_BUILD_OBJECT(
'id', l.id,
'address', l.address,
'city', l.city,
'state_province', l.state_province,
'postal_code', l.postal_code,
'geojson', ST_AsGeoJSON(l.geom)::JSON,
'location_string', l.location_string
)::JSONB
) FILTER ( WHERE NOT (l.id::TEXT = 'null') ), '[]'::JSON) AS locations,
COALESCE(
ARRAY_AGG(DISTINCT l.location_string) FILTER ( WHERE l.location_string IS NOT NULL ),
'{}'::TEXT[]
)AS location_strings,
ST_UNION( l.geom )::geography AS geom
FROM posts p
INNER JOIN visibility vis ON vis.post_id = p.id
LEFT JOIN file hm ON hm.id = p.headline_file_id
LEFT JOIN post_file_join pmj ON pmj.post_id = p.id
LEFT JOIN file pm ON pm.id=pmj.file_id
LEFT JOIN auth_users AS pc ON p.creator_id=pc.id
LEFT JOIN post_user_join puj ON puj.post_id=p.id
LEFT JOIN auth_users pu ON pu.id=puj.user_id
LEFT JOIN types pt ON pt.id=p.type_id
LEFT JOIN incident_types pit ON pit.id = p.incident_type_id
LEFT JOIN post_location_join plj ON plj.post_id = p.id
LEFT JOIN locations l ON plj.location_id = l.id
LEFT JOIN post_guy_join ppj ON ppj.post_id = p.id
LEFT JOIN guys psn ON psn.id = ppj.guy_id
LEFT JOIN guy_types psnt ON psnt.id = ppj.guy_type_id
LEFT JOIN locations psnl ON psnl.id= psn.current_location_id
LEFT JOIN post_car_join pvj ON pvj.post_id=p.id
LEFT JOIN cars v ON v.id = pvj.car_id
LEFT JOIN pinned_posts pp ON pp.post_id = p.id
LEFT JOIN orgs d ON pc.org_id = d.id
LEFT JOIN locations dl ON d.primary_location_id = dl.id
WHERE p.active
GROUP BY
p.id,
vis.all_user_ids,
vis.user_ids,
vis.org_ids,
vis.group_ids,
hm.id,
pc.id,
pt.id,
pit.id,
d.id,
dl.id);
```
Frank Conry
(123 rep)
Jan 28, 2024, 01:07 PM
• Last activity: Jun 12, 2025, 04:09 AM
1
votes
1
answers
233
views
Time based incremental load query optimization and index design
**Problem**
I am trying to optimize an extract/load query on a highly normalized table structure. The extract/load implements incremental loading, using a datetime column to pick out only the records which have changed since the last extract/load. After its joins, the extract/load query selects hundreds of millions of records. Each individual table joined has a date column "last_modified" which indicates when a record was last modified. To determine the combined last-modified date of a joined record, I select the greatest value of "last_modified" across the individual tables.
This solution seems not to scale: with hundreds of millions of records, the filter predicate on the combined "last_modified" does not use any index, requires full table access on all the individual tables, and takes too long to complete (1h+) just to filter on the combined "last_modified" information.
**Background**
- Oracle database (multiple versions starting from 11g)
- I'm not a DBA, nor do I have much Oracle know-how
- The extract/load process (especially delta loading) is not time critical. If I can guarantee that data is up to date/extracted every 24h, it's fine.
**Sample**
The following is a simplified sample setup. Assume there is an index on reference_key.

```
CREATE TABLE shipment (
    key VARCHAR2(10) NOT NULL,
    last_modified DATE, -- represents when a record in shipment was last modified
    CONSTRAINT shipment_pk PRIMARY KEY (key)
);
CREATE TABLE invoice (
    key VARCHAR2(10) NOT NULL,
    reference_key VARCHAR2(10),
    last_modified DATE, -- represents when a record in invoice was last modified
    CONSTRAINT invoice_pk PRIMARY KEY (key),
    CONSTRAINT invoice_fk_shipment FOREIGN KEY (reference_key) REFERENCES shipment(key)
);
CREATE TABLE line (
    key VARCHAR2(10) NOT NULL,
    reference_key VARCHAR2(10),
    last_modified DATE, -- represents when a record in line was last modified
    CONSTRAINT line_pk PRIMARY KEY (key),
    CONSTRAINT line_fk_invoice FOREIGN KEY (reference_key) REFERENCES invoice(key)
);
```
To get flat, de-normalized records, a join like the following is needed:

```
SELECT
    shipment.key || '-' || line.key || '-' || invoice.key AS key,
    GREATEST(line.last_modified, invoice.last_modified, shipment.last_modified) AS last_modified
FROM
    line
    JOIN invoice ON line.reference_key = invoice.key
    JOIN shipment ON invoice.reference_key = shipment.key;
```
To extract/load the next 100000 records since a given date, I could run the following.

```
SELECT *
FROM (
    SELECT
        shipment.key || '-' || line.key || '-' || invoice.key AS key,
        GREATEST(line.last_modified, invoice.last_modified, shipment.last_modified) AS last_modified
    FROM
        line
        JOIN invoice ON line.reference_key = invoice.key
        JOIN shipment ON invoice.reference_key = shipment.key)
WHERE last_modified > '${target_last_modified}'
ORDER BY last_modified
FETCH NEXT 100000 ROWS ONLY;
```

The problem: this query does not finish / is not feasible with hundreds of millions of records (which can even go up to 1-2 billion records).
**What I have tried**
- Indexes on the individual last_modified columns seem not to help, as they are apparently not considered when using GREATEST, and the plan still does a full table access scan.
- A functional/computed column (with an index) seems also not to be an option, as it would need to incorporate multiple tables, which does not appear to be possible in Oracle.
- Materialized views to preselect the combined last_modified column and create an index on it (a sketch follows below). If possible I would like to avoid those: I think I could not use a refresh-on-commit materialized view due to the high transaction volume on the tables involved.
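For illustration, a minimal sketch of that materialized-view direction (my notation, not a tested solution): the MV pre-computes the combined last_modified so that an ordinary index can serve the range filter.

```
-- Sketch: pre-compute the combined last_modified, then index it.
-- The refresh strategy is exactly the open question below.
CREATE MATERIALIZED VIEW mv_combined_lm
REFRESH COMPLETE ON DEMAND
AS
SELECT
    shipment.key || '-' || line.key || '-' || invoice.key AS key,
    GREATEST(line.last_modified, invoice.last_modified, shipment.last_modified) AS last_modified
FROM
    line
    JOIN invoice ON line.reference_key = invoice.key
    JOIN shipment ON invoice.reference_key = shipment.key;

CREATE INDEX mv_combined_lm_idx ON mv_combined_lm (last_modified);
```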
**Questions**
- In this scenario, are there any tricks using indexes which could speed up my query on the last_modified field?
- If materialized views are my only option, what kinds of refresh strategies would be feasible? A full refresh would (very likely) take too long. Are there any incremental refresh strategies, and if yes, would I not just move my extract/load problem from the query to the materialized view?
Thanks for any input/hint given!
Clemens K
(11 rep)
Mar 8, 2023, 02:46 PM
• Last activity: Jun 4, 2025, 02:01 PM
2
votes
1
answers
1096
views
Implications of many materialized views in Postgres?
I'm working on a system that includes a scheduling component, which has support for recurring events. After reading many, many posts on storing recurring events, it was suggested that they be calculated for a specific time period using a Postgres function and then stored in a materialized view to avoid having to recalculate every time.
The system I'm working on serves multiple businesses, each with their own calendar and customers. My thought was that a materialized view would be created on a per-business basis. However, if the service is successful, there could be well over 1,000 businesses using the system. That said, are there issues with having that many materialized views? And if so, is there a better pattern that would scale as the service grows?
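To make the pattern concrete, a sketch of one per-business view (hypothetical names; `expand_recurring_events` stands in for the set-returning function mentioned above):

```
-- Sketch: one materialized view per business. With 1,000+ businesses this
-- means 1,000+ such objects in the catalog (hypothetical function/schema).
CREATE MATERIALIZED VIEW calendar_business_42 AS
SELECT *
FROM expand_recurring_events(
       42,                          -- business id
       now(),
       now() + interval '90 days'  -- horizon to precompute
     );
```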
FYI: I'm running Postgres 9.4
jdixon04
(219 rep)
Oct 21, 2015, 05:26 AM
• Last activity: Jun 1, 2025, 01:04 AM
1
votes
1
answers
268
views
Insert into indexed view?
I give an example to show my problem.
I create 2 tables as follows:

```
CREATE TABLE a
(
    id INT
)
CREATE TABLE b
(
    name NVARCHAR(10),
    id INT
)
```
Then insert data into these tables:

```
INSERT INTO a VALUES(1),(2)
INSERT INTO b VALUES('Kan',1),('Michael',2)
```
Next, I create an indexed view that joins these tables via id (schema binding requires two-part table names):

```
CREATE VIEW a_b
WITH SCHEMABINDING
AS
    SELECT a.id, b.name
    FROM dbo.a AS a
    INNER JOIN dbo.b AS b ON a.id = b.id

CREATE UNIQUE CLUSTERED INDEX ix_a_b
ON a_b(id)
```
But `INSERT INTO a_b VALUES (3,'Joe')` fails with an error.
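For contrast, a sketch of what does work: modify the base tables, and SQL Server maintains the view's clustered index automatically (NOEXPAND is only needed to force use of the view's index on non-Enterprise editions):

```
-- Sketch: the supported way to change an indexed view's contents.
INSERT INTO a VALUES (3);
INSERT INTO b VALUES ('Joe', 3);

SELECT id, name
FROM a_b WITH (NOEXPAND);   -- read through the view's own index
```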
What I know about views:
- views do not store data, they are just saved queries
- but an indexed view stores data physically, like a table in the database. So why can't I insert into or delete from a_b?
Is what I know about views right or wrong? Help me improve.
Dunguyen
(51 rep)
Apr 9, 2020, 02:11 PM
• Last activity: May 21, 2025, 09:09 AM
2
votes
1
answers
271
views
Oracle multithreaded JDBC bulk select high-performance data export
The goal of our solution is to export CLOB data from an events table as fast as possible. The concept is to logically partition the data with a partition_id column such that every thread of the external application (e.g. Spring Boot) does not interfere with the other threads.
Approaches like CDC with streaming APIs have been discussed and cannot be used in our case. Our goal is to use "basic" Oracle 12c/19c features that are highly optimized, like the cost-based optimizer. There is a solution in place with Advanced Queuing which is by far too slow and, for internal reasons, cannot be followed up.
We have discussed the following options.
Option 1, 'Mark events as processed in the events table'
--------------------------------------------------------
The external thread of the application (e.g. Spring Boot) ...
1. registers via the partitions table for a dedicated partition (e.g. partition_id=10),
2. bulk selects a certain amount of data (e.g. limited by rownum),
3. bulk updates the selected events as processed (ack),
4. de-registers the execution of the thread in the partitions table for another thread.
```
create table events(
    id number(15),
    status_id number(2), -- whether the event was successfully uploaded
    insert_timestamp timestamp,
    partition_id number(9), --> partitions
    event clob -- the event payload to export
);
create table partitions
(
    id number(15),
    is_running varchar2(1), -- when a thread of the external application reserved a partition for an execution, a flag is set e.g. '+'
    last_execution timestamp -- when an execution of an export was done, the current timestamp is set
);
create index msg_idx on events (partition_id, status_id, id);
```
Question Option 1: As there will be multiple threads inserting new data, maybe 10-50 threads bulk selecting data, and as many threads bulk updating the same table - however, always different rows!! - is there any risk that Oracle cannot handle the processing efficiently and maybe e.g. locks the whole table etc.? What would be important to consider when using this approach?
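On the reservation step, a sketch of the classic Oracle work-queue pattern (my illustration, assuming is_running is NULL for a free partition): SELECT ... FOR UPDATE SKIP LOCKED lets concurrent threads skip each other's locked rows instead of waiting, so nothing escalates to a table lock.

```
-- Sketch: reserve one free partition without blocking other threads.
DECLARE
    CURSOR c_free IS
        SELECT id
        FROM partitions
        WHERE is_running IS NULL
        FOR UPDATE SKIP LOCKED;
    v_id partitions.id%TYPE;
BEGIN
    OPEN c_free;
    FETCH c_free INTO v_id;          -- first unlocked free partition, if any
    IF c_free%FOUND THEN
        UPDATE partitions
        SET is_running = '+'
        WHERE CURRENT OF c_free;
    END IF;
    CLOSE c_free;
    COMMIT;                          -- publish the reservation flag
END;
/
```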
Option 2, 'Save the highest processed commit sequence id ORA_ROWSCN in a partitions table as offset'
--------------------------------------------------------------------------------------------------------
This option would make use of the ROWDEPENDENCIES feature. As the insert/commit sequence is known as a number, it can be used as the "last processed" offset. The processing is similar for the registration of a thread. The selection on the events table would start from the last offset ORA_ROWSCN (like Apache Kafka consumers).
```
create table events
(
    id number(15),
    insert_timestamp timestamp,
    partition_id number(9), --> partitions
    event clob -- the event payload to export
) ROWDEPENDENCIES;
create table partitions
(
    id number(15),
    is_running varchar2(1), -- when a thread of the external application reserved a partition for an execution, a flag is set e.g. '+'
    last_offset number(15), -- ORA_ROWSCN of the last execution
    last_execution timestamp -- when an execution of an export was done, the current timestamp is set
);
```
Question Option 2: The advantage of this option would be that events are only inserted and selected, never updated. However, as ORA_ROWSCN cannot be indexed, we would create an appropriate table with a materialized view.
- How should the refresh of the materialized view ideally be configured for such cases?
- What kinds of risks come with these solutions in terms of hard-to-reproduce production issues?
Option 3, 'Save all processed event ids in a separate table'
------------------------------------------------------------
The partition registration is the same as with the other options. The difference from option 1 is that the events that were processed are not flagged with a status in the events table, but inserted into a dedicated table.
The selection would join the events and processed_events tables to detect the events still to be processed, either with 'join', 'not in', etc.
```
create table events
(
    id number(15),
    insert_timestamp timestamp,
    partition_id number(9), --> partitions
    event clob -- the event payload to export
);
create table processed_events(
    id number(15) -- the events that were processed
    ,...
);
create table partitions
(
    id number(15),
    is_running varchar2(1), -- when a thread of the external application reserved a partition for an execution, a flag is set e.g. '+'
    last_execution timestamp -- when an execution of an export was done, the current timestamp is set
);
```
Question Option 3: Are there any known performance aspects that should be considered?
For sure we will try the variants. However, the behavior on test systems with few CPUs might be quite different than on an instance with more than 100 CPUs etc. Therefore, we are very grateful for feedback of any kind.
Thanks for reading :)
d.sndrs
(21 rep)
Oct 1, 2021, 03:17 PM
• Last activity: May 19, 2025, 12:02 AM
3
votes
1
answers
281
views
Hiding an index on a view to other queries in SQL Server
I have a View that has an Index that I'd like to use for known queries. The issue is that the existence of this Indexed View causes other queries to perform poorly. Is there a way to hide the Index or View so that it is only used when specifically included in a hint, or the like?
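For illustration, a sketch of the two hint directions I'm aware of (hypothetical object names; the catch with the second is that it must be added to every affected query): NOEXPAND opts a query into the indexed view, while EXPAND VIEWS tells the optimizer not to match indexed views for that query.

```
-- Sketch: opt in explicitly where the indexed view helps ...
SELECT SomeColumn
FROM dbo.MyIndexedView WITH (NOEXPAND);

-- ... and opt out in queries that the view's existence hurts.
SELECT SomeColumn
FROM dbo.SomeTable
OPTION (EXPAND VIEWS);
```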
CWMjr
(63 rep)
Mar 6, 2025, 08:24 PM
• Last activity: May 16, 2025, 03:12 AM
1
votes
2
answers
345
views
Creating a clustered index on a view, minimising contention
I have a table of 100 billion rows with an identity bigint column that is the clustered primary key.
I have a schema bound view on that table that is filtered down to the last 500 million rows or so. I want to create a clustered index on the view that incorporates the primary key and a few other columns.
The table is highly transactional, so can't really be locked for more than a minute at a time.
Will the creation of the index on the view cause contention directly on the underlying table during the creation?
If so, is it possible for me to create the index on my schema bound view in an incremental manner (i.e. somehow pause part of the way through creation, to allow the table to catch up on its transaction backlog, then resume creation).
I have to be careful with overall server contention as well; I've seen heavy operations against one table like this slow up my whole server before.
Essentially, I'm looking for the most efficient way to create the index on my view to minimize contention.
I'm using an indexed view because I'm using the `HASHBYTES` function to generate a hash of the row, then creating an index on that hash and the primary key of the table. This allows me to quickly compare 500 million rows of data with another large dataset for any changes in data. (I realize I probably could do this in a computed column on the table itself as well, but I figured there may be less contention if the data is a materialized copy as an indexed view.)
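A sketch of the shape of that view (hypothetical column names; concatenating with explicit CONVERTs is one way to keep the hash input deterministic, which indexed views require):

```
-- Sketch of the schema-bound view feeding the hash comparison
-- (hypothetical table/columns; the Id threshold stands in for
-- "the last 500 million rows or so").
CREATE VIEW dbo.RecentRowsHashed
WITH SCHEMABINDING
AS
SELECT Id,
       HASHBYTES('SHA2_256',
                 CONCAT(Col1, '|', Col2, '|',
                        CONVERT(varchar(30), Col3))) AS RowHash
FROM dbo.BigTable
WHERE Id > 99500000000;

-- The clustered index that materializes it:
-- CREATE UNIQUE CLUSTERED INDEX CIX_RecentRowsHashed
--     ON dbo.RecentRowsHashed (RowHash, Id);
```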
The table is not partitioned.
J.D.
(40893 rep)
Dec 16, 2019, 08:26 PM
• Last activity: May 8, 2025, 08:25 AM
0
votes
2
answers
707
views
Postgresql MVIEW refresh from Oracle Materialized View Log
Is it possible for me to have an Oracle MVIEW log in my source database and refresh a PostgreSQL MVIEW using this MVIEW log? I know I can query the MVIEW log; I have done that. But since it is not good practice to query a log, is it even possible for PostgreSQL to use an MVIEW log just like an Oracle MVIEW uses a log?
If that's not possible, is a dblink from Postgresql to Oracle possible?
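On the second question, a sketch of the foreign-data-wrapper route (assuming the oracle_fdw extension can be installed on the Postgres side; names and connection details are hypothetical):

```
-- Sketch: expose the Oracle table in Postgres via oracle_fdw, then build
-- the Postgres materialized view on top of the foreign table.
CREATE EXTENSION oracle_fdw;

CREATE SERVER ora_src FOREIGN DATA WRAPPER oracle_fdw
    OPTIONS (dbserver '//ora-host:1521/ORCLPDB1');

CREATE USER MAPPING FOR CURRENT_USER SERVER ora_src
    OPTIONS (user 'SRC_SCHEMA', password 'secret');

CREATE FOREIGN TABLE src_orders (
    order_id   numeric,
    updated_at timestamp
) SERVER ora_src OPTIONS (schema 'SRC_SCHEMA', table 'ORDERS');

CREATE MATERIALIZED VIEW orders_mv AS SELECT * FROM src_orders;
REFRESH MATERIALIZED VIEW orders_mv;  -- still a full refresh, not log-based
```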
MacJava
(1 rep)
Apr 23, 2020, 01:33 PM
• Last activity: Apr 19, 2025, 06:04 AM
1
votes
2
answers
488
views
Oracle MV requires object type to be defined as FINAL?
I want to create an Oracle 18c **materialized view (MV)** on a table that has a user-defined datatype called ST_GEOMETRY:
- The MV would do a COMPLETE refresh on a schedule — on a local table.
------------
**Some info about ST_GEOMETRY:**
> The ST_Geometry data type implements the SQL 3 specification of
> user-defined data types (UDTs), allowing you to create columns capable
> of storing spatial data such as the location of a landmark, a street,
> or a parcel of land. It provides International Organization for
> Standards (ISO) and Open Geospatial Consortium, Inc. (OGC) compliant
> structured query language (SQL) access to the geodatabase and
> database. This storage extends the capabilities of the database by
> providing storage for objects (points, lines, and polygons) that
> represent geographic features. It was designed to make efficient use
> of database resources; to be compatible with database features such as
> replication and partitioning; and to provide rapid access to spatial
> data.
>
> ST_Geometry itself is an abstract, noninstantiated superclass.
> However, its subclasses can be instantiated. An instantiated data type
> is one that can be defined as a table column and have values of its
> type inserted into it.
The SQL definition of the user-defined datatype can be found here: dbfiddle.

```
CREATE OR REPLACE TYPE SDE."ST_GEOMETRY" (
    ...
) NOT final
```
*(Although, I'm guessing most of the logic is stored in the EXTPROC.)*
-----------------
**MV object types must be FINAL:**
There is a known issue when trying to create an MV on the ST_GEOMETRY user-defined datatype (ORA-30373):
> Creating an Oracle materialized view for a table containing an
> ST_Geometry attribute returns the following error:
>
> "ORA-30373: object data types are not supported in this context".
>
> Code:
> SQL> CREATE MATERIALIZED VIEW parcel_view
> 2 AS SELECT * FROM parcel@remote_server;
>
> CREATE MATERIALIZED VIEW parcel_view
> *
> ERROR at line 1:
> ORA-30373: object data types are not supported in this context
>
> Cause:
>
> Oracle's Advanced Replication functionality requires that **all object types be defined as FINAL** to participate within a materialized view.
> The reason that the ST_Geometry cannot be defined as FINAL is because
> the type contains subtypes used for type inheritance. Oracle does not
> allow a type with subtypes to be defined as FINAL.
>
> Esri is currently working with Oracle to address this issue and
> limitation. The following Oracle TAR file is available for reference:
> "6482996.992 - ORA-30373 MATERIALIZE VIEW UNABLE TO REPLICATE A TYPE
> WHICH CONTAINS SUBTYPES. Enhancement Request (ER) 6370112".
*Note: I've contacted ESRI support and they say there hasn't been any progress with Oracle regarding the enhancement request.*
-------------
**Question:**
Is there a way to work around this issue?
- I ask because ESRI has a reputation of not being great with Oracle, so there is a chance that they have misunderstood the issue or haven't gotten creative enough.
------------
**For example, what about the ONLY keyword?**
9.9 Materialized View Support for Objects
> For both object-relational and object materialized views that are
> based on an object **table**, if the type of the master object table
> is not FINAL, the FROM clause in the materialized view definition
> query must include the ONLY keyword.
>
> For example:
>
> CREATE MATERIALIZED VIEW customer OF cust_objtyp AS
>   SELECT CustNo FROM ONLY HR.Customer_objtab@dbs1;
>
> Otherwise, the FROM clause must omit the ONLY keyword.
Notes:
- Unfortunately, that blurb is about object **tables**, not about object **columns** (I believe the ST_GEOMETRY user-defined datatype would be considered
an "object column").
- But I wonder if the concept of the ONLY keyword could be used to help with
the FINAL problem with the ST_GEOMETRY object column. Any ideas?

User1974
(1527 rep)
Jan 25, 2021, 03:23 AM
• Last activity: Apr 19, 2025, 04:04 AM
1
votes
1
answers
43
views
Refreshing a PostgreSQL materialised view and returning the rowcount
Immediately after refreshing a materialised view in Postgres, I'd like to know how many rows are in it.
Currently I do this by running a second SQL command (`SELECT count(*) FROM` ...).
Is there a more efficient way of doing this? Can the REFRESH command be persuaded to return the rowcount?
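One workaround worth sketching (my own; REFRESH itself returns no row count): wrap the refresh and the count in a single function, so one round trip does both. Names are hypothetical.

```
-- Sketch: refresh, then count, in one server-side call.
CREATE FUNCTION refresh_my_mv() RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    n bigint;
BEGIN
    REFRESH MATERIALIZED VIEW my_mv;   -- non-CONCURRENTLY, so it can run
    SELECT count(*) INTO n FROM my_mv; -- inside the function's transaction
    RETURN n;
END;
$$;

-- Usage: SELECT refresh_my_mv();
```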
ConanTheGerbil
(1303 rep)
Apr 12, 2025, 09:12 AM
• Last activity: Apr 13, 2025, 12:07 PM
0
votes
1
answers
517
views
What are the approaches for refreshing materialized views in Oracle, when underlying tables are updated often?
I am a web developer, maintaining a web app that tracks orders, customers, products, etc, that my client uses internally. I use Oracle 12c, hosted on AWS RDS. My client has just switched some other systems, so we are at a point where data structures have changed, and I am using a new schema in Oracle to store new data in the new structures.
In order that the web app doesn't have to be re-engineered to work with new data structures, a decision was made to implement materialized views in Oracle that union the new data from the new schema (manipulated into the "legacy structure") together with the legacy data.
So now I have to deal with refreshing these materialized views so that the web app constantly has access to the latest data. Ideally the relevant materialized view(s) would be refreshed whenever I receive a new record into the new schema, but during working hours, I receive new data every few seconds maybe. A compromise is OK - if the materialized views are stale by a few minutes (maybe 5 or (less ideally) 10 minutes), that might be an acceptable situation.
My question is, what approach should I have for refreshing these materialized views? I don't want to overload Oracle with constant refreshes, and the web app should provide users with a good user experience when reading/writing data from/to Oracle. I'm far from being an Oracle/DB expert, so I am not really sure what options there are. I guess I could just have a cron job that runs every 5 minutes or something to refresh stale materialized views one by one, but I am wondering if this approach is a bit naive.
In reality, I am dealing with 14 materialized views (for now), and in my testing, some of them take up to 2.5 minutes to do a complete refresh.
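For reference, a sketch of the in-database variant of that cron idea (hypothetical MV name; DBMS_MVIEW.REFRESH with method 'C' is a complete refresh):

```
-- Sketch: have Oracle itself refresh one MV every 5 minutes.
BEGIN
    DBMS_SCHEDULER.CREATE_JOB(
        job_name        => 'REFRESH_LEGACY_ORDERS_MV',
        job_type        => 'PLSQL_BLOCK',
        job_action      => 'BEGIN DBMS_MVIEW.REFRESH(''LEGACY_ORDERS_MV'', method => ''C''); END;',
        repeat_interval => 'FREQ=MINUTELY; INTERVAL=5',
        enabled         => TRUE);
END;
/
```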
osullic
(101 rep)
Feb 14, 2020, 11:51 AM
• Last activity: Apr 12, 2025, 01:02 PM
1
votes
1
answers
1607
views
Created a Materialized View in ORACLE which won't refresh
I created a materialized view log, along with a materialized view called update_nboe based on table NBOE_EMPLOYEES_TEST, using the following code:

```
CREATE MATERIALIZED VIEW LOG ON NBOE_EMPLOYEES_TEST WITH primary key;

CREATE MATERIALIZED VIEW update_nboe
REFRESH FAST ON DEMAND
AS
SELECT E.EMP_ID, E.USERNAME, E.NAME, E.LOCATION, E.TITLE, E.LOCATION_CODE, E.RS_GROUP
FROM NBOE_EMPLOYEES_TEST E;
```
Then I updated NBOE_EMPLOYEES_TEST by inserting additional records, hoping that the materialized view would refresh on demand when I ran the following:

```
exec dbms_mview.refresh('update_nboe', atomic_refresh => TRUE);
```
However, I see a red cross on my connections panel for the materialized view and it won't refresh either.
Would appreciate some input.
HSB
(13 rep)
Jan 3, 2020, 12:07 AM
• Last activity: Apr 6, 2025, 07:03 AM
1
votes
1
answers
646
views
Oracle Materialized View - Is it Safe to Use Only ROWID as Record Identifier?
This is a question related to Oracle 12c.
I'm currently trying to create MVs as data sources for GoldenGate replication. We need **fast refresh** because we have to replicate 60,000+ rows at a time, and deleting and reinserting all rows seems inefficient.
The base tables have unique columns which act as composite primary keys, but there is no primary key constraint defined on them. They're PeopleSoft tables, so we prefer not to add PK constraints if possible, as they're used in so many applications.
I have built the MVs using only the **WITH ROWID** clause for the materialized view logs, and they seem to be working fine. But I don't know if it's good practice to do so. Should I add PK constraints on all the master tables instead, or maybe there's another way?
Feedback is appreciated. Thanks.
Lee
(61 rep)
Jul 17, 2019, 04:32 AM
• Last activity: Mar 11, 2025, 03:04 PM
0
votes
1
answers
812
views
PostgreSQL : Refresh a materialized view base on a foreign table
I've got a foreign table over a distant CSV hosted on an external web server.
I need to enhance this table, so I created a materialized view to add some extra fields and join with internal data. A plain view did not work; I got an error (probably due to a performance issue with the distant HTTP server).
I can't use a bash cron script on the Linux host, as the IT department doesn't want that, so I need to stay within PostgreSQL's own capabilities.
I've tried [this method](https://stackoverflow.com/questions/29437650/how-can-i-ensure-that-a-materialized-view-is-always-up-to-date), but it seems I can't add a trigger on a foreign table.
Is there any way to refresh the materialized view? (Ideally each time the distant CSV is updated, but it could also be based on a frequency, for example every 24h.)
---
EDIT, to be more precise: how can I refresh a materialized view based on a foreign table?
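One in-database direction (a sketch, assuming the pg_cron extension is available on the instance, which not every managed host allows; the view name is hypothetical):

```
-- Sketch: schedule a daily refresh entirely inside PostgreSQL.
CREATE EXTENSION pg_cron;

SELECT cron.schedule(
    '0 3 * * *',                            -- every day at 03:00
    'REFRESH MATERIALIZED VIEW my_csv_mv'
);
```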
Bad Wolf
(11 rep)
Dec 7, 2021, 01:41 PM
• Last activity: Feb 21, 2025, 02:08 AM
0
votes
1
answers
81
views
What is the best way to handle materialized views with different timezones?
I am building some API endpoints backed by materialized views (I am using Ruby on Rails). The data is from Quake Live and it basically contains game stats (weapon accuracy, damage dealt, etc). I want to have a materialized view for damage done today (midnight to current time), this week, this month,...
I am building some API endpoints backed by materialized views (I am using Ruby on Rails).
The data is from Quake Live and it basically contains game stats (weapon accuracy, damage dealt, etc).
I want to have a materialized view for damage done today (midnight to current time), this week, this month, this year and all time.
This is the current SQL used to create the materialized view:

```
SELECT
    players.id AS player_id,
    players.name AS player_name,
    players.steam_id AS steam_id,
    SUM(stats.damage_dealt) AS total_damage_dealt,
    SUM(stats.damage_taken) AS total_damage_taken
FROM
    stats
JOIN
    players ON players.id = stats.player_id
WHERE
    stats.created_at >= date_trunc('day', NOW())
GROUP BY
    players.id, players.steam_id;
```
It works, however I want to add support for multiple timezones, so that if you request stats for today with X timezone, the materialized views will reflect that.
So far the only thing I could think of is creating a different materialized view for each timezone (not doable).
My question is whether anyone has a hint on what I could do, other than dynamically querying my tables on created_at, which would make materialized views useless for everything except the all-time stats. Cheers.
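One direction worth sketching (my illustration, not a tested design): materialize at hourly UTC granularity, then roll up to the caller's timezone at query time, so a single MV serves every timezone (hour buckets assume whole-hour UTC offsets).

```
-- Sketch: hourly buckets in the MV; per-timezone "today" becomes a cheap
-- rollup over at most 24 rows per player at query time.
CREATE MATERIALIZED VIEW mv_damage_hourly AS
SELECT
    players.id AS player_id,
    date_trunc('hour', stats.created_at) AS hour_bucket,
    SUM(stats.damage_dealt) AS damage_dealt,
    SUM(stats.damage_taken) AS damage_taken
FROM stats
JOIN players ON players.id = stats.player_id
GROUP BY players.id, date_trunc('hour', stats.created_at);

-- Query-time rollup for "today" in a requested timezone:
SELECT player_id, SUM(damage_dealt) AS total_damage_dealt
FROM mv_damage_hourly
WHERE hour_bucket >= date_trunc('day', now() AT TIME ZONE 'Europe/Rome')
                     AT TIME ZONE 'Europe/Rome'
GROUP BY player_id;
```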
devamat
(111 rep)
Jan 21, 2025, 01:45 PM
• Last activity: Jan 22, 2025, 08:22 AM
1
votes
2
answers
993
views
Materialized view with FAST refresh on remote table: How to include a GEOMETRY column?
I want to create a **fast refresh** materialized view (18c) on a **remote table**. The MV would also have a GEOMETRY column.
Options for the GEOMETRY column datatype include:
- ESRI's proprietary implementation of ST_GEOMETRY (user-defined datatype; is an 'object' datatype)
- Oracle's SDO_GEOMETRY datatype
----------------
To start, I can successfully create a fast refresh MV ***without*** a GEOMETRY column:
```
create materialized view log on maximo.workorder with primary key; --remote table
grant select on maximo.mlog$_workorder to schema_for_dblink; --I've given the dblink access to everything in this schema

create materialized view my_gis_schema.wo_mv
build immediate
refresh fast
start with sysdate next sysdate + (15/(60*60*24))
as
select
    cast(workorderid as number(38,0)) as objectid,
    wonum,
    status,
    --other fields
    longitudex,
    latitudey
from
    maximo.workorder@my_dblink;
```
----------
The MV above works, but I want to store the XY coordinates from the remote table in a GEOMETRY column in the MV (right now, the coordinates are stored in number columns, not a geometry column).
Unfortunately, my options for the GEOMETRY column in an MV seem pretty limited:
1. Oracle doesn't seem to support ESRI's ST_GEOMETRY datatype in MVs (more info here and here).
   - The SQL would be: `sde.st_geometry(longitudex, latitudey, null, null, 26917) as shape`
2. Additionally, Oracle doesn't seem to support SDO_GEOMETRY in MVs with the **fast refresh** option on a **remote table**: ORA-12015: cannot create a fast refresh materialized view from a complex query
   - The SQL would be: `sdo_geometry(2001, 26917, sdo_point_type(longitudex, latitudey, null), null, null) as shape`
---------------------
**Question:**
Is there a way to include a GEOMETRY column in a materialized view on a **remote table**, using the **fast refresh** option?
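One workaround worth sketching (my illustration; whether the client software will accept a plain view in place of the MV is an assumption): keep the fast-refresh MV purely numeric and construct the geometry in an ordinary view layered on top.

```
-- Sketch: the MV stays fast-refreshable (scalar columns only); the
-- geometry is built on the fly in a regular view above it.
create view my_gis_schema.wo_vw as
select
    objectid,
    wonum,
    status,
    sdo_geometry(2001, 26917,
                 sdo_point_type(longitudex, latitudey, null),
                 null, null) as shape
from my_gis_schema.wo_mv;
```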
User1974
(1527 rep)
Dec 11, 2020, 06:23 PM
• Last activity: Jan 14, 2025, 02:01 AM
0
votes
1
answers
4531
views
Are there limitations on using pg_dump on a materialized view such that the data won't be included in the results?
I have a database, call it **maps**, which contains 2 schemas, **a** & **b**. In each schema I have a number of tables: **a.t1**, **a.t2**, **b.t1**, **b.t2** (plus others). The column sets on each of these tables are different, but there are a number of columns in common.
I have defined a materialised view, **a.mv**, which brings in the common columns of the 4 tables listed, including a geometry column (which represents a geographic outline).
I want to back up the current contents of the view so that I can restore it on another server. To do this, I use this pg_dump command:

```
pg_dump -h hostname -U username -t a.mv -f mv.sql maps
```
What I get as a result is the SQL to define the table, but no data. The view definitely has data in it, because I can select from it in PgAdmin (it was created multiple days ago and the underlying tables haven't been changed since)
I can dump the underlying tables, including data, with (e.g.)

```
pg_dump -h hostname -U username -t a.t1 -f t1.sql maps
```
but not the view. From the limited matches I've found with googling this, what I'm trying should work, but I'm wondering if the presence of a geometry column in the dump is causing the issue (or this might be a complete red herring). FWIW, the total data in the view is fairly substantial - probably around 1GB. However, I've dumped all the underlying tables in schema **a** successfully (including the 2 tables referenced by the view, and others), which was larger (1.5GB)
Any ideas what the issue could be here? On the face of it, what I'm trying to do should work, but just doesn't and with no indicated errors.
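For anyone comparing notes, a sketch of one export workaround (under the assumption that pg_dump represents materialized-view contents as a deferred REFRESH statement rather than embedded rows; file path and target table are hypothetical):

```
-- Sketch: export the matview's current rows as plain data with psql.
\copy (SELECT * FROM a.mv) TO 'mv_data.csv' WITH (FORMAT csv, HEADER)

-- On the target server, load into a compatible ordinary table:
-- \copy a.mv_copy FROM 'mv_data.csv' WITH (FORMAT csv, HEADER)
```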
khafka
(79 rep)
Feb 6, 2020, 11:36 AM
• Last activity: Jan 8, 2025, 01:03 PM
1
votes
1
answers
86
views
An index that covers both an ARRAY column and sorting by another column in postgres?
Let's say I have a table with an array column and an integer column:

```
CREATE TABLE records(
    score INTEGER,
    features INTEGER[]
);
```
Now I want to make queries that extract rows that have at least one of the requested features, and then I also want to order them by score.
So something like this:

```
SELECT *
FROM records
WHERE $1 && features
ORDER BY score;
```
But I have no idea what kind of index could be used to speed up this query. I have a BTREE on score and GIN on features, but I don't think the index on score is going to be used.
Another peculiarity is that *records* is actually a denormalised materialised view, and *features* is composed by inner joining with other tables (basically each feature is a row in a *features* table, and then there's another table that marks a record as having the specified feature). If the queries on the materialised view can't be reasonably sped up, then maybe I should get rid of it and just query a basic view.
Bonus question: what if there are also other INTEGER array columns which also need to be checked for containing a certain value? How are such queries usually handled?
DiplomateProgrammer
(11 rep)
Dec 14, 2024, 10:49 PM
• Last activity: Dec 15, 2024, 12:18 AM