Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
8 votes, 1 answer, 735 views
What cases benefit from the Reduce, Replicate, and Redistribute join hints?
The From Clause Documentation, starting with SQL Server 2008, briefly mentions three join hints and their basic mechanisms:
- Reduce
- Replicate
- Redistribute
However, there does not seem to be much information on when it might become necessary to use them.
It appears that they can be used in conjunction with the hash, loop, and merge hints, which are assumed to be already understood for the purposes of this question.
The relevant section from the documentation:
> For SQL Data Warehouse and Parallel Data Warehouse, these join hints apply to INNER joins on two distribution incompatible columns. They can improve query performance by restricting the amount of data movement that occurs during query processing. The allowable join hints for SQL Data Warehouse and Parallel Data Warehouse are as follows:
>
> * REDUCE
>   Reduces the number of rows to be moved for the table on the right side of the join in order to make two distribution incompatible tables compatible. The REDUCE hint is also called a semi-join hint.
> * REPLICATE
>   Causes the values in the joining column from the table on the left side of the join to be replicated to all nodes. The table on the right is joined to the replicated version of those columns.
> * REDISTRIBUTE
>   Forces two data sources to be distributed on columns specified in the JOIN clause. For a distributed table, Parallel Data Warehouse will perform a shuffle move. For a replicated table, Parallel Data Warehouse will perform a trim move. To understand these move types, see the "DMS Query Plan Operations" section in the "Understanding Query Plans" topic in the Parallel Data Warehouse product documentation. This hint can improve performance when the query plan is using a broadcast move to resolve a distribution incompatible join.
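For reference, the quoted hints go between the join type and the JOIN keyword. The following is only a syntax sketch: the tables `dbo.DimProduct` and `dbo.FactSales` and their columns are hypothetical, and the example assumes the two tables are distributed on different columns, so that a join on `ProductKey` is distribution incompatible. It shows where each hint goes, not when each one pays off.

```sql
-- Hypothetical schema (an assumption, not from the documentation):
--   dbo.DimProduct hashed on ProductKey, dbo.FactSales hashed on CustomerKey.

-- REDUCE: asks for fewer rows to be moved from the right-hand table.
SELECT p.ProductName, s.SalesAmount
FROM dbo.DimProduct AS p
INNER REDUCE JOIN dbo.FactSales AS s
    ON p.ProductKey = s.ProductKey;

-- REPLICATE: asks for the left-hand join column values to be copied to all nodes.
SELECT p.ProductName, s.SalesAmount
FROM dbo.DimProduct AS p
INNER REPLICATE JOIN dbo.FactSales AS s
    ON p.ProductKey = s.ProductKey;

-- REDISTRIBUTE: asks for both sources to be distributed on the join column.
SELECT p.ProductName, s.SalesAmount
FROM dbo.DimProduct AS p
INNER REDISTRIBUTE JOIN dbo.FactSales AS s
    ON p.ProductKey = s.ProductKey;
```

On PDW, prefixing each statement with `EXPLAIN` and comparing the data movement steps in the resulting plans is one way to see whether a hint changed anything.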
crokusek
(2110 rep)
Jan 24, 2019, 08:20 PM
• Last activity: Jun 12, 2020, 06:01 AM
1 vote, 1 answer, 68 views
Select into query, insert fails, but the table is created
I am using SQL Server 2016 and I tried the following query.
SELECT CONVERT(BIGINT, 'A') col1 INTO #tmp
This query obviously fails, because the value cannot be converted. However, the temporary table (#tmp) is created even though the query fails.
Why? I think this is by design, but I want to know the reason.
PDW (Parallel Data Warehouse) does not create the temporary table in this case.
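The behaviour described above can be checked with a short repro. This is a minimal sketch, assuming a session on a regular SQL Server 2016 instance (not PDW) and a client such as SSMS or sqlcmd that understands the `GO` batch separator. It relies on the documented behaviour that SELECT ... INTO first creates the table and then inserts the rows.

```sql
-- Batch 1: expected to fail with a conversion error while inserting rows.
SELECT CONVERT(BIGINT, 'A') AS col1 INTO #tmp;
GO

-- Batch 2: a non-NULL object id here means the table survived the failed statement.
SELECT OBJECT_ID('tempdb..#tmp') AS tmp_object_id;
GO

-- Batch 3: if the table exists, it should be empty because no row was inserted.
SELECT COUNT(*) AS row_count FROM #tmp;
GO

-- Clean up so the repro can be run again in the same session.
DROP TABLE IF EXISTS #tmp;
GO
```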
Suseong Park
(11 rep)
Mar 24, 2020, 05:27 AM
• Last activity: Mar 24, 2020, 12:31 PM
1 vote, 0 answers, 315 views
Output result of DBCC PDW_SHOWEXECUTIONPLAN into a variable
I would like to output the results of DBCC PDW_SHOWEXECUTIONPLAN into a variable (or table, I'm not fussy!).
I know the PDW syntax is quite different when it comes to outputting results, which means everything I've tried so far has failed. I've seen some suggestions of inserting the results into a table, but I always get syntax errors:
    create table #1234
    (
        sql1 varchar(max)
    )
    with (distribution = round_robin)

    insert into #1234
    DBCC PDW_SHOWEXECUTIONPLAN(@did, @spid)
>Parse error at line: 37, column: 2: Incorrect syntax near 'DBCC'.
Neil P
(1294 rep)
Feb 5, 2018, 04:45 PM
• Last activity: Feb 5, 2018, 05:58 PM
6 votes, 1 answer, 743 views
Server roles in SQL Server 2016
There are 3 new server roles in SQL Server 2016:
- mediumrc
- largerc
- xlargerc
Any idea what these are for? Google doesn't give any information about these.
![enter image description here](https://i.sstatic.net/IIOQS.png)
Zinx
(319 rep)
Jun 9, 2015, 02:24 AM
• Last activity: Feb 2, 2016, 08:55 AM
2 votes, 1 answer, 515 views
Is it possible to deploy Microsoft PDW on Azure?
I've read about it and want to play around with it for a bit.
I think it requires a minimum of 4-5 machines to run, which is feasible with Azure IaaS, but how do I actually deploy it there?
Akash
(1032 rep)
Mar 28, 2015, 06:32 PM
• Last activity: Jan 7, 2016, 08:51 AM
8 votes, 3 answers, 1690 views
Updating a big replicated Dimension (SQL Server PDW)
We use a SQL Server PDW appliance for our data warehouse. One of the tables in our warehouse is a replicated table with about 20 million rows. As part of our ETL process we need to expire old records from this dimension; however, we are seeing that updating a handful of records (<100) takes over 1 hour to complete. This is what I would like to improve if I can.
Naturally, one option that I thought about was changing this Dimension from Replicated to Distributed. My testing shows that it would fix the slow ETL process (it came down from 1.5 hours to 30 seconds), but all the joins against the Distributed version of this dimension would be affected, since the joins are almost never based on the same distribution column. When I look at the execution plans of some of these queries, I usually see either a *ShuffleMove* or a *BroadcastMove* operation.
So my question to the PDW gurus here is:
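A test along the lines described above could be set up roughly as follows. This is only a sketch under stated assumptions: `dbo.DimCustomer`, `CustomerKey`, and `ExpiryDate` are hypothetical names standing in for the real dimension, its chosen distribution column, and the column updated by the ETL step.

```sql
-- Build a hash-distributed copy of the replicated dimension with CTAS.
-- dbo.DimCustomer and CustomerKey are hypothetical names.
CREATE TABLE dbo.DimCustomer_Distributed
WITH (DISTRIBUTION = HASH(CustomerKey))
AS
SELECT *
FROM dbo.DimCustomer;

-- Expire a handful of records in the distributed copy
-- and compare the elapsed time with the same update on the replicated table.
UPDATE dbo.DimCustomer_Distributed
SET ExpiryDate = '2013-08-12'
WHERE CustomerKey IN (101, 102, 103);
```

Any query that joins to the distributed copy on a column other than `CustomerKey` would then need a data movement step, which matches the ShuffleMove and BroadcastMove operations mentioned in the question.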
Is there anything else that can be done in order to improve the performance of updating records in the **replicated** version of this Dimension?
Again, moving to a Distributed table doesn't seem to be the best solution since it will affect hundreds of already written SQL queries and reports developed by other people.
Icarus
(337 rep)
Aug 12, 2013, 09:03 PM
• Last activity: Oct 19, 2014, 08:28 PM
0 votes, 1 answer, 238 views
SQL Server 2012 Vs PDW
We are migrating from SQL Server 2012 to PDW.
Is there a SQL script or a tool for comparing the data between SQL Server 2012 and Parallel Data Warehouse?
Sam
(1 rep)
Jul 17, 2014, 07:33 AM
• Last activity: Jul 17, 2014, 08:52 PM
1 vote, 1 answer, 634 views
Raise error from Stored PROC in SQL Server PDW
I need to do some validation inside a stored procedure before I continue processing, but SQL Server PDW (SQL Server 2008 R2) does not support RAISERROR inside stored procedures.
Is there any other way that I can raise an error with a specific error message inside a stored procedure?
I can try and do something illegal, like force a "division by zero" error, but the error would be misleading.
I'd like to be able to raise an error specifying the exact problem that's occurring.
Any other way?
Icarus
(337 rep)
Jan 16, 2014, 07:31 PM
• Last activity: Jan 17, 2014, 09:20 AM