Is "ROW_NUMBER() OVER(ORDER BY xml.node)" well defined?
12
votes
1
answer
1131
views
(It is more a question of *documentation* rather than *behavior*. It has been migrated from [Stack overflow](https://stackoverflow.com/questions/78665262/) as was suggested there.)
While researching the [answer to another question](https://stackoverflow.com/questions/78411293/#78412334) that required extracting XML nodes in the original element order, I came across several answers ([here](https://stackoverflow.com/questions/1134075/#30185119) , [here](https://stackoverflow.com/questions/17748012/#17764715) , and [here](https://stackoverflow.com/questions/20249307/#20250521)) that used expressions of the form
ROW_NUMBER() OVER (ORDER BY xml.node)
, and asserted or implied that the assigned row number values would be assigned in XML document order.
However, I cannot find anywhere that defines the behavior of ORDER BY xml.node
. Although it seems to be allowed in the OVER()
clause, the [documentation](https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql) does not specifically mention XML nodes.
For example, given:
DECLARE @xml XML = '
One
Two
Three
Four
'
SELECT
ROW_NUMBER() OVER(ORDER BY xml.node) AS rn,
xml.node.value('./text()', 'varchar(255)') AS value
FROM @xml.nodes('*/node') xml(node)
ORDER BY
ROW_NUMBER() OVER(ORDER BY xml.node)
The following results are returned:
rn | value
----------
1 | One
2 | Two
3 | Three
4 | Four
**Question:** Are these results guaranteed anywhere in the documentation? Is this accepted as guaranteed, but undocumented behavior? Or is this another case like ORDER BY (SELECT NULL)
that appears to work anecdotally for small seemingly pre-ordered source data sets, but may eventually fail when scaled up? The reason I am asking is that I would prefer not to recommend use of a technique whose behavior and reliability is not supported by the documentation.
Interestingly, although an XML node is allowed in a windowed ORDER BY
, it is not allowed in an ordinary SELECT ... ORDER BY
. When used in an ordinary select order-by clause, ORDER BY xml.node
yields the following error:
> Msg 493 Level 16 State 1 Line 7
>
> The column 'node' that was returned from the nodes() method cannot be used directly. It can only be used with one of the four XML data type methods, exist(), nodes(), query(), and value(), or in IS NULL and IS NOT NULL checks.
The above error message does not list the window-function OVER(ORDER BY ...)
as an allowed use.
[See this dbfiddle](https://dbfiddle.uk/r_mOqNdp) .
Asked by T N
(231 rep)
Jun 25, 2024, 05:10 PM
Last activity: Jun 28, 2024, 08:04 AM
Last activity: Jun 28, 2024, 08:04 AM