Do Azure SQL and SQL Server still call-out into the CLR for FORMAT and PARSE? If so, which CLR version and how does marshalling affect performance?

0 votes
1 answer
681 views
sql-clr
                          The documentation for SQL Server's T-SQL FORMAT function implies calls are forwarded to .NET's ToString methods on .NET types corresponding to T-SQL types, and TRY_PARSE and PARSE's documentation is quite explicit about it using the CLR:

> https://learn.microsoft.com/en-us/sql/t-sql/functions/format-transact-sql 
> ## FORMAT( value, format, culture )
> **format**: The format argument must contain a valid .NET Framework format string, either as a standard format string (for example, "C" or "D"), or as a pattern of custom characters for dates and numeric values (for example, "MMMM DD, yyyy (dddd)"). Composite formatting is not supported.
> 
> **culture**: culture accepts any culture supported by the .NET Framework as an argument; it is not limited to the languages explicitly supported by SQL Server. If the culture argument is not valid, FORMAT raises an error.

and:

> https://learn.microsoft.com/en-us/sql/t-sql/functions/parse-transact-sql?view=sql-server-ver16 
> ## PARSE( string_value AS data_type [ USING culture ] )
> ### Remarks
> * Keep in mind that there is a certain performance overhead in parsing the string value.
> * PARSE relies on the presence of the .NET Framework Common Language Runtime (CLR).
> * This function will not be remoted since it depends on the presence of the CLR. Remoting a function that requires the CLR would cause an error on the remote server.

So using FORMAT( intValue, 'N2', 'en-US' ) and PARSE( '2023-02-21' AS date ) in a T-SQL SELECT projection will cause SQL Server to somehow invoke .NET's Int32.ToString( "N2", CultureInfo.GetCultureInfo( "en-US" ) ) and DateTime.Parse( "2023-02-21" ) methods at runtime.

...which strikes me as introducing considerable overhead, as SQL Server would need to use .NET's runtime type/value marshalling to convert from SQL Server's own internal representation of varchar(n) and nvarchar values to .NET's UTF-16-based strings, invoking those .NET methods (which, in many cases, are forwarded-out to non-CIL native code within the CLR... thus making the CLR part pointless, no?)

But more profoundly, Azure SQL notably disabled SQL-CLR shortly after launch for security reasons and there's no indication Microsoft will bring the feature back, except [in their _Managed Instance_ service](https://learn.microsoft.com/en-us/shows/data-exposed/its-just-sql-clr-in-azure-sql-database-managed-instance) , so if Azure SQL has the SQL-CLR disabled, how is it that our T-SQL scripts have no problem with using .NET-based FORMAT/PARSE functions?

-----

Based on this, I have some questions, which all basically boil-down to _"What's going on, eh?"_:

* Computers can have multiple versions of .NET installed side-by-side, so exactly what version of the CLR and BCL are being used when you use FORMAT/PARSE/TRY_PARSE?
   * e.g. .NET Framework 4.8 x64 vs. .NET Core 2.1 vs. .NET 5 vs. .NET 7+
   * [The official documentation for SQL-CLR doesn't seem to have been updated](https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/sql/sql-server-common-language-runtime-integration)  since .NET 5+ effectively replaced Framework 4.x, so my reading implies it always uses .NET Framework 4.x...
* How does it work on SQL Server-for-Linux where only _post-.NET Core 3.1_ versions of .NET are available?
   * The ("_classic_") .NET Framework 4.x is very Windows-specific, and I know SQL Server-for-Linux is essentially its own operating-system, but that seems like a lot of engineering-effort to keep using .NET Framework 4.x in this situation. It boggles the mind.
   * Whereas if SQL Server is using .NET Core, or .NET 5+ instead of the .NET Framework 4.8 runtime, then how can we use .NET 5+ in our own SQL-CLR projects?
* How does this impact parallelisation of queries and query-plan generation?
   * The article about SQL-CLR performance does not make any mention of scalar parallelisation nor UDF inlining at all, beyond saying that T-SQL built-ins are generally better than SQL-CLR, but that SQL-CLR is still far better than old-world CURSOR-based approaches.
     * ...but I assume that using a scalar SQL-CLR function in a query breaks parallelisation (as .NET doesn't expose a way to explicitly mark a method as being reentrant or thread-safe, so SQL Server would have no way of knowing that, so it would _have to_ disable parallelisation) - which kills performance, on top of the type-marshalling overhead - it's very odd that this factor isn't mentioned in the documentation.
* And if Azure SQL has SQL-CLR disabled, then how do FORMAT and PARSE work? Does Azure SQL have a _native_ built-in reimplementation of these functions that avoid the CLR entirely, or does it _actually_ have SQL-CLR enabled but only allows a small set of _blessed_ .NET BCL functions to be called? If so, then why _only_ those functions when there exist _many_ other useful functions we'd also like to call from T-SQL (composite string-formatting, Calendar, Regex, etc) - the whole thing seems _very arbitrarily limited_.
    * And if Azure SQL _does_ have SQL-CLR running internally but only for first-party code, why does [the clr enabled option](https://learn.microsoft.com/en-us/sql/database-engine/configure-windows/clr-enabled-server-configuration-option)  return 0 (and many other CLR-related properties/functions seem broken) unless Azure SQL's T-SQL sandboxing is doing some convoluted hiding-of-functionality?

-----

## Update

(I hate it when my initial research doesn't find anything helpful, then I post to SO/SE, and then _right afterwards_ I search Google again with slightly different keywords and find out what I wanted to know... so here's some information I just found, but I don't want to post it as an answer just yet, maybe in a couple of weeks if no-one else can post a succint, accurate, recent and relevant answer)

* According to https://github.com/MicrosoftDocs/sql-docs/issues/1594  and my own quick research:
   * See also: https://stackoverflow.com/a/41114654/159145 
   * As of Q1 2023, all versions of SQL Server, up-to-and-including SQL Server 2022, including SQL Server for Linux, use _some version_ of the .NET Framework 4.x.
      * On Windows Server, it uses whatever the latest version of the OS-provided .NET Framework 4.x is, which should be .NET 4.8.1.
      * On Linux, it uses a private build of .NET Framework 4.x that runs within SQL Server's Windows-abstraction layer "SQLPAL" which is also involved with sandboxing SQL-CLR user-code.
         * On Linux, the EXTERNAL_ACCESS option (which would allow SQL-CLR code to break-out of the sandbox) is not supported, ditto on _Azure SQL Server Managed Instance_.
* However I haven't been able to find out how/why PARSE/FORMAT/etc works in Azure SQL despite all the indications that SQL-CLR is entirely disabled - or the performance/overhead/query-plan implications of the native-to-managed-and-back process involved in invoking .NET code from T-SQL.
                        
Asked by Dai (632 rep)
Feb 21, 2023, 07:18 PM
Last activity: Feb 21, 2023, 11:43 PM
Do Azure SQL and SQL Server still call-out into the CLR for FORMAT and PARSE? If so, which CLR version and how does marshalling affect performance?

Related Questions