Sample Header Ad - 728x90

SQL Server PATINDEX issue/bug when using different case-sensitive collations

2 votes
1 answer
541 views
I have a function (which I found here years ago) that uses STUFF / PATINDEX to strip out non-alphanumeric characters from a string. When running on a Case Insensitive collation, it works fine. Recently I needed to use it on a Case Sensitive collation DB and found some odd behaviors. If the **%pattern%** for PATINDEX is just specifying lower case (ex: **%[^a-z0-9_-]%** ), then all upper case "Z"s are removed when the Collation is **Latin1_General_100_CS_AS**. If the Collation is **SQL_Latin1_General_CP1_CS_AS**, then upper case "A"s are removed. Is this a bug or did I miss something?
USE TestCollation
GO
PRINT '---------SourceString---------'
PRINT 'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
GO
ALTER DATABASE TestCollation COLLATE SQL_Latin1_General_CP1_CI_AS;
GO
PRINT '---------SQL_Latin1_General_CP1_CI_AS---------'
GO

DECLARE @ExternalId VARCHAR(255) =
     'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues, @return ) > 0
BEGIN
    SET @return = STUFF ( @return, PATINDEX ( @KeepValues, @return ), 1, '' )
END
PRINT @return

go

ALTER DATABASE TestCollation COLLATE Latin1_General_100_CS_AS;
GO
PRINT '---------Latin1_General_100_CS_AS---------'
GO
DECLARE @ExternalId VARCHAR(255) =
     'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues, @return ) > 0
BEGIN
    SET @return = STUFF ( @return, PATINDEX ( @KeepValues, @return ), 1, '' )
END
PRINT @return

ALTER DATABASE TestCollation COLLATE SQL_Latin1_General_CP1_CS_AS;
GO
PRINT '---------SQL_Latin1_General_CP1_CS_AS---------'
GO
DECLARE @ExternalId VARCHAR(255) =
     'ABCDEFGHIJKLMNO_Z_A_PQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz'
DECLARE @return VARCHAR(255)
SET @return = @ExternalId
DECLARE @KeepValues AS VARCHAR(50)
SET @KeepValues = '%[^a-z0-9_-]%'
WHILE PATINDEX ( @KeepValues, @return ) > 0
BEGIN
    SET @return = STUFF ( @return, PATINDEX ( @KeepValues, @return ), 1, '' )
END
PRINT @return
Asked by Jay Austin (31 rep)
Jun 27, 2022, 07:55 PM
Last activity: Jun 28, 2022, 04:08 PM