How to assign GUID to batches of data based on row counts and tunable batch size
0
votes
1
answer
920
views
I need to assign batches of rows a common GUID id (for consumption by an external process).
Below is a simplified setup describing what I'm looking for:
BASE_TABLE
represents one of pre-existing tables in this scenario, much simplified obviously. Values in the "real" table should be considered random; they're not consecutive in real life (even the ID is not really an int)
CREATE TABLE BASE_TABLE (
ID int
,VALUE1 varchar(10)
,VALUE2 varchar(255)
)
TARGET_TABLE
represents the "output" of the process I'm working on, which is just the input to something else that requires batches of data to have a uniqueIdentifier
BATCHID for each batch. Batches in real life are ~1000 rows each.
CREATE TABLE TARGET_TABLE
(
ID int
,VALUE1 varchar(10)
,VALUE2 varchar(255)
,BATCHID uniqueIdentifier
)
(Add some dummy data)
DECLARE @DATA1 int
set @DATA1=0
WHILE @DATA1<100
BEGIN
INSERT INTO BASE_TABLE (ID,VALUE1,VALUE2) VALUES (@DATA1,'v1'+CONVERT(varchar,@DATA1), 'v2'+CONVERT(varchar,@DATA1))
SET @DATA1=@DATA1+1
END
/**
example DESIRED RESULT, with batch size of 5
|ID|VALUE1|VALUE2|BATCHID|
|--|------|------|------------------------------------|
|0 |v10 |v20 |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|1 |v11 |v21 |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|2 |v12 |v22 |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|3 |v13 |v23 |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|4 |v14 |v24 |38B1B4FB-7F1E-44FD-9336-19095C01C629|
|5 |v15 |v25 |41122454-A743-4545-8F0C-D7B461E072AE|
|6 |v16 |v26 |41122454-A743-4545-8F0C-D7B461E072AE|
|7 |v17 |v27 |41122454-A743-4545-8F0C-D7B461E072AE|
|8 |v18 |v28 |41122454-A743-4545-8F0C-D7B461E072AE|
|9 |v19 |v29 |41122454-A743-4545-8F0C-D7B461E072AE|
|10|v110 |v210 |41122454-A743-4545-8F0C-D7B461E072AE|
|11|v111 |v211 |FBDE5513-C869-4F2D-AC4D-40CBEF4A2D48|
etc
Important result is that for each batch of size N there are N rows with the same BatchID, which must be a GUID.
I'm pretty sure I could do what I want with a cursor, but I want to do something like this if possible:
select
ID,VALUE1,VALUE2,
NEWID() OVER(PARTITION BY ROW_NUMBER() OVER (ORDER BY ID)) AS BatchID
from BASE_TABLE
but it's not valid because NEWID() isn't an aggregate function.
Asked by Gus
(133 rep)
Mar 2, 2021, 06:19 PM
Last activity: Mar 2, 2021, 10:51 PM
Last activity: Mar 2, 2021, 10:51 PM