Sample Header Ad - 728x90

String Manipulation of the Result from Recursive CTE

3 votes
1 answer
390 views
Good afternoon everyone I found just one post here within the last year about this, but it doesn't help my situation. I have been working with MySQL and trying to improve my knowledge of recursive CTE. The version of MySQL is 8.0.19 on a Windows device. The table that I have is generated with:
DROP TABLE IF EXISTS test_table;
CREATE TABLE test_table (
  id int(5) NOT NULL AUTO_INCREMENT,
  source varchar(20) NOT NULL,
  destination varchar(20) NOT NULL,
  route varchar (200) NOT NULL,
  open BOOLEAN DEFAULT true,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Values are entered using:
INSERT INTO test_table VALUES 
  (1, 'STA_A', 'STA_B', 'Line1', true),
  (2, 'STA_B', 'STA_A', 'Line1', true),
  (3, 'STA_B', 'STA_C', 'Line1', true),
  (4, 'STA_C', 'STA_B', 'Line1', true),
  (5, 'STA_C', 'STA_D', 'Line1', true),
  (6, 'STA_D', 'STA_C', 'Line1', true),
  (7, 'STA_D', 'STA_E', 'Line1', true),
  (8, 'STA_E', 'STA_D', 'Line1', true),
  (9, 'STA_K', 'STA_B', 'Line2', true),
  (10, 'STA_B', 'STA_K', 'Line2', true),
  (11, 'STA_B', 'STA_L', 'Line2', true),
  (12, 'STA_L', 'STA_B', 'Line2', true),
  (13, 'STA_L', 'STA_M', 'Line2', true),
  (14, 'STA_M', 'STA_L', 'Line2', true),
  (15, 'STA_M', 'STA_N', 'Line2', true),
  (16, 'STA_N', 'STA_M', 'Line2', true);
Finally, here is the recursive CTE:
SET profiling = 1;
SET @from = 'STA_A';
SET @to = 'STA_M';
SET @via = 'STA_M';
SET @avoid = 'XXXXX';
WITH RECURSIVE cte AS (
	-- Anchor
	SELECT test_table.destination, CONCAT(test_table.source, ' => ', test_table.route, ' => ', test_table.destination) path, 1 length
	FROM test_table
	WHERE test_table.source = @from
	
	UNION ALL

	-- Recursive member
	SELECT test_table.destination, CONCAT(cte.path, ' => ', test_table.route, ' => ', test_table.destination) path, cte.length + 1 length
	FROM cte
	INNER JOIN test_table
	ON test_table.source = cte.destination
	WHERE NOT FIND_IN_SET(test_table.destination, REPLACE(path, ' => ', ','))
	AND open = TRUE
)
SELECT *
FROM cte
WHERE cte.destination = @to
AND FIND_IN_SET(@via, REPLACE(path, ' => ', ','))
AND NOT FIND_IN_SET(@avoid, REPLACE(path, ' => ', ','))
ORDER BY cte.length
LIMIT 500;
SHOW PROFILES;
This pulls successfully the route from Station A to Station M, listing the stations and line segments between the source and destination:
STA_A => Line1 => STA_B => Line2 => STA_L => Line2 => STA_M
What I need to do is remove one of the duplicate line segments (Line2) and the associated station, leaving:
STA_A => Line1 => STA_B => Line2 => STA_M
Is this possible within the CTE, possibly sequential or recursive string manipulation, or would I need to push all the results into a separate "de-duplicating" process? The real database might produce 100 different routes that would need to be de-duplicated in this way and there could be several duplicated line segments in sequence, such as:
STA_A => Line1 => STA_B => Line1 => STA_C => Line1 => STA_D => Line1 => STA_E => Line2 => STA_F => Line2 => STA_G => Line2 => STA_H => Line2 => STA_I => Line2 => STA_J
would de-duplicate to:
STA_A => Line1 => STA_E => Line2 => STA_J
Thank you for your time (and patience!).
Asked by nicodemus (33 rep)
Jan 7, 2023, 02:52 PM
Last activity: Jan 8, 2023, 02:20 PM