Sometimes we just want to spit the rows in a table by the columns values because that is what SQL is for. This is no easy task depending on your data structure and lead to some complex queries. So here is the road to how to separate the columns data into each row that I followed.
Let’s start with the example table to show how the table would be set, so that you know where this is all coming from. At minimal we need the primary ID that defines the unique row and the column that will be split into multiple rows, for example.
ID | Firstname | Lastname | Hobbies |
1 | Chris | Eagle | Cycling,Running,Music |
2 | Lucy | Whitehead | Music,Theatre,Holiday |
3 | Foo | Bar | Running,Holiday,Cycling |
Like with any SQL function we start with the creation and the entry of the required variables. For this we need the identifier for referencing each row, a delimiter for how each value in the string is connected and finally the column.
CREATE FUNCTION [dbo].[SplitRowByColumn]
(
)
RETURNS @returntable TABLE
(
id int,
csvColumn varchar(max),
delimiter varchar(1)
)
To make things a little lighter we are going to put the required items into a temporary table. This way when we are referencing and processing the table, we are only calling locally and not on the full dataset. The other reason you will see later, is it is used to manage what rows have been processed. As you will see I am only using 1 field ‘csvColumn’, but you can branch these out to have extra columns if required.
— DECLARE TEMP DBs
declare @tempItemsTable TABLE (id int,
csvColumn varchar(max));
— GET ALL VALUES
INSERT @tempItemsTable
(id,csvColumn)
SELECT id, csvColumn
from [dbo].[SourceTable]
This is where the fun now happens, as we split each row by its column and piece it back together. In the loop below I use the split function found on the great Stack Overflow to separate the column for reference. https://stackoverflow.com/questions/13527537/sql-query-to-split-column-data-into-rows
In the loop we go through each item in the ‘tempItemsTable’ that we just constructed with all the rows we wish to split. Then for each row we can get the values out and split the column, while at the same time we insert each split item into a row of the returning table.
— LOOP ITEMS
WHILE (SELECT COUNT(id) from @tempItemsTable) > 0
BEGIN
DECLARE @id int
DECLARE @currentCsvColumn varchar(max)
— GET VALUES
SELECT TOP 1 @id=id, @currentCsvColumn=csvColumn FROM @tempItemsTable
— INSERT VALUES INTO TEMP DB
INSERT INTO @returntable
(id, csvColumn)
SELECT @id, splitTable.items
FROM
[Current_Database].[dbo].[Split] (@currentCsvColumn, @delimiter) AS splitTable
— REMOVE 1 STEP
DELETE @tempItemsTable WHERE id = (SELECT TOP 1 id FROM @tempItemsTable)
END
At the end of the loop we then remove the first item, so that the loop count will go down and we move onto the next row.
Finally on the output we should have the below example table:
ID | Hobbies |
1 | Cycling |
1 | Running |
1 | Music |
2 | Music |
2 | Theatre |
2 | Holiday |
3 | Running |
3 | Holiday |
3 | Cycling |
With this table you can now link the returned table to the original table
SELECT
p.[ID], p.[First Name], p.[Last Name,], sr.[csvColumn]
FROM
[dbo].[SourceTable] AS p
INNER JOIN
[dbo].[SplitRowByColumn]() as sr ON sr id = p.id
Bonus Round
In the circumstance that we have two columns to split, there is a slight alteration that we have to make for the linking of the two split columns together. We required the Split function, from Stack Overflow, to return what index level it is currently at, so we can match it with the another column we are splitting. The full example of the new Split function is below and here is also how we then link them into the same returning table.
— INSERT VALUES INTO TEMP DB
INSERT INTO @returntable
(id, csvColumn1, csvColumn2)
SELECT @id, splitTable1.items, splitTable2.items
FROM
[Current_Database].[dbo].[Split] (@currentCsvColumn1, @delimiter) AS splitTable1
INNER JOIN
[Current_Database].[dbo].[Split] (@currentCsvColumn2, @delimiter) AS splitTable2
ON splitTable1.IndexNum = splitTable2.IndexNum
Full Examples
SplitRowByColumn Function
CREATE FUNCTION [dbo].[SplitRowByColumn]
(
)
RETURNS @returntable TABLE
(
id int,
csvColumn varchar(max),
delimiter varchar(1)
)
AS
BEGIN
— DECLARE TEMP DBs
declare @tempItemsTable TABLE (id int,
csvColumn varchar(max));
— GET ALL VALUES
INSERT @tempItemsTable
(id,csvColumn)
SELECT id, csvColumn
from [dbo].[SourceTable]
— LOOP ITEMS
WHILE (SELECT COUNT(id)
from @tempItemsTable) > 0
BEGIN
DECLARE @id int
DECLARE @currentCsvColumn varchar(max)
— GET VALUES
SELECT TOP 1 @id=id, @currentCsvColumn=csvColumn FROM @tempItemsTable
— INSERT VALUES INTO TEMP DB
INSERT INTO @returntable
(id, csvColumn)
SELECT @id, splitTable.items
FROM
[Current_Database].[dbo].[Split] (@currentCsvColumn, @delimiter) AS splitTable
— REMOVE 1 STEP
DELETE @tempItemsTable WHERE id = (SELECT TOP 1 id FROM @tempItemsTable)
END
RETURN
END
Split Function
CREATE FUNCTION [dbo].[Split]
(@String nvarchar(4000), @Delimiter char(1))
RETURNS @Results TABLE (Items nvarchar(4000),
IndexNum int)
AS
BEGIN
DECLARE @COUNTER INT
SET @COUNTER = 0
DECLARE @INDEX INT
DECLARE @SLICE nvarchar(4000)
— HAVE TO SET TO 1 SO IT DOESNT EQUAL Z
— ERO FIRST TIME IN LOOP
SELECT @INDEX = 1
WHILE @INDEX !=0
BEGIN
— GET THE INDEX OF THE FIRST OCCURENCE OF THE SPLIT CHARACTER
SELECT @INDEX = CHARINDEX(@Delimiter,@STRING)
— NOW PUSH EVERYTHING TO THE LEFT OF IT INTO THE SLICE VARIABLE
IF @INDEX !=0
SELECT @SLICE = LEFT(@STRING,@INDEX – 1)
ELSE
SELECT @SLICE = @STRING
— PUT THE ITEM INTO THE RESULTS SET
INSERT INTO @Results
(Items, IndexNum)
VALUES(@SLICE, @COUNTER)
— INCREMENT COUNTER
SET @COUNTER = @COUNTER + 1
— CHOP THE ITEM REMOVED OFF THE MAIN STRING
SELECT @STRING = RIGHT(@STRING,LEN(@STRING) – @INDEX)
— BREAK OUT IF WE ARE DONE
IF LEN(@STRING) = 0 BREAK
END
RETURN
END