Recursive sum in tree structure

Recursive sum in tree structure

I have a tree struture in a single table. The table is a tree of categories that can be nested endlessly. Each category has a ProductCount column that tells how many products are directly in the category (not summing child categories).
Id | ParentId | Name | ProductCount
————————————
1 | -1 | Cars | 0
2 | -1 | Bikes | 1
3 | 1 | Ford | 10
4 | 3 | Mustang | 7
5 | 3 | Focus | 4

I would like to make a sql query that for each row/category gives me the number of products including the ones in the child categories.
The output for the table above should be
Id | ParentId | Name | ProductCount | ProductCountIncludingChildren
————————————————————————–
1 | -1 | Cars | 0 | 21
2 | -1 | Bikes | 1 | 1
3 | 1 | Ford | 10 | 21
4 | 3 | Mustang | 7 | 7
5 | 3 | Focus | 4 | 4

I know I probably should use CTE, but cant quite get it working the way it should.
Any help is appreciated!

Solution 1:

You can use a recursive CTE where you in the anchor part get all rows and in the recursive part join to get the child rows. Remember the original `Id` aliased `RootID` from the anchor part and do sum aggregate in the main query grouped by `RootID`.

SQL Fiddle

MS SQL Server 2012 Schema Setup:

``````create table T
(
Id int primary key,
ParentId int,
Name varchar(10),
ProductCount int
);

insert into T values
(1, -1, 'Cars',    0),
(2, -1, 'Bikes',   1),
(3,  1, 'Ford',    10),
(4,  3, 'Mustang', 7),
(5,  3, 'Focus',   4);

create index IX_T_ParentID on T(ParentID) include(ProductCount, Id);
``````

Query 1:

``````with C as
(
select T.Id,
T.ProductCount,
T.Id as RootID
from T
union all
select T.Id,
T.ProductCount,
C.RootID
from T
inner join C
on T.ParentId = C.Id
)
select T.Id,
T.ParentId,
T.Name,
T.ProductCount,
S.ProductCountIncludingChildren
from T
inner join (
select RootID,
sum(ProductCount) as ProductCountIncludingChildren
from C
group by RootID
) as S
on T.Id = S.RootID
order by T.Id
option (maxrecursion 0)
``````
``````| ID | PARENTID |    NAME | PRODUCTCOUNT | PRODUCTCOUNTINCLUDINGCHILDREN |
|----|----------|---------|--------------|-------------------------------|
|  1 |       -1 |    Cars |            0 |                            21 |
|  2 |       -1 |   Bikes |            1 |                             1 |
|  3 |        1 |    Ford |           10 |                            21 |
|  4 |        3 | Mustang |            7 |                             7 |
|  5 |        3 |   Focus |            4 |                             4 |
``````

Solution 2:

This is the same concept as Tom’s answer, but less code (and way faster).

``````with cte as
(
select v.Id, v.ParentId, v.Name, v.ProductCount,
cast('/' + cast(v.Id as varchar) + '/' as varchar) Node
from Vehicle v
where ParentId = -1
union all
select v.Id, v.ParentId, v.Name, v.ProductCount,
cast(c.Node + CAST(v.Id as varchar) + '/' as varchar)
from Vehicle v
join cte c on v.ParentId = c.Id
)

select c1.Id, c1.ParentId, c1.Name, c1.ProductCount,
c1.ProductCount + SUM(isnull(c2.ProductCount, 0)) ProductCountIncludingChildren
from cte c1
left outer join cte c2 on c1.Node <> c2.Node and left(c2.Node, LEN(c1.Node)) = c1.Node
group by c1.Id, c1.ParentId, c1.Name, c1.ProductCount
order by c1.Id
``````

SQL Fiddle (I added some extra data rows for testing)

Solution 3:

Actually this could be a good use of HIERARCHYID in SQL Server..

``````CREATE TABLE [dbo].[CategoryTree]
(
[Id] INT,
[ParentId] INT,
[Name] VARCHAR(100),
[ProductCount] INT
)
GO

INSERT [dbo].[CategoryTree]
VALUES
(1, -1, 'Cars', 0),
(2, -1, 'Bikes', 1),
(3, 1, 'Ford', 10),
(4, 3, 'Mustang', 7),
(5, 3, 'Focus', 4)
--,(6, 1, 'BMW', 100)
GO
``````

Query

``````WITH [cteRN] AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY [ParentId] ORDER BY [ParentId]) AS [ROW_NUMBER]
FROM  [dbo].[CategoryTree]
),
[cteHierarchy] AS (
SELECT CAST(
CAST(hierarchyid::GetRoot() AS VARCHAR(100))
+ CAST([ROW_NUMBER] AS VARCHAR(100))
+ '/' AS HIERARCHYID
) AS [Node],
*
FROM [cteRN]
WHERE [ParentId] = -1
UNION ALL
SELECT CAST(
hierarchy.Node.ToString()
+ CAST(RN.[ROW_NUMBER] AS VARCHAR(100)
) + '/' AS HIERARCHYID),
rn.*
FROM [cteRN] rn
INNER JOIN [cteHierarchy] hierarchy
ON rn.[ParentId] = hierarchy.[Id]
)
SELECT x.[Node].ToString() AS [Node],
x.[Id], x.[ParentId], x.[Name], x.[ProductCount],
x.[ProductCount] + SUM(ISNULL(child.[ProductCount],0))
AS [ProductCountIncludingChildren]
FROM [cteHierarchy] x
LEFT JOIN [cteHierarchy] child
ON child.[Node].IsDescendantOf(x.[Node]) = 1
AND child.[Node] <> x.[Node]
GROUP BY x.[Node], x.[Id], x.[ParentId], x.[Name], x.[ProductCount]
ORDER BY x.[Id]
``````

Result

Solution 4:

I couldn’t come up with a good T-SQL, set based answer, but I did come up with an answer:
The temp table mimics your table structure. The table variable is a work table.

``````--Initial table
CREATE TABLE #products (Id INT, ParentId INT, NAME VARCHAR(255), ProductCount INT)
INSERT INTO #products
( ID,ParentId, NAME, ProductCount )
VALUES  ( 1,-1,'Cars',0),(2,-1,'Bikes',1),(3,1,'Ford',10),(4,3,'Mustang',7),(5,3,'Focus',4)

--Work table
DECLARE @products TABLE (ID INT, ParentId INT, NAME VARCHAR(255), ProductCount INT, ProductCountIncludingChildren INT)
INSERT INTO @products
( ID ,
ParentId ,
NAME ,
ProductCount ,
ProductCountIncludingChildren
)
SELECT  Id ,
ParentId ,
NAME ,
ProductCount,
0
FROM #products

DECLARE @i INT
SELECT @i = MAX(id) FROM @products

--Stupid loop - loops suck
WHILE @i > 0
BEGIN
WITH cte AS (SELECT ParentId, SUM(ProductCountIncludingChildren) AS ProductCountIncludingChildren FROM @products GROUP BY ParentId)
UPDATE p1
SET p1.ProductCountIncludingChildren = p1.ProductCount + isnull(p2.ProductCountIncludingChildren,0)
FROM @products p1
LEFT OUTER JOIN cte p2 ON p1.ID = p2.ParentId
WHERE p1.ID = @i

SELECT @i = @i - 1
END

SELECT *
FROM @products

DROP TABLE #products
``````

I’d be very interested to see a better, set based approach. The problem I ran into is that when you use recursive cte’s, you start with the parent and work toward the children – this doesn’t really work for getting a sum at the parent levels. You’d have to do some kind of backward recursive cte.

Solution 5:

This wont be optimal but it works, however it involves 2 CTEs. 1 main CTE and a CTE in a table valued function to sum up the values for each sub tree.

The first CTE

``````;WITH cte
AS
(
SELECT
anchor.Id,
anchor.ParentId,
anchor.Name,
anchor.ProductCount,
s.Total AS ProductCountIncludingChildren
FROM
testTable anchor
CROSS APPLY SumChild(anchor.id) s
WHERE anchor.parentid = -1
UNION ALL
SELECT
child.Id,
child.ParentId,
child.Name,
child.ProductCount,
s.Total AS ProductCountIncludingChildren
FROM
cte
INNER JOIN testTable child on child.parentid = cte.id
CROSS APPLY SumChild(child.id) s
)
SELECT * from cte
``````

AND the function

``````CREATE FUNCTION SumChild
(
@id int

)
RETURNS TABLE
AS
RETURN
(
WITH cte
AS
(
SELECT
anchor.Id,
anchor.ParentId,
anchor.ProductCount
FROM
testTable anchor
WHERE anchor.id = @id
UNION ALL
SELECT
child.Id,
child.ParentId,
child.ProductCount
FROM
cte
INNER JOIN testTable child on child.parentid = cte.id
)
SELECT SUM(ProductCount) AS Total from CTE
)
GO
``````

Which results in:

from the source table