Recursive sum in tree structure
I have a tree structure in a single table. The table is a tree of categories that can be nested endlessly. Each category has a ProductCount column that tells how many products are directly in the category (not summing child categories).
Id | ParentId | Name | ProductCount
1 | -1 | Cars | 0
2 | -1 | Bikes | 1
3 | 1 | Ford | 10
4 | 3 | Mustang | 7
5 | 3 | Focus | 4
I would like to write a SQL query that, for each row/category, gives me the number of products including the ones in the child categories.
The output for the table above should be
Id | ParentId | Name | ProductCount | ProductCountIncludingChildren
1 | -1 | Cars | 0 | 21
2 | -1 | Bikes | 1 | 1
3 | 1 | Ford | 10 | 21
4 | 3 | Mustang | 7 | 7
5 | 3 | Focus | 4 | 4
I know I probably should use a CTE, but I can’t quite get it working the way it should.
Any help is appreciated!
You can use a recursive CTE where the anchor part gets all rows and the recursive part joins to get the child rows. Remember the original RootID from the anchor part and do the sum aggregate in the main query, grouped by RootID.
MS SQL Server 2012 Schema Setup:
create table T
(
  Id int primary key,
  ParentId int,
  Name varchar(10),
  ProductCount int
);

insert into T values
(1, -1, 'Cars', 0),
(2, -1, 'Bikes', 1),
(3, 1, 'Ford', 10),
(4, 3, 'Mustang', 7),
(5, 3, 'Focus', 4);

create index IX_T_ParentID on T(ParentID) include(ProductCount, Id);
with C as
(
  select T.Id,
         T.ProductCount,
         T.Id as RootID
  from T
  union all
  select T.Id,
         T.ProductCount,
         C.RootID
  from T
    inner join C
      on T.ParentId = C.Id
)
select T.Id,
       T.ParentId,
       T.Name,
       T.ProductCount,
       S.ProductCountIncludingChildren
from T
  inner join (
             select RootID,
                    sum(ProductCount) as ProductCountIncludingChildren
             from C
             group by RootID
             ) as S
    on T.Id = S.RootID
order by T.Id
option (maxrecursion 0)
| ID | PARENTID | NAME | PRODUCTCOUNT | PRODUCTCOUNTINCLUDINGCHILDREN |
|----|----------|---------|--------------|-------------------------------|
| 1 | -1 | Cars | 0 | 21 |
| 2 | -1 | Bikes | 1 | 1 |
| 3 | 1 | Ford | 10 | 21 |
| 4 | 3 | Mustang | 7 | 7 |
| 5 | 3 | Focus | 4 | 4 |
This is the same concept as Tom’s answer, but less code (and way faster).
-- Build a '/'-delimited path (Node) from the root down to each row.
-- Note: cast(... as varchar) without a length defaults to 30 characters,
-- so paths in very deep trees would be truncated.
with cte as
(
  select v.Id, v.ParentId, v.Name, v.ProductCount,
         cast('/' + cast(v.Id as varchar) + '/' as varchar) Node
  from Vehicle v
  where ParentId = -1
  union all
  select v.Id, v.ParentId, v.Name, v.ProductCount,
         cast(c.Node + CAST(v.Id as varchar) + '/' as varchar)
  from Vehicle v
  join cte c on v.ParentId = c.Id
)
-- A row c2 is a descendant of c1 when c2's path starts with c1's path.
select c1.Id, c1.ParentId, c1.Name, c1.ProductCount,
       c1.ProductCount + SUM(isnull(c2.ProductCount, 0)) ProductCountIncludingChildren
from cte c1
left outer join cte c2
  on c1.Node <> c2.Node
 and left(c2.Node, LEN(c1.Node)) = c1.Node
group by c1.Id, c1.ParentId, c1.Name, c1.ProductCount
order by c1.Id
SQL Fiddle (I added some extra data rows for testing)
Actually, this could be a good use of HIERARCHYID in SQL Server.
CREATE TABLE [dbo].[CategoryTree]
(
    [Id] INT,
    [ParentId] INT,
    [Name] VARCHAR(100),
    [ProductCount] INT
)
GO

INSERT [dbo].[CategoryTree]
VALUES
    (1, -1, 'Cars', 0),
    (2, -1, 'Bikes', 1),
    (3, 1, 'Ford', 10),
    (4, 3, 'Mustang', 7),
    (5, 3, 'Focus', 4)
    --,(6, 1, 'BMW', 100)
GO
WITH [cteRN] AS (
    SELECT *,
        ROW_NUMBER() OVER (
            PARTITION BY [ParentId] ORDER BY [ParentId]) AS [ROW_NUMBER]
    FROM [dbo].[CategoryTree]
),
[cteHierarchy] AS (
    SELECT CAST(
            CAST(hierarchyid::GetRoot() AS VARCHAR(100))
            + CAST([ROW_NUMBER] AS VARCHAR(100))
            + '/' AS HIERARCHYID
        ) AS [Node],
        *
    FROM [cteRN]
    WHERE [ParentId] = -1
    UNION ALL
    SELECT CAST(
            hierarchy.Node.ToString()
            + CAST(RN.[ROW_NUMBER] AS VARCHAR(100)
            ) + '/' AS HIERARCHYID),
        rn.*
    FROM [cteRN] rn
    INNER JOIN [cteHierarchy] hierarchy
        ON rn.[ParentId] = hierarchy.[Id]
)
SELECT x.[Node].ToString() AS [Node],
    x.[Id], x.[ParentId], x.[Name], x.[ProductCount],
    x.[ProductCount] + SUM(ISNULL(child.[ProductCount],0))
        AS [ProductCountIncludingChildren]
FROM [cteHierarchy] x
LEFT JOIN [cteHierarchy] child
    ON child.[Node].IsDescendantOf(x.[Node]) = 1
    AND child.[Node] <> x.[Node]
GROUP BY x.[Node], x.[Id], x.[ParentId], x.[Name], x.[ProductCount]
ORDER BY x.[Id]
I couldn’t come up with a good set-based T-SQL answer, but I did come up with an answer:
The temp table mimics your table structure. The table variable is a work table.
--Initial table
CREATE TABLE #products (Id INT, ParentId INT, NAME VARCHAR(255), ProductCount INT)
INSERT INTO #products
        ( ID, ParentId, NAME, ProductCount )
VALUES  ( 1,-1,'Cars',0),(2,-1,'Bikes',1),(3,1,'Ford',10),(4,3,'Mustang',7),(5,3,'Focus',4)

--Work table
DECLARE @products TABLE (ID INT, ParentId INT, NAME VARCHAR(255), ProductCount INT, ProductCountIncludingChildren INT)
INSERT INTO @products
        ( ID ,
          ParentId ,
          NAME ,
          ProductCount ,
          ProductCountIncludingChildren
        )
SELECT  Id ,
        ParentId ,
        NAME ,
        ProductCount,
        0
FROM    #products

DECLARE @i INT
SELECT @i = MAX(id) FROM @products

--Stupid loop - loops suck
--Note: walking Ids from highest to lowest assumes every child has a higher Id than its parent
WHILE @i > 0
    BEGIN
        WITH cte AS (SELECT ParentId, SUM(ProductCountIncludingChildren) AS ProductCountIncludingChildren FROM @products GROUP BY ParentId)
        UPDATE p1
        SET p1.ProductCountIncludingChildren = p1.ProductCount + isnull(p2.ProductCountIncludingChildren,0)
        FROM @products p1
        LEFT OUTER JOIN cte p2 ON p1.ID = p2.ParentId
        WHERE p1.ID = @i

        SELECT @i = @i - 1
    END

SELECT *
FROM @products

DROP TABLE #products
I’d be very interested to see a better, set-based approach. The problem I ran into is that when you use recursive CTEs, you start with the parent and work toward the children, which doesn’t really work for getting a sum at the parent levels. You’d have to do some kind of backward recursive CTE.
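For what it’s worth, a "backward" recursive CTE of the kind described above can be sketched like this. It is only a sketch, not taken from any of the answers: it assumes the table T (Id, ParentId, Name, ProductCount) from the schema setup earlier in the thread, and the CTE name Up and the column names AncestorId and NextParentId are made up for illustration. Each row starts by crediting its own count to itself, then hands that same count to its parent, its grandparent, and so on until no parent row is found (ParentId = -1).

```sql
-- Hypothetical names: Up, AncestorId, NextParentId.
-- Assumes table T(Id, ParentId, Name, ProductCount) with -1 marking a root.
with Up as
(
  -- Anchor: every category contributes its own count to itself.
  select T.Id as AncestorId,
         T.ParentId as NextParentId,
         T.ProductCount
  from T
  union all
  -- Recursive step: pass the same count one level up to the parent.
  select P.Id,
         P.ParentId,
         Up.ProductCount
  from Up
    inner join T as P
      on P.Id = Up.NextParentId
)
select T.Id,
       T.ParentId,
       T.Name,
       T.ProductCount,
       sum(Up.ProductCount) as ProductCountIncludingChildren
from T
  inner join Up
    on Up.AncestorId = T.Id
group by T.Id, T.ParentId, T.Name, T.ProductCount
order by T.Id
option (maxrecursion 0);
```

With the sample data, Mustang’s 7 is credited to Mustang, Ford and Cars, and Focus’s 4 likewise, so Cars and Ford both end up at 21. The CTE produces one row per (node, ancestor) pair, so the intermediate result grows with the depth of the tree.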
This won’t be optimal, but it works. It involves two CTEs: one main CTE and one in a table-valued function that sums up the values for each subtree.
The first CTE
;WITH cte
AS
(
    SELECT anchor.Id, anchor.ParentId, anchor.Name, anchor.ProductCount,
           s.Total AS ProductCountIncludingChildren
    FROM testTable anchor
    CROSS APPLY SumChild(anchor.id) s
    WHERE anchor.parentid = -1
    UNION ALL
    SELECT child.Id, child.ParentId, child.Name, child.ProductCount,
           s.Total AS ProductCountIncludingChildren
    FROM cte
    INNER JOIN testTable child on child.parentid = cte.id
    CROSS APPLY SumChild(child.id) s
)
SELECT * from cte
And the function
CREATE FUNCTION SumChild
(
    @id int
)
RETURNS TABLE
AS
RETURN
(
    WITH cte
    AS
    (
        SELECT anchor.Id, anchor.ParentId, anchor.ProductCount
        FROM testTable anchor
        WHERE anchor.id = @id
        UNION ALL
        SELECT child.Id, child.ParentId, child.ProductCount
        FROM cte
        INNER JOIN testTable child on child.parentid = cte.id
    )
    SELECT SUM(ProductCount) AS Total from CTE
)
GO
Which results in:
from the source table
Apologies about formatting.