Select COUNT(*) of subquery without running it twice
I’ve got a procedure to return a result set which is limited by page number and some other stuff. As an OUTPUT parameter I need to return a total amount of selected rows according to the parameters except the page number. So I have something like that:
WITH SelectedItems AS
(SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position
WHERE Row2 = @Row2)
SELECT Id, Row1, Row2
WHERE Position BETWEEN @From AND @To
And then I need to set the OUTPUT parameter to the number of rows in the innerquery. I can just copy the query and count it, but this query could returns thousands of rows (and will be more in the future), so I am looking for method to do that with a good performance. I was thinking about table variables, is it a good idea? Or any other suggestions?
To be more specific, it’s the Microsoft SQL Server 2008.
Thank you, Jan
You can count the total rows as a separate column in your main query using COUNT(*). Like this:
WITH SelectedItems AS (SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position, COUNT(*) OVER () AS TotalRows FROM Items WHERE Row2 = @Row2) SELECT Id, Row1, Row2 FROM SelectedItems WHERE Position BETWEEN @From AND @To
This will return the count in your result set rather than in a output parameter, but that should fit your requirements. Otherwise, combine with a temp table:
DECLARE @tmp TABLE (Id int, RowNum int, TotalRows int); WITH SelectedItems AS (SELECT Id, Row1, Row2, ROW_NUMBER() OVER (ORDER BY Row1) AS Position, COUNT(*) OVER () AS TotalRows FROM Items WHERE Row2 = @Row2) INSERT @tmp SELECT Id, Row1, Row2 FROM SelectedItems WHERE Position BETWEEN @From AND @To SELECT TOP 1 @TotalRows = TotalRows FROM @tmp SELECT * FROM @tmp
You will find using a temp table for just your paged result will not use much memory (depending on your page size of course) and you’re only keeping it live for a short period of time. Selecting the full result set from the temp table and selecting the TotalRows will only take a tiny bit longer.
This will be much faster than running a totally separate query, which in my test (repeating the WITH) doubled the execution time.
I think you should do it in a separate query. While those two queries might look pretty much the same, but the way query optimizer deals with them would differ pretty significantly.
Theoretically, SQL Server might not even go through all the rows in the subquery to be able to count it.
I don’t have access to my code base right now, but I believe that you can use COUNT() OVER (or a similar command) to return the total number of rows as part of the subquery. You can then return that as part of the final result set. It gets duplicated in every row, but that’s a minor performance hit in my opinion for an application that is using paging and should have limited final results anyway.
In a couple hours I’ll post the exact code.
EDIT: Here’s the line that I’ve used to generate the count. In the end our developers wanted a separate method to get the count by itself, so now I’m maintaining the search criteria in two places within the same stored procedure.
COUNT(*) OVER (PARTITION BY '') AS TotalCount
Add that to your CTE and then you can select the TotalCount and it will be a column in each of your rows.
You MUST run the entire query, without limiting the range, at least once in order to get the full row count. Since you’re going to do this anyway, you should select @@RowCount in order to output the total rows found, rather than overloading your data reader with a redundant count(*) column in every row.
1. When running NEW query for first time:
select YOUR_COLUMNS from YOUR_TABLE where YOUR_SEARCH_CONDITION order by YOUR_COLUMN_ORDERING_LIST; select @@rowcount;
2. Only READ the first X rows
The above query avoids flooding your SqlDataReader with a redundant COUNT(*) column, that would otherwise be sent for every call to SqlDataReader.Read(). Since you are running the query for the first time… instead of selecting a range, just READ only the first X rows. This gives you exactly what you want… the full result count, the first X records, and efficient streaming of the result set without the redundant count column.
3. For subsequent runs of the SAME query to get a subset of the results
select YOUR_COLUMNS from (select YOUR_COLUMNS, ROW_NUMBER() over(order by BY YOUR_COLUMN_ORDERING_LIST) as RowNum) Results where Results.RowNum between @From and @To;
In any case,
@@rowcount is the most direct way to access the count on the first run of the query without limiting the result set (ur gonna want first X results anyway), without running a separate count() query, without using a temp table, and without including a redundant count() column.
Couldn’t you just set the output variable to @@RowCount? This will get the rows affected by the last executed statement:
SELECT stuff FROM mytable SET @output = @@ROWCOUNT
This should give you what you need, and doesn’t involve running the query again.