Monday, March 9, 2009

Writing recursive queries in sql server 2000/2005

In SQL Server 2000 there is no simple way to create recursive queries that have several levels of data (hierarchical data). Generally a recursive query is needed when you have a parent and child data stored in the same table. One example may be employee data where all employee data is stored in one Employee table and there is an indicator that specifies the employees supervisor that points back to another record in the same table. This is done quite often for other data as well.

To query this data to find the employee and supervisor is pretty easy. The query would be:



SELECT e1.EmpId, e1.FirstName, e1.LastName,
e1.SupervisorID, e2.FirstName AS SupFirstName, e2.LastName
AS SupLastNameFROM Employee e1 LEFT JOIN Employee e2
ON e1.SupervisorID = e2.EmpId

So you can see this is pretty simple to just get this level of data, but what if you want to find out all of the direct and indirect reports of a particular employee.

C programmers use recursive programming techniques for traversing tree structures. The same can be implemented in T-SQL using recursive stored procedure calls.

Consider the employee table of an organization, that stores all the employee records. Each employee is linked to his/her manager by a manger ID. This table can be represented using the following CREATE TABLE statement:



CREATE TABLE dbo.Emp
(
EmpID int PRIMARY KEY,
EmpName varchar(30),
MgrID int FOREIGN KEY REFERENCES Emp(EmpID)
)

Notice that, EmpID is decalred as a primary key, and the MgrID column is declared as a foreign key constraint, that references the EmpID column of the same table, that is, a self referencing table. This is so, because all employees and managers are stored in the same table.

Now what all we need is a stored procedure that calls itself recursively. You should be aware of a limitation imposed by SQL Server though. A stored procedure can nest itself upto a maximum of 32 levels. If you exceed this limit, you will receive the following error:
Server: Msg 217, Level 16, State 1, Procedure , Line 1 Maximum stored procedure nesting level exceeded (limit 32).

The SQL server 2005 microsoft come up with Common Table Expressions for writing Recursive queries.

A common table expression (CTE) provides the significant advantage of being able to reference itself, thereby creating a recursive CTE. A recursive CTE is one in which an initial CTE is repeatedly executed to return subsets of data until the complete result set is obtained.


A query is referred to as a recursive query when it references a recursive CTE. Returning hierarchical data is a common use of recursive queries, for example: Displaying employees in an organizational chart, or data in a bill of materials scenario in which a parent product has one or more components and those components may, in turn, have subcomponents or may be components of other parents.


A recursive CTE can greatly simplify the code required to run a recursive query within a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. In earlier versions of SQL Server, a recursive query usually requires using temporary tables, cursors, and logic to control the flow of the recursive steps.

recursive CTE in Transact-SQL is similar to recursive routines in other programming languages. Although a recursive routine in other languages returns a scalar value, a recursive CTE can return multiple rows.


A recursive CTE consists of three elements:
1. Invocation of the routine.

The first invocation of the recursive CTE consists of one or more CTE_query_definitions joined by UNION ALL, UNION, EXCEPT, or INTERSECT operators. Because these query definitions form the base result set of the CTE structure, they are referred to as anchor members.CTE_query_definitions are considered anchor members unless they reference the CTE itself. All anchor-member query definitions must be positioned before the first recursive member definition, and a UNION ALL operator must be used to join the last anchor member with the first recursive member.
2. Recursive invocation of the routine.

The recursive invocation includes one or more CTE_query_definitions joined by UNION ALL operators that reference the CTE itself. These query definitions are referred to as recursive members.
3. Termination check.

The termination check is implicit; recursion stops when no rows are returned from the previous invocation.

The recursive CTE structure must contain at least one anchor member and one recursive member. The following pseudocode shows the components of a simple recursive CTE that contains a single anchor member and single recursive member.


WITH cte_name ( column_name [,...n] )
AS
(
CTE_query_definition –- Anchor member is defined.
UNION ALL
CTE_query_definition –- Recursive member is defined referencing cte_name.
)
-- Statement using the CTE
SELECT * FROM cte_name

Example:

The following example shows the semantics of the recursive CTE structure by returning a hierarchical list of employees, starting with the highest ranking employee, in the company. The statement that executes the CTE limits the result set to employees in the Research and Development Group.



WITH DirectReports (ManagerID, EmployeeID, Title, DeptID, Level)
AS
(-- Anchor member definition
SELECT e.ManagerID, e.EmployeeID, e.Title, edh.DepartmentID,
0 AS Level FROM HumanResources.Employee AS e INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
ON e.EmployeeID = edh.EmployeeID AND edh.EndDate IS NULL WHERE ManagerID IS NULL
UNION ALL
-- Recursive member definition
SELECT e.ManagerID, e.EmployeeID, e.Title, edh.DepartmentID, Level + 1 FROM HumanResources.Employee AS e
INNER JOIN HumanResources.EmployeeDepartmentHistory AS edh
ON e.EmployeeID = edh.EmployeeID AND edh.EndDate IS NULL INNER JOIN DirectReports AS d
ON e.ManagerID = d.EmployeeID)
-- Statement that executes the CTE
SELECT ManagerID, EmployeeID, Title, LevelFROM DirectReportsINNER JOIN HumanResources.Department AS dp ON DirectReports.DeptID = dp.DepartmentID
WHERE dp.GroupName = N'Research and Development' OR Level = 0;