The Question
I had an interview via web-meeting recently. I was given access to SQL Manager and I had to solve a set of TSQL programming problems. The first programming task they throw at me is a question, "What is a correlated subquery?" I said, "I do not know" and moved on. It turns out I knew what a correlates subquery was but I just did not know such queries were called that.This post presents an overview of correlates subqueries in TSQL.
Subqueries
A subquery is just a portion of a query that is evaluated before main part of a query. To understand this, consider the following (from the same job interview):SELECT *
FROM [TableA] AS A
INNER JOIN [TableB] AS B
ON A.[AID] = B.[AID]
INNER JOIN [TableC] AS C
ON A.[AID] = C.[AID]
INNER JOIN [TableD] AS D
ON A.[AID] = D.[AID]
INNER JOIN [TableE] AS E
ON A.[AID] = E.[AID]
WHERE [AName] = 'Smith'
SELECT *
FROM [TableA] AS A
WHERE [AName] = 'Smith'
SELECT *
FROM (
SELECT *
FROM [TableA] AS A
WHERE [AName] = 'Smith') AS A
INNER JOIN [TableB] AS B
ON A.[AID] = B.[AID]
INNER JOIN [TableC] AS C
ON A.[AID] = C.[AID]
INNER JOIN [TableE] AS E
ON A.[AID] = E.[AID]
INNER JOIN [TableE] AS E
ON A.[AID] = E.[AID]
Correlated Subqueries
A correlated subquery is a subquery that depends on values from the outer query. Technet covers correlated subqueries here Correlated Subqueries. To demonstrate correlated queries, consider a database with tables:
- Player: this table holds a list of soccer players (football players for you non-Americans)
- Goal: this table holds a list of goals scored by players
To demonstrate correlated subqueries, a query will be developed that shows all players who have not scored any goals:
SELECT P.[PlayerID], P.[PlayerName]
FROM [Player] AS P
WHERE NOT EXISTS (
SELECT *
FROM [Goal] AS G
WHERE P.[PlayerID] = G.[PlayerID])
So yes, I knew how to write a correlated subquery. I just did not know the definition.
Implementing the query without using a Correlated Subquery
A LEFT JOIN in a subquery can be used as follows to implement the same query -- all players who have not scored a single goal:
SELECT [PlayerID], [PlayerName]
FROM (
SELECT P.[PlayerID], P.[PlayerName], G.[GoalID]
FROM [Player] AS P
LEFT JOIN [Goal] AS G
ON P.[PlayerID] = G.[PlayerID]) AS P
WHERE P.[GoalID] IS NULL
Conclusion
Yes, I moved onto the second interview.
Hi Jan, in my opinion the answer to the slow running query is to index the IDs and the Name helping the query optimizer do its job.
ReplyDelete