Read BETWEEN The Dates
Finding records that exist between two dates can sometimes be slow. This is especially true if temporary tables and table variables are used. I did some tinkering and testing of the different ways that you can get the same records but with faster results. The end result I shall detail here.
Build Test Data
The first step is to build some test data. This will make following along a lot easier if you can just copy this into your SQL Console and experiment. I have defined 2 test tables: @ProjectList and @WorkTable. @ProjectList will define all of the major steps of an imaginary work project. The main reason for this table is to be able to have a start and end date range for building the initial date list. @WorkTable will define the individual jobs performed within this work project. Each job listed will have its own start and stop date. This can be considered as individual employees on the work site. As different project steps are completed, some employees will stop working and new employees will start work. The tables are simple and contain a small amount of records but should still be enough to test the script.
/*
Test Project table definition and data load.
- /
DECLARE @ProjectList table
(
Proj_ID int,
Proj_Name varchar(60),
Proj_Start datetime,
Proj_End datetime
)
INSERT INTO @ProjectList
SELECT
1 AS Proj_ID,
'Discuss Building' AS Proj_Name,
'1997-01-15' AS Proj_Start,
'1997-07-20' AS Proj_End
UNION
SELECT
2 AS Proj_ID,
'Draw Building Plan' AS Proj_Name,
'1997-07-21' AS Proj_Start,
'1998-09-15' AS Proj_End
UNION
SELECT
3 AS Proj_ID,
'Argue About Fees' AS Proj_Name,
'1998-03-01' AS Proj_Start,
'1999-10-20' AS Proj_End
UNION
SELECT
4 AS Proj_ID,
'Groundbreaking' AS Proj_Name,
'1998-11-01' AS Proj_Start,
'1998-11-02' AS Proj_End
UNION
SELECT
5 AS Proj_ID,
'Construction' AS Proj_Name,
'1998-11-01' AS Proj_Start,
'2001-05-06' AS Proj_End
UNION
SELECT
6 AS Proj_ID,
'Stock The Supplies' AS Proj_Name,
'2001-01-04' AS Proj_Start,
'2001-07-20' AS Proj_End
UNION
SELECT
7 AS Proj_ID,
'Order The Books' AS Proj_Name,
'2001-06-01' AS Proj_Start,
'2003-06-30' AS Proj_End
UNION
SELECT
8 AS Proj_ID,
'Publisher Problems' AS Proj_Name,
'2001-07-01' AS Proj_Start,
'2004-08-18' AS Proj_End
UNION
SELECT
9 AS Proj_ID,
'Stock The Shelves' AS Proj_Name,
'2002-02-01' AS Proj_Start,
'2005-03-27' AS Proj_End
UNION
SELECT
10 AS Proj_ID,
'Library Is Open' AS Proj_Name,
'2004-11-01' AS Proj_Start,
'2012-12-11' AS Proj_End
/*
Test Work table definition and data load.
- /
DECLARE @WorkTable table
(
Work_ID int,
Work_Start datetime,
Work_End datetime
)
INSERT INTO @WorkTable
SELECT 1, '1997-01-15', '1998-01-14'
UNION SELECT 2, '1997-02-01', '1997-03-01'
UNION SELECT 3, '1997-03-01', '1997-04-01'
UNION SELECT 4, '1997-04-01', '1997-05-01'
UNION SELECT 5, '1997-05-01', '1997-06-01'
UNION SELECT 6, '1997-06-01', '1997-07-01'
UNION SELECT 7, '1997-07-01', '1997-08-01'
UNION SELECT 8, '1997-08-01', '1997-09-01'
UNION SELECT 9, '1997-09-01', '1997-10-01'
UNION SELECT 10, '1997-10-01', '1997-11-01'
UNION SELECT 11, '1997-11-01', '1997-12-01'
UNION SELECT 12, '1997-12-01', '1998-01-01'
UNION SELECT 13, '1998-01-01', '1998-02-01'
UNION SELECT 14, '1998-02-01', '1998-03-01'
UNION SELECT 15, '1998-03-01', '1998-04-01'
UNION SELECT 16, '1998-04-01', '1998-05-01'
UNION SELECT 17, '1998-05-01', '1998-06-01'
UNION SELECT 18, '1998-06-01', '1998-07-01'
UNION SELECT 19, '1998-07-01', '1998-08-01'
UNION SELECT 20, '1998-08-01', '1998-09-01'
UNION SELECT 21, '1998-09-01', '1998-10-01'
UNION SELECT 22, '1998-10-01', '1998-11-01'
UNION SELECT 23, '1998-11-01', '1998-12-01'
UNION SELECT 24, '1998-12-01', '1999-01-01'
UNION SELECT 25, '1999-01-01', '1999-02-01'
UNION SELECT 26, '1999-02-01', '1999-03-01'
UNION SELECT 27, '1999-03-01', '1999-04-01'
UNION SELECT 28, '1999-04-01', '1999-05-01'
UNION SELECT 29, '1999-05-01', '1999-06-01'
UNION SELECT 30, '1999-06-01', '1999-07-01'
UNION SELECT 31, '1999-07-01', '1999-08-01'
UNION SELECT 32, '1999-08-01', '1999-09-01'
UNION SELECT 33, '1999-09-01', '1999-10-01'
UNION SELECT 34, '1999-10-01', '1999-11-01'
UNION SELECT 35, '1999-11-01', '1999-12-01'
UNION SELECT 36, '1999-12-01', '2000-01-01'
UNION SELECT 37, '2000-01-01', '2000-02-01'
UNION SELECT 38, '2000-02-01', '2000-03-01'
UNION SELECT 39, '2000-03-01', '2000-04-01'
UNION SELECT 40, '2000-04-01', '2000-05-01'
UNION SELECT 41, '2000-05-01', '2000-06-01'
UNION SELECT 42, '2000-06-01', '2000-07-01'
UNION SELECT 43, '2000-07-01', '2000-08-01'
UNION SELECT 44, '2000-08-01', '2000-09-01'
UNION SELECT 45, '2000-09-01', '2000-10-01'
UNION SELECT 46, '2000-10-01', '2000-11-01'
UNION SELECT 47, '2000-11-01', '2000-12-01'
UNION SELECT 48, '2000-12-01', '2001-01-01'
UNION SELECT 49, '2001-01-01', '2001-02-01'
UNION SELECT 50, '2001-02-01', '2001-03-01'
UNION SELECT 51, '2001-03-01', '2001-04-01'
UNION SELECT 52, '2001-04-01', '2001-05-01'
UNION SELECT 53, '2001-05-01', '2001-06-01'
Temporary Date List Table
The next part of the experiment is building the date list. This table will be the listing of all valid work days that are between the minimum start date and maximum end date of the @ProjectList table. If the projects to be used are already known, then that can be used to filter the minimum start date and maximum end date to closer values. This will help reduce the size of all tables and offer one area of speeding up the script.
A recursive CTE is used to build the date list. This does offer the problem that a CTE is normally restricted in the number of recursions allowed so an option "OPTION (MAXRECURSION 0)" needs to be added at the end of this script section to allow the CTE to work. The CTE contains a second table that will use the initial date list created and filter the results to what is wanted. In this case, all weekend dates are tossed out. I also have the code in there to filter out the holidays.
This code is currently commented out as it needs the scripts from this article (Creating Holidays and Special Dates). The temporary table: @CalTable has been created with a Primary Key and with a Unique Nonclustered Index. Both of these help the table variable to process larger amounts of data faster and offers another place to help speed up this script.
/*
This section creates a table variable to hold the calendar information.
This can be used as a speed increase in the other CTE.
Defining the table variable (@CalTable) with a primary key and unique index
helps to reduce to total time used to find all of the records.
- /
DECLARE @CalTable table
(
CalDate datetime PRIMARY KEY CLUSTERED,
CalYear int,
CalMonth int,
UNIQUE NONCLUSTERED (CalDate)
)
;WITH
CalBaseList AS Recursive CTE to build the initial calendar table based on the date ranges.
(
SELECT
MIN(pl.Proj_Start) AS CalStart,
MAX(pl.Proj_End) AS CalEnd
FROM @ProjectList AS pl
WHERE
pl.Proj_ID = 7
pl.Proj_ID BETWEEN 2 AND 6
UNION ALL
SELECT
DATEADD(dd, 1, CalStart) AS CalStart,
CalEnd
FROM CalBaseList
WHERE
DATEADD(dd, 1, CalStart) <= CalEnd
),
CalList AS Filters out any unwanted dates (Weekends, Holidays).
(
SELECT
s2.CalDate,
DATEPART(yy, s2.CalDate) AS CalYear,
DATEPART(mm, s2.CalDate) AS CalMonth
FROM
(
SELECT
s1.CalDate,
DATEPART(dw, s1.CalDate) AS DayFlag,
( Uses routines defined in this article: http://www.sqlservercentral.com/articles/SQL/69774/
SELECT
COUNT(*)
FROM dbo.fn_ActivityForDay(s1.CalDate, 0)
WHERE
ReportFlag = 1
AND DayOffFlag = 1
) AS HolidayFlag
0 AS HolidayFlag
FROM
(
SELECT
CalStart AS CalDate
FROM CalBaseList
) AS s1
) AS s2
WHERE
s2.DayFlag <> 1 Sunday
AND s2.DayFlag <> 7 Saturday
AND s2.HolidayFlag = 0 Holiday Count
)
INSERT INTO @CalTable Load the Calendar Table.
SELECT
cl.CalDate,
cl.CalYear,
cl.CalMonth
FROM CalList AS cl
OPTION (MAXRECURSION 0)
Finding The Dates
The final part is putting the tables from above together and retrieving our desired data. For this, I wanted to show the original way using the BETWEEN method. Then list the way using the split date range check.
The Original BETWEEN
This is the original test script. I have listed the times the SQL Console used to process the script 3 times. They were 11 seconds, 5 seconds and 5 seconds respectively.
/*
Find dates with the BETWEEN method.
- /
11, 5, 5 seconds.
SELECT
wt.Work_ID,
ct.CalDate,
ct.CalYear,
ct.CalMonth
FROM @WorkTable AS wt
LEFT JOIN @CalTable AS ct ON
ct.CalDate BETWEEN wt.Work_Start AND wt.Work_End
Split Date Range Check
This script first finds all of the records that are within the range of the start date based on the JOIN clause. Then, a flag is set, and checked, based on if the date is also within the end date range within the WHERE clause. This script does not save the flag, but the check in the WHERE clause can easily be copied and returned as a regular field. It was the splitting of the date range check that allowed my production scripts to run a lot faster.
As an example, I had a script that was taking over 5 minutes to process. Adding this one change to the script reduced its processing time down to 30 seconds. Definitely a happy increase in speed. When I had ran this test script 3 times, I had the process time of 5 seconds, 3 seconds, and 3 seconds respectively.
/*
Find dates with the individual date range checks.
- /
SELECT 5, 3, 3 seconds
wt.Work_ID,
ct.CalDate,
ct.CalYear,
ct.CalMonth
FROM @WorkTable AS wt
LEFT JOIN @CalTable AS ct ON
wt.Work_Start <= ct.CalDate
WHERE
(
CASE
WHEN ct.CalDate <= wt.Work_End THEN 1
ELSE 0
END
) = 1