Create these tables and the stored procedure to follow along.
    -- Source table that holds the test data to load
    CREATE TABLE dbo.TestData
    (
        BusinessCode VARCHAR(10) NOT NULL,
        Attr1 VARCHAR(50),
        Attr2 VARCHAR(50)
    )
    GO

    -- Type II dimension table that the test data is loaded into
    CREATE TABLE dbo.TypeIITest
    (
        TableId INT NOT NULL IDENTITY(1,1),
        BusinessCode VARCHAR(10) NOT NULL,
        Attr1 VARCHAR(50),
        Attr2 VARCHAR(50),
        EffectiveFrom DATE NOT NULL,
        EffectiveTo DATE NOT NULL,
        DateTimeCreated DATETIME NOT NULL DEFAULT GETDATE()
    )
    GO

    ALTER TABLE dbo.TypeIITest ADD PRIMARY KEY CLUSTERED (BusinessCode, EffectiveTo)
    GO

    -- Stored proc that does the Type II load
    CREATE PROC [dbo].[Load_TypeIITest]
        @EffectiveFrom DATE
    AS
    -- First delete rows with EffectiveFrom = @EffectiveFrom.
    -- This is only for cases when you re-run during the same day.
    DELETE dbo.TypeIITest
    WHERE EffectiveFrom = @EffectiveFrom;

    -- Do the Type II insert/update based on BusinessCode.
    -- The MERGE closes changed rows and inserts brand-new codes;
    -- the outer INSERT adds the new current version of each changed row.
    INSERT dbo.TypeIITest (BusinessCode, Attr1, Attr2, EffectiveFrom, EffectiveTo)
    SELECT BusinessCode, Attr1, Attr2, @EffectiveFrom, '2099-12-31'
    FROM
    (
        MERGE dbo.TypeIITest tgt
        USING
        (
            SELECT BusinessCode, Attr1, Attr2
            FROM dbo.TestData
        ) src
            ON tgt.BusinessCode = src.BusinessCode
            AND tgt.EffectiveTo = '2099-12-31'
        WHEN MATCHED AND
        (
            src.Attr1 != tgt.Attr1
            OR (src.Attr1 IS NULL AND tgt.Attr1 IS NOT NULL)
            OR (src.Attr1 IS NOT NULL AND tgt.Attr1 IS NULL)
            OR src.Attr2 != tgt.Attr2
            OR (src.Attr2 IS NULL AND tgt.Attr2 IS NOT NULL)
            OR (src.Attr2 IS NOT NULL AND tgt.Attr2 IS NULL)
        )
        THEN UPDATE SET tgt.EffectiveTo = @EffectiveFrom
        WHEN NOT MATCHED THEN
            INSERT (BusinessCode, Attr1, Attr2, EffectiveFrom, EffectiveTo)
            VALUES (src.BusinessCode, src.Attr1, src.Attr2, @EffectiveFrom, '2099-12-31')
        OUTPUT $action AS Action, src.BusinessCode, src.Attr1, src.Attr2
    ) changed
    WHERE changed.Action = 'UPDATE';
    GO
For those familiar with Type II loads, this is hopefully standard, and I am not doing anything out of the ordinary in the procedure.
Now, here is the scenario. Under normal circumstances, I get some data in dbo.TestData each day, and I run EXEC dbo.Load_TypeIITest @EffectiveFrom = @Today, where @Today is that day's date.
So enter some data into the dbo.TestData table for a series of days, run the proc after each one, and look at the Type II table, like this.
    TRUNCATE TABLE dbo.TypeIITest

    -- On '2014-02-10'
    TRUNCATE TABLE dbo.TestData
    INSERT INTO dbo.TestData VALUES ('ABC','hello sir',NULL)
    EXEC [dbo].[Load_TypeIITest] @EffectiveFrom = '2014-02-10'
    SELECT * FROM dbo.TypeIITest

    -- On '2014-02-11'
    TRUNCATE TABLE dbo.TestData
    INSERT INTO dbo.TestData VALUES ('ABC','hello',NULL)
    EXEC [dbo].[Load_TypeIITest] @EffectiveFrom = '2014-02-11'
    SELECT * FROM dbo.TypeIITest

    -- On '2014-02-12'
    TRUNCATE TABLE dbo.TestData
    INSERT INTO dbo.TestData VALUES ('ABC','hello','blah')
    EXEC [dbo].[Load_TypeIITest] @EffectiveFrom = '2014-02-12'
    SELECT * FROM dbo.TypeIITest
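You should see the expected Type II behaviour: each change closes the previous row by setting its EffectiveTo to the new load date, and adds a new current row with EffectiveTo = '2099-12-31'.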
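One way to check that the history is consistent after these three loads is to confirm that every row's EffectiveTo lines up with the EffectiveFrom of the next version for the same BusinessCode. This is only a sanity-check sketch, not part of the load. It assumes SQL Server 2012 or later (for LEAD) and that '2099-12-31' marks the open-ended current row, as in the proc.

```sql
-- Sketch: list rows whose EffectiveTo does not match the next version's EffectiveFrom.
-- Assumes SQL Server 2012+ (LEAD) and '2099-12-31' as the open-ended marker.
SELECT v.BusinessCode, v.EffectiveFrom, v.EffectiveTo, v.NextEffectiveFrom
FROM
(
    SELECT BusinessCode,
           EffectiveFrom,
           EffectiveTo,
           -- EffectiveFrom of the next version of the same BusinessCode (NULL for the last one)
           LEAD(EffectiveFrom) OVER (PARTITION BY BusinessCode
                                     ORDER BY EffectiveFrom) AS NextEffectiveFrom
    FROM dbo.TypeIITest
) v
-- The last version should be open-ended; every other version should end where the next begins
WHERE v.EffectiveTo <> ISNULL(v.NextEffectiveFrom, '2099-12-31');
```

After the three daily loads above, this should return no rows.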
Now, the scenario that is tripping me up. Sometimes I need to back fill. In this example, I started loading from 2014-02-10, but I may need to back fill starting from some earlier date. When I do that under the current logic, I end up with EffectiveFrom and EffectiveTo dates that are out of order: the row being closed gets an EffectiveTo that is earlier than its own EffectiveFrom. Hopefully, you can see what I mean by running this test.
    -- On '2014-02-13', I have to load as if it were '2014-01-01'
    TRUNCATE TABLE dbo.TestData
    INSERT INTO dbo.TestData VALUES ('ABC','hello','blah1')
    EXEC [dbo].[Load_TypeIITest] @EffectiveFrom = '2014-01-01'
    SELECT * FROM dbo.TypeIITest
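To make the problem easier to spot after the back-fill test, a query like the following lists the rows whose date range is inverted. It is only a diagnostic sketch against the tables defined above, not a fix.

```sql
-- Sketch: find rows where the validity range is inverted (closed before it started)
SELECT BusinessCode, Attr1, Attr2, EffectiveFrom, EffectiveTo
FROM dbo.TypeIITest
WHERE EffectiveTo < EffectiveFrom
ORDER BY BusinessCode, EffectiveFrom;
```

With the test data above, this should return the ('ABC','hello','blah') row, which now has EffectiveFrom = 2014-02-12 and EffectiveTo = 2014-01-01.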
Do you see what I mean? How do I incorporate this kind of scenario?