Hello,
Can anyone please advise me how to compute a cumulative sum efficiently? My data structure can be represented by the example below:
CustomerID | ProductID | PurchaseOrderID | Quantity | Cumulative Purchase |
1 | 1 | 1 | 12 | 12 |
1 | 1 | 2 | 67 | 79 |
1 | 1 | 3 | 12 | 91 |
1 | 2 | 1 | 7 | 7 |
1 | 2 | 2 | 65 | 72 |
1 | 2 | 3 | 48 | 120 |
2 | 2 | 1 | 20 | 20 |
2 | 2 | 2 | 29 | 49 |
2 | 2 | 3 | 24 | 73 |
2 | 2 | 4 | 32 | 105 |
2 | 3 | 1 | 12 | 12 |
2 | 3 | 2 | 88 | 100 |
2 | 3 | 3 | 19 | 119 |
This code generates the same table for testing in SQL:
declare @TestPurchases as table (
    CustomerID int,
    ProductID int,
    PurchaseOrderID int,
    Quantity int
)
insert @TestPurchases
values (1, 1, 1, 12),
(1, 1, 2, 67),
(1, 1, 3, 12),
(1, 2, 1, 7),
(1, 2, 2, 65),
(1, 2, 3, 48),
(2, 2, 1, 20),
(2, 2, 2, 29),
(2, 2, 3, 24),
(2, 2, 4, 32),
(2, 3, 1, 12),
(2, 3, 2, 88),
(2, 3, 3, 19)
Each customer purchases one or more products and can purchase the same product more than once. I’m trying to calculate the cumulative sum of purchases, as shown in the last column. I usually use the following code:
select dt.CustomerID
    ,dt.ProductID
    ,dt.PurchaseOrderID
    ,dt.Quantity
    ,sum(CumSum.Quantity) as CummulativePurchase
from @TestPurchases as dt
left join @TestPurchases as CumSum
    on dt.CustomerID = CumSum.CustomerID
    and dt.ProductID = CumSum.ProductID
    and dt.PurchaseOrderID >= CumSum.PurchaseOrderID
group by dt.CustomerID
    ,dt.ProductID
    ,dt.PurchaseOrderID
    ,dt.Quantity
The code works well as long as no customer has too many purchases of a single product. The temporary table created by the optimizer holds roughly half the square of the purchase count for each customer/product combination (n(n+1)/2 rows for n purchases). My real data has several million customer/product combinations, and some of them have several thousand purchases.
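To make that growth concrete, here is a rough sketch that estimates how many rows the self-join produces per customer/product combination before grouping. It assumes the same test data as above (redeclared here so the batch runs on its own); the column names Purchases and JoinedRows are made up for illustration.

```sql
-- Test data redeclared so this batch is self-contained.
declare @TestPurchases as table (
    CustomerID int,
    ProductID int,
    PurchaseOrderID int,
    Quantity int
)
insert @TestPurchases
values (1, 1, 1, 12), (1, 1, 2, 67), (1, 1, 3, 12),
       (1, 2, 1, 7),  (1, 2, 2, 65), (1, 2, 3, 48),
       (2, 2, 1, 20), (2, 2, 2, 29), (2, 2, 3, 24), (2, 2, 4, 32),
       (2, 3, 1, 12), (2, 3, 2, 88), (2, 3, 3, 19)

-- With the >= join condition, the k-th purchase of a group matches k rows,
-- so a group with n purchases yields 1 + 2 + ... + n = n(n+1)/2 joined rows.
select CustomerID
    ,ProductID
    ,count(*) as Purchases
    ,cast(count(*) as bigint) * (count(*) + 1) / 2 as JoinedRows
from @TestPurchases
group by CustomerID
    ,ProductID
```

On the test data the groups with 3 purchases give 6 joined rows and the group with 4 purchases gives 10. By the same formula, a single combination with 5,000 purchases would give about 12.5 million joined rows.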
Can anybody recommend an approach that would be more efficient than the query above?
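One candidate, offered only as a sketch, is a windowed running total. This assumes SQL Server 2012 or later, where sum() over (...) accepts an order by and a rows frame; no version is stated above, so that is an assumption, and performance on the real data would still need to be measured.

```sql
-- Test data redeclared so this batch is self-contained.
declare @TestPurchases as table (
    CustomerID int,
    ProductID int,
    PurchaseOrderID int,
    Quantity int
)
insert @TestPurchases
values (1, 1, 1, 12), (1, 1, 2, 67), (1, 1, 3, 12),
       (1, 2, 1, 7),  (1, 2, 2, 65), (1, 2, 3, 48),
       (2, 2, 1, 20), (2, 2, 2, 29), (2, 2, 3, 24), (2, 2, 4, 32),
       (2, 3, 1, 12), (2, 3, 2, 88), (2, 3, 3, 19)

-- Running total per customer/product, ordered by purchase order.
-- Each row is read once per partition instead of being joined
-- to every earlier row of the same group.
select CustomerID
    ,ProductID
    ,PurchaseOrderID
    ,Quantity
    ,sum(Quantity) over (
        partition by CustomerID, ProductID
        order by PurchaseOrderID
        rows between unbounded preceding and current row
    ) as CumulativePurchase
from @TestPurchases
order by CustomerID
    ,ProductID
    ,PurchaseOrderID
```

On the test data this should reproduce the Cumulative Purchase column from the example table. The explicit rows frame is a deliberate choice: without it the default frame is range-based, which treats rows with equal PurchaseOrderID values as peers and is generally reported to be slower in SQL Server.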
Thank you!
Daniel