Channel: Transact-SQL forum

Efficient Cumulative Sum Calculation

Hello,

Can anyone please advise me on how to calculate a cumulative sum efficiently? My data structure can be represented by the example below:

CustomerID  ProductID  PurchaseOrderID  Quantity  Cumulative Purchase
1           1          1                12        12
1           1          2                67        79
1           1          3                12        91
1           2          1                7         7
1           2          2                65        72
1           2          3                48        120
2           2          1                20        20
2           2          2                29        49
2           2          3                24        73
2           2          4                32        105
2           3          1                12        12
2           3          2                88        100
2           3          3                19        119

This code generates the same table for SQL testing:

declare @TestPurchases as table (
       CustomerID int,
       ProductID int,
       PurchaseOrderID int,
       Quantity int
)

insert @TestPurchases
values (1, 1, 1, 12),
       (1, 1, 2, 67),
       (1, 1, 3, 12),
       (1, 2, 1, 7),
       (1, 2, 2, 65),
       (1, 2, 3, 48),
       (2, 2, 1, 20),
       (2, 2, 2, 29),
       (2, 2, 3, 24),
       (2, 2, 4, 32),
       (2, 3, 1, 12),
       (2, 3, 2, 88),
       (2, 3, 3, 19)

Each customer purchases one or more products and can purchase the same product more than once. I'm trying to calculate the cumulative sum of purchases, as shown in the last column. I usually use the following code:

select dt.CustomerID
       ,dt.ProductID
       ,dt.PurchaseOrderID
       ,dt.Quantity
       ,sum(CumSum.Quantity) as CummulativePurchase
from @TestPurchases as dt
left join @TestPurchases as CumSum
       on      dt.CustomerID = CumSum.CustomerID
       and     dt.ProductID = CumSum.ProductID
       and     dt.PurchaseOrderID >= CumSum.PurchaseOrderID
group by dt.CustomerID
       ,dt.ProductID
       ,dt.PurchaseOrderID
       ,dt.Quantity

The code works well as long as no customer has too many purchases of a single product. The intermediate table built by the optimizer for the join contains roughly half the square of the purchase count for each customer/product combination. My real data has several million customer/product combinations, and some of them have several thousand purchases.
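
To make the growth concrete: assuming PurchaseOrderID is unique within a customer/product combination, the `>=` join condition pairs each purchase with itself and every earlier one, so a combination with n purchases produces

```latex
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
```

joined rows before grouping. For the three-purchase groups in the test table that is 6 rows each. For an illustrative group of 1,000 purchases (a figure chosen only for the arithmetic) it is already 500,500 rows.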

Can anybody recommend an approach that would be more efficient than the query above?

Thank you!

Daniel
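
One commonly suggested alternative to the self-join above is an ordered window aggregate. The following is only a sketch, not a tested answer for the real data: it assumes SQL Server 2012 or later (where `sum(...) over (... order by ...)` is available), assumes PurchaseOrderID is unique within each customer/product combination, and uses a shortened copy of the test table. The alias CumulativePurchase is chosen here for illustration.

```sql
-- Sketch only: assumes SQL Server 2012 or later (ordered window aggregates).
declare @TestPurchases as table (
       CustomerID int,
       ProductID int,
       PurchaseOrderID int,
       Quantity int
)

-- Shortened copy of the test data from the question (customer 1 only).
insert @TestPurchases
values (1, 1, 1, 12),
       (1, 1, 2, 67),
       (1, 1, 3, 12),
       (1, 2, 1, 7),
       (1, 2, 2, 65),
       (1, 2, 3, 48)

-- Running total per customer/product, ordered by PurchaseOrderID.
-- Each row is read once; no self-join and no group by are needed.
select dt.CustomerID
       ,dt.ProductID
       ,dt.PurchaseOrderID
       ,dt.Quantity
       ,sum(dt.Quantity) over (
              partition by dt.CustomerID, dt.ProductID
              order by dt.PurchaseOrderID
              rows between unbounded preceding and current row
       ) as CumulativePurchase
from @TestPurchases as dt
order by dt.CustomerID, dt.ProductID, dt.PurchaseOrderID
```

The explicit `rows` frame is a deliberate choice: when only `order by` is given, the default frame is `range`, which treats rows with equal PurchaseOrderID values as peers and is generally reported to be slower. On versions before SQL Server 2012 this syntax is not available, so a different technique would be needed.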
