Channel: Transact-SQL forum

Best Match Across Multiple Columns

I have a requirement to get the best match between the columns of two tables, so for example:

Table A (columns ID, C1 to C6; only the non-NULL values of each row are listed, in column order)

ID  Non-NULL values
1   A, B
2   A, D, C
3   A
4   D
5   B
6   B, C

Table B

ID  C1  C2  C3  C4  C5  C6  Match
1   A   B   B   B   D   C   1, 3, 5
2   A   C   B   C   C   C   1, 3, 5, 6
3   B   B   B   C   C   C   5
4   E   D   B   F   F   E   4, 5

A row matches when, for every column, the values are equal or the Table A value is NULL. The Match column in Table B shows which rows of Table A that row should match. Once I know which rows match, I will be able to determine the best match by which row matches the most columns. The following query seems to do the job of identifying which rows match:

SELECT TableA.ID, TableB.ID
FROM TableA
	LEFT OUTER JOIN TableB
	ON (TableB.C1 = TableA.C1 OR TableA.C1 IS NULL)
		AND (TableB.C2 = TableA.C2 OR TableA.C2 IS NULL)
		AND (TableB.C3 = TableA.C3 OR TableA.C3 IS NULL)
		AND (TableB.C4 = TableA.C4 OR TableA.C4 IS NULL)
		AND (TableB.C5 = TableA.C5 OR TableA.C5 IS NULL)
		AND (TableB.C6 = TableA.C6 OR TableA.C6 IS NULL)
WHERE NOT (TableA.C1 IS NULL
		AND TableA.C2 IS NULL
		AND TableA.C3 IS NULL
		AND TableA.C4 IS NULL
		AND TableA.C5 IS NULL
		AND TableA.C6 IS NULL)
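
For the second step, here is a minimal sketch of how the best match might then be picked. It assumes "best" means, for each Table B row, the matching Table A row with the most equal columns. The CTE names (Scored, Ranked), the aliases a and b, the MatchedColumns column and the tie-break on the lower Table A ID are all illustrative assumptions, not part of the query above. It also uses an INNER JOIN, so Table A rows with no match are dropped rather than returned with a NULL Table B ID:

```sql
-- Sketch only: score each matching pair, then keep the top pair per Table B row.
WITH Scored AS (
    SELECT b.ID AS B_ID,
           a.ID AS A_ID,
           -- A comparison with NULL is never true, so NULL columns score 0.
           CASE WHEN b.C1 = a.C1 THEN 1 ELSE 0 END
         + CASE WHEN b.C2 = a.C2 THEN 1 ELSE 0 END
         + CASE WHEN b.C3 = a.C3 THEN 1 ELSE 0 END
         + CASE WHEN b.C4 = a.C4 THEN 1 ELSE 0 END
         + CASE WHEN b.C5 = a.C5 THEN 1 ELSE 0 END
         + CASE WHEN b.C6 = a.C6 THEN 1 ELSE 0 END AS MatchedColumns
    FROM TableB AS b
        INNER JOIN TableA AS a
        ON  (b.C1 = a.C1 OR a.C1 IS NULL)
        AND (b.C2 = a.C2 OR a.C2 IS NULL)
        AND (b.C3 = a.C3 OR a.C3 IS NULL)
        AND (b.C4 = a.C4 OR a.C4 IS NULL)
        AND (b.C5 = a.C5 OR a.C5 IS NULL)
        AND (b.C6 = a.C6 OR a.C6 IS NULL)
    -- Same filter as the original query: ignore Table A rows that are entirely NULL.
    WHERE NOT (a.C1 IS NULL AND a.C2 IS NULL AND a.C3 IS NULL
           AND a.C4 IS NULL AND a.C5 IS NULL AND a.C6 IS NULL)
),
Ranked AS (
    SELECT B_ID, A_ID, MatchedColumns,
           -- Assumed tie-break: the lower Table A ID wins when scores are equal.
           ROW_NUMBER() OVER (PARTITION BY B_ID
                              ORDER BY MatchedColumns DESC, A_ID) AS rn
    FROM Scored
)
SELECT B_ID, A_ID, MatchedColumns
FROM Ranked
WHERE rn = 1;
```

Because a pair only survives the join when every non-NULL Table A column is equal, MatchedColumns is simply the number of non-NULL columns in the Table A row. It could therefore be computed once per Table A row instead of once per pair.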

However, I am finding that when both of these tables have around 400K records each, the query takes far too long to run. How can I optimise this query? Are there any indexes which would help, or is there a better way to approach this problem?

Thanks in advance for any help,

Graham.


