I'm doing some IIS log file analysis and trying to group by referrer. This is a challenge since you can have multiple prefixes for the same domain:
SELECT
    CASE
        WHEN Referrer LIKE 'http://microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'https://microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'http://www.microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'https://www.microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'http://social.microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'https://social.microsoft.com/%' THEN 'Microsoft'
        ELSE 'Other'
    END AS MyGroup
    , COUNT(*)
FROM MyIISLog
WHERE
    Referrer LIKE 'http://microsoft.com/%'
    OR Referrer LIKE 'https://microsoft.com/%'
    OR Referrer LIKE 'http://www.microsoft.com/%'
    OR Referrer LIKE 'https://www.microsoft.com/%'
    OR Referrer LIKE 'http://social.microsoft.com/%'
    OR Referrer LIKE 'https://social.microsoft.com/%'
GROUP BY
    CASE
        WHEN Referrer LIKE 'http://microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'https://microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'http://www.microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'https://www.microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'http://social.microsoft.com/%' THEN 'Microsoft'
        WHEN Referrer LIKE 'https://social.microsoft.com/%' THEN 'Microsoft'
        ELSE 'Other'
    END
Ideally this could be shortened to one line, but I can't figure out how to do it.
To head off the obvious suggestion, this won't work:
WHEN Referrer LIKE 'http[s]://microsoft.com/%' THEN 'Microsoft'
This matches https:// only, because [s] matches exactly one character from the set; it does not match http://.
And this just leads to possible garbage results:
http%://microsoft.com/%
This matches anything that starts with "http" and contains "://microsoft.com/", so "https://someotherdomain.net/somepage.php?://microsoft.com/" would match, which is not what I want.
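For reference, here is a minimal sketch that should reproduce both behaviours described above, assuming SQL Server's LIKE semantics. The referrers in the VALUES list are made-up sample values for illustration, not rows from my actual log, and the column aliases are arbitrary.

```sql
-- Hypothetical sample referrers (not real log rows), used only to
-- show how each pattern behaves under T-SQL LIKE semantics.
SELECT
    s.Referrer
    -- [s] requires exactly one 's', so only the https://microsoft.com row gives 1
    , CASE WHEN s.Referrer LIKE 'http[s]://microsoft.com/%' THEN 1 ELSE 0 END AS BracketMatch
    -- % matches any run of characters (including none), so all three rows give 1
    , CASE WHEN s.Referrer LIKE 'http%://microsoft.com/%' THEN 1 ELSE 0 END AS PercentMatch
FROM (VALUES
      ('http://microsoft.com/page')
    , ('https://microsoft.com/page')
    , ('https://someotherdomain.net/somepage.php?://microsoft.com/')
) AS s(Referrer);
```

The first pattern misses the plain http:// row, and the second pattern wrongly accepts the someotherdomain.net row, which is why neither is usable as a drop-in replacement.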
Any suggestions? I tried searching, but the search engines keep returning results about optional parameters, which is not what I'm after.
A regex example of what I want is like this:
^https?\://(([^.]+)\.)?microsoft\.com
I know that LIKE is not a regular-expression engine, but it sure would be nice if I could simplify my currently way-too-long CASE and WHERE clauses.
Thank you
EDIT - In this example, I've only used a single domain (microsoft.com). However, in my real-world use, I need to support multiple domains in the same query (microsoft.com, google.com, yahoo.com, etc.) and their variations.