Channel: Transact-SQL forum

Extracting individual words from a text column.

I've searched extensively for an answer to this question, but the solutions I find generally rely on locating specific characters or text to determine what to extract, and nothing I can see really fits the bill.

A colleague of mine has an issue I'm trying to assist with but I'm drawing a blank and could do with some help.

He has a table with over a quarter of a billion rows, and one column contains text, as in the example table below.

Essentially, at the moment he needs to extract the 3rd word from the TEXT column, but later it could be any of the individual words. As you can see, the start position can differ from row to row, and the words are sometimes separated by double or even triple spaces.

I have given him an old function I have used in the past (below), which should be able to do the work, but after a few hours it falls over because of the amount of data. The table size is approx. 22,570,172 MB. The table has indexing and a text search catalogue on the field in question.

Basically, we need a quicker way to extract the required data without the query falling over. I was contemplating a CTE as I've used these in the past but I've not managed to achieve anything remotely successful. Has anyone got any ideas?

The old function is as follows:

/*
NAME:- GET_NAME 

DESCRIPTION:- Return the word at position @POS (1-based) from a space-delimited string; runs of spaces count as one separator

*/

CREATE FUNCTION zzz.GET_NAME(@STR AS VARCHAR(255),@POS AS INT)
RETURNS
VARCHAR(255)

AS

BEGIN

DECLARE @STR_NEW varchar(255)
DECLARE @COUNT INT
DECLARE @COUNTER INT
DECLARE @RESULT varchar(255)

IF @STR IS NULL OR @STR = ''
	BEGIN
		SET @RESULT = ''
		RETURN @RESULT
	END
	
SET @STR = LTRIM(RTRIM(@STR))

If @STR = '' 
	BEGIN
		SET @RESULT = ''
		RETURN @RESULT
	END

-- Collapse runs of spaces into a single space

SET @COUNT = 1
SET @STR_NEW = ''

WHILE @COUNT <= LEN(@STR)
BEGIN
    -- Keep a space only if the previous character was not a space
    IF SUBSTRING(@STR, @COUNT, 1) = ' '
    BEGIN
        IF SUBSTRING(@STR, @COUNT - 1, 1) <> ' '
        BEGIN
            SET @STR_NEW = @STR_NEW + SUBSTRING(@STR, @COUNT, 1)
        END
    END

    -- Always keep non-space characters
    IF SUBSTRING(@STR, @COUNT, 1) <> ' '
    BEGIN
        SET @STR_NEW = @STR_NEW + SUBSTRING(@STR, @COUNT, 1)
    END

    SET @COUNT = @COUNT + 1
END

-- Now find the word at position @POS

SET @COUNT = 1
SET @COUNTER = 1
SET @RESULT = ''

WHILE @COUNT <= LEN(@STR_NEW)
BEGIN
    IF @COUNTER = @POS
    BEGIN
        -- At the start of the requested word: copy characters up to the next space
        WHILE @COUNT <= LEN(@STR_NEW)
        BEGIN
            IF SUBSTRING(@STR_NEW, @COUNT, 1) = ' '
            BEGIN
                RETURN @RESULT
            END
            SET @RESULT = @RESULT + SUBSTRING(@STR_NEW, @COUNT, 1)
            SET @COUNT = @COUNT + 1
        END

        RETURN @RESULT
    END

    -- Each space marks the start of the next word
    IF SUBSTRING(@STR_NEW, @COUNT, 1) = ' '
    BEGIN
        SET @COUNTER = @COUNTER + 1
    END
    SET @COUNT = @COUNT + 1
END

-- Fewer than @POS words: return an empty string
RETURN @RESULT

END
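
For reference, this is roughly how the function gets called. The sample strings are made up for illustration (they are not the real data), and the derived table v and its column sample_text are hypothetical names; only zzz.GET_NAME comes from the function above, which is assumed to have been created already.

```sql
-- Hypothetical sample rows: different start positions and
-- double/triple spaces between words, as described above.
SELECT v.sample_text,
       zzz.GET_NAME(v.sample_text, 3) AS third_word
FROM (VALUES
        ('alpha beta gamma delta'),
        ('   alpha  beta   gamma delta')
     ) AS v(sample_text);
-- third_word is 'gamma' for both rows
```

Against the real table the call has the same shape, with the TEXT column in place of sample_text. As far as I understand it, a scalar function containing loops like this is evaluated row by row, which is probably where the time goes.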

Many thanks 
