If you're using MySQL, you may have come across its REGEXP (a.k.a. RLIKE) operator, which can be used as a more powerful alternative to the LIKE operator. For example:
SELECT * FROM address WHERE postcode REGEXP '^[A-Z]{1,2}[0-9]{1,2}
?[0-9][A-Z]{2}$';
But despite having a competent regular expression engine, MySQL doesn't provide a built-in regular expression replace function. **UPDATE**: A useful set of regex functions including REGEXP_REPLACE have now been provided in MySQL 8.0. This renders reading on unnecessary unless you're constrained to using an earlier version. It's worth noting that MariaDB does provide such a function - but that won't help if you're working on an established project that's tied to MySQL (or have other reasons for wanting to stay with it).
There have been some valiant attempts to plug the gap via User-Defined Function libraries that can be compiled from C source code and incorporated into a MySQL server instance: mysql-udf-regexp and lib_mysqludf_preg. But these do require the MySQL source code to be present and appropriate permissions to compile and build the files on the server.
Surely there could be a lighter-weight alternative...? Given that REGEXP can test whether a substring matches a regular expression, couldn't it be used to look inside a find string, get the bits that match and then replace them? Well that turns out to be a bit harder than it might sound as there are some additional complexities to consider such as start and end anchors, greedy/non-greedy matching etc. But I've written a function that will do the basic job, as shown in this Rextester online demo. Here's the code:
-- ------------------------------------------------------------------------------------
-- USAGE
-- ------------------------------------------------------------------------------------
-- SELECT reg_replace(<subject>,
-- <pattern>,
-- <replacement>,
-- <greedy>,
-- <minMatchLen>,
-- <maxMatchLen>);
-- where:
-- <subject> is the string to look in for doing the replacements
-- <pattern> is the regular expression to match against
-- <replacement> is the replacement string
-- <greedy> is TRUE for greedy matching or FALSE for non-greedy matching
-- <minMatchLen> specifies the minimum match length
-- <maxMatchLen> specifies the maximum match length
-- (minMatchLen and maxMatchLen are used to improve efficiency but are
-- optional and can be set to 0 or NULL if not known/required)
-- Example:
-- SELECT reg_replace(txt, '^[Tt][^ ]* ', 'a', TRUE, 2, 0) FROM tbl;
DROP FUNCTION IF EXISTS reg_replace;
DELIMITER //
CREATE FUNCTION reg_replace(subject VARCHAR(21845), pattern VARCHAR(21845),
replacement VARCHAR(21845), greedy BOOLEAN, minMatchLen INT, maxMatchLen INT)
RETURNS VARCHAR(21845) DETERMINISTIC BEGIN
DECLARE result, subStr, usePattern VARCHAR(21845);
DECLARE startPos, prevStartPos, startInc, len, lenInc INT;
IF subject REGEXP pattern THEN
SET result = '';
-- Sanitize input parameter values
SET minMatchLen = IF(minMatchLen < 1, 1, minMatchLen);
SET maxMatchLen = IF(maxMatchLen < 1 OR maxMatchLen > CHAR_LENGTH(subject),
CHAR_LENGTH(subject), maxMatchLen);
-- Set the pattern to use to match an entire string rather than part of a string
SET usePattern = IF (LEFT(pattern, 1) = '^', pattern, CONCAT('^', pattern));
SET usePattern = IF (RIGHT(pattern, 1) = '$', usePattern, CONCAT(usePattern, '$'));
-- Set start position to 1 if pattern starts with ^ or doesn't end with $.
IF LEFT(pattern, 1) = '^' OR RIGHT(pattern, 1) <> '$' THEN
SET startPos = 1, startInc = 1;
-- Otherwise (i.e. pattern ends with $ but doesn't start with ^): Set start pos
-- to the min or max match length from the end (depending on "greedy" flag).
ELSEIF greedy THEN
SET startPos = CHAR_LENGTH(subject) - maxMatchLen + 1, startInc = 1;
ELSE
SET startPos = CHAR_LENGTH(subject) - minMatchLen + 1, startInc = -1;
END IF;
WHILE startPos >= 1 AND startPos <= CHAR_LENGTH(subject)
AND startPos + minMatchLen - 1 <= CHAR_LENGTH(subject)
AND !(LEFT(pattern, 1) = '^' AND startPos <> 1)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + maxMatchLen - 1 < CHAR_LENGTH(subject)) DO
-- Set start length to maximum if matching greedily or pattern ends with $.
-- Otherwise set starting length to the minimum match length.
IF greedy OR RIGHT(pattern, 1) = '$' THEN
SET len = LEAST(CHAR_LENGTH(subject) - startPos + 1, maxMatchLen), lenInc = -1;
ELSE
SET len = minMatchLen, lenInc = 1;
END IF;
SET prevStartPos = startPos;
lenLoop: WHILE len >= 1 AND len <= maxMatchLen
AND startPos + len - 1 <= CHAR_LENGTH(subject)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + len - 1 <> CHAR_LENGTH(subject)) DO
SET subStr = SUBSTRING(subject, startPos, len);
IF subStr REGEXP usePattern THEN
SET result = IF(startInc = 1,
CONCAT(result, replacement), CONCAT(replacement, result));
SET startPos = startPos + startInc * len;
LEAVE lenLoop;
END IF;
SET len = len + lenInc;
END WHILE;
IF (startPos = prevStartPos) THEN
SET result = IF(startInc = 1, CONCAT(result, SUBSTRING(subject, startPos, 1)),
CONCAT(SUBSTRING(subject, startPos, 1), result));
SET startPos = startPos + startInc;
END IF;
END WHILE;
IF startInc = 1 AND startPos <= CHAR_LENGTH(subject) THEN
SET result = CONCAT(result, RIGHT(subject, CHAR_LENGTH(subject) + 1 - startPos));
ELSEIF startInc = -1 AND startPos >= 1 THEN
SET result = CONCAT(LEFT(subject, startPos), result);
END IF;
ELSE
SET result = subject;
END IF;
RETURN result;
END//
DELIMITER ;
Limitations
- The CREATE ROUTINE privilege is required for creating the function.
- It only supports VARCHAR(21845) rather than the larger TEXT data type.
- This method is of course going to take a while when the subject string is large.
- If ^ and/or $ anchors are used in the pattern, they must be at the very start and very end respectively - e.g. patterns such as (^start|end$) aren't supported.
- There is a "greedy" flag to specify whether the overall matching should be greedy or non-greedy. Combining greedy and lazy matching within a single regular expression (e.g. a.*?b.*) isn't supported.
- It won't allow substitution of backreferences (e.g. \1, \2 etc.) to replace capturing groups. If this functionality is needed, please see the more advanced version below, which attempts to provide a workaround by updating the function to allow a secondary find and replace within each found match (at the expense of increased complexity).
More advanced version
The following more advanced version goes some way towards addressing limitation (6) above by allowing a further "sub-pattern" to further restrict what is actually replaced within each match, as shown in this Rextester online demo. It actually consists of a function and an internally used stored procedure since a recursive call is needed and recursion is only supported by stored procedures, not functions in MySQL. Here's the code:
-- ------------------------------------------------------------------------------------
-- USAGE
-- ------------------------------------------------------------------------------------
-- SELECT reg_replace(<subject>,
-- <pattern>,
-- <subpattern>,
-- <replacement>,
-- <greedy>,
-- <minMatchLen>,
-- <maxMatchLen>,
-- <minSubMatchLen>,
-- <maxSubMatchLen>);
-- where:
-- <subject> is the string to look in for doing the replacements
-- <pattern> is the regular expression to match against
-- <subpattern> is a regular expression to match against within each
-- portion of text that matches <pattern>
-- <replacement> is the replacement string
-- <greedy> is TRUE for greedy matching or FALSE for non-greedy matching
-- <minMatchLen> specifies the minimum match length
-- <maxMatchLen> specifies the maximum match length
-- <minSubMatchLen> specifies the minimum match length
-- <maxSubMatchLen> specifies the maximum match length
-- (minMatchLen, maxMatchLen, minSubMatchLen and maxSubMatchLen are used to improve
-- efficiency but are optional and can be set to 0 or NULL if not known/required)
-- Example:
-- SELECT reg_replace(txt, '[A-Z0-9]{3}', '[0-9]', '_', TRUE, 3, 3, 1, 1) FROM tbl;
DROP FUNCTION IF EXISTS reg_replace;
DELIMITER //
CREATE FUNCTION reg_replace(subject VARCHAR(21845), pattern VARCHAR(21845),
subpattern VARCHAR(21845), replacement VARCHAR(21845), greedy BOOLEAN,
minMatchLen INT, maxMatchLen INT, minSubMatchLen INT, maxSubMatchLen INT)
RETURNS VARCHAR(21845) DETERMINISTIC BEGIN
DECLARE result VARCHAR(21845);
CALL reg_replace_worker(
subject, pattern, subpattern, replacement, greedy, minMatchLen, maxMatchLen,
minSubMatchLen, maxSubMatchLen, result);
RETURN result;
END;//
DELIMITER ;
DROP PROCEDURE IF EXISTS reg_replace_worker;
DELIMITER //
CREATE PROCEDURE reg_replace_worker(subject VARCHAR(21845), pattern VARCHAR(21845),
subpattern VARCHAR(21845), replacement VARCHAR(21845), greedy BOOLEAN,
minMatchLen INT, maxMatchLen INT, minSubMatchLen INT, maxSubMatchLen INT,
OUT result VARCHAR(21845))
BEGIN
DECLARE subStr, usePattern, useRepl VARCHAR(21845);
DECLARE startPos, prevStartPos, startInc, len, lenInc INT;
SET @@SESSION.max_sp_recursion_depth = 2;
IF subject REGEXP pattern THEN
SET result = '';
-- Sanitize input parameter values
SET minMatchLen = IF(minMatchLen < 1, 1, minMatchLen);
SET maxMatchLen = IF(maxMatchLen < 1 OR maxMatchLen > CHAR_LENGTH(subject),
CHAR_LENGTH(subject), maxMatchLen);
-- Set the pattern to use to match an entire string rather than part of a string
SET usePattern = IF (LEFT(pattern, 1) = '^', pattern, CONCAT('^', pattern));
SET usePattern = IF (RIGHT(pattern, 1) = '$', usePattern, CONCAT(usePattern, '$'));
-- Set start position to 1 if pattern starts with ^ or doesn't end with $.
IF LEFT(pattern, 1) = '^' OR RIGHT(pattern, 1) <> '$' THEN
SET startPos = 1, startInc = 1;
-- Otherwise (i.e. pattern ends with $ but doesn't start with ^): Set start pos
-- to the min or max match length from the end (depending on "greedy" flag).
ELSEIF greedy THEN
SET startPos = CHAR_LENGTH(subject) - maxMatchLen + 1, startInc = 1;
ELSE
SET startPos = CHAR_LENGTH(subject) - minMatchLen + 1, startInc = -1;
END IF;
WHILE startPos >= 1 AND startPos <= CHAR_LENGTH(subject)
AND startPos + minMatchLen - 1 <= CHAR_LENGTH(subject)
AND !(LEFT(pattern, 1) = '^' AND startPos <> 1)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + maxMatchLen - 1 < CHAR_LENGTH(subject)) DO
-- Set start length to maximum if matching greedily or pattern ends with $.
-- Otherwise set starting length to the minimum match length.
IF greedy OR RIGHT(pattern, 1) = '$' THEN
SET len = LEAST(CHAR_LENGTH(subject) - startPos + 1, maxMatchLen), lenInc = -1;
ELSE
SET len = minMatchLen, lenInc = 1;
END IF;
SET prevStartPos = startPos;
lenLoop: WHILE len >= 1 AND len <= maxMatchLen
AND startPos + len - 1 <= CHAR_LENGTH(subject)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + len - 1 <> CHAR_LENGTH(subject)) DO
SET subStr = SUBSTRING(subject, startPos, len);
IF subStr REGEXP usePattern THEN
IF subpattern IS NULL THEN
SET useRepl = replacement;
ELSE
CALL reg_replace_worker(subStr, subpattern, NULL, replacement, greedy,
minSubMatchLen, maxSubMatchLen, NULL, NULL, useRepl);
END IF;
SET result = IF(startInc = 1,
CONCAT(result, useRepl), CONCAT(useRepl, result));
SET startPos = startPos + startInc * len;
LEAVE lenLoop;
END IF;
SET len = len + lenInc;
END WHILE;
IF (startPos = prevStartPos) THEN
SET result = IF(startInc = 1, CONCAT(result, SUBSTRING(subject, startPos, 1)),
CONCAT(SUBSTRING(subject, startPos, 1), result));
SET startPos = startPos + startInc;
END IF;
END WHILE;
IF startInc = 1 AND startPos <= CHAR_LENGTH(subject) THEN
SET result = CONCAT(result, RIGHT(subject, CHAR_LENGTH(subject) + 1 - startPos));
ELSEIF startInc = -1 AND startPos >= 1 THEN
SET result = CONCAT(LEFT(subject, startPos), result);
END IF;
ELSE
SET result = subject;
END IF;
END;//
DELIMITER ;
SELECT * FROM address WHERE postcode REGEXP '^[A-Z]{1,2}[0-9]{1,2}
?[0-9][A-Z]{2}$';
But despite having a competent regular expression engine, MySQL doesn't provide a built-in regular expression replace function. **UPDATE**: A useful set of regex functions including REGEXP_REPLACE have now been provided in MySQL 8.0. This renders reading on unnecessary unless you're constrained to using an earlier version. It's worth noting that MariaDB does provide such a function - but that won't help if you're working on an established project that's tied to MySQL (or have other reasons for wanting to stay with it).There have been some valiant attempts to plug the gap via User-Defined Function libraries that can be compiled from C source code and incorporated into a MySQL server instance: mysql-udf-regexp and lib_mysqludf_preg. But these do require the MySQL source code to be present and appropriate permissions to compile and build the files on the server.
Surely there could be a lighter-weight alternative...? Given that REGEXP can test whether a substring matches a regular expression, couldn't it be used to look inside a find string, get the bits that match and then replace them? Well that turns out to be a bit harder than it might sound as there are some additional complexities to consider such as start and end anchors, greedy/non-greedy matching etc. But I've written a function that will do the basic job, as shown in this Rextester online demo. Here's the code:
-- ------------------------------------------------------------------------------------
-- USAGE
-- ------------------------------------------------------------------------------------
-- SELECT reg_replace(<subject>,
-- <pattern>,
-- <replacement>,
-- <greedy>,
-- <minMatchLen>,
-- <maxMatchLen>);
-- where:
-- <subject> is the string to look in for doing the replacements
-- <pattern> is the regular expression to match against
-- <replacement> is the replacement string
-- <greedy> is TRUE for greedy matching or FALSE for non-greedy matching
-- <minMatchLen> specifies the minimum match length
-- <maxMatchLen> specifies the maximum match length
-- (minMatchLen and maxMatchLen are used to improve efficiency but are
-- optional and can be set to 0 or NULL if not known/required)
-- Example:
-- SELECT reg_replace(txt, '^[Tt][^ ]* ', 'a', TRUE, 2, 0) FROM tbl;
DROP FUNCTION IF EXISTS reg_replace;
DELIMITER //
CREATE FUNCTION reg_replace(subject VARCHAR(21845), pattern VARCHAR(21845),
replacement VARCHAR(21845), greedy BOOLEAN, minMatchLen INT, maxMatchLen INT)
RETURNS VARCHAR(21845) DETERMINISTIC BEGIN
DECLARE result, subStr, usePattern VARCHAR(21845);
DECLARE startPos, prevStartPos, startInc, len, lenInc INT;
IF subject REGEXP pattern THEN
SET result = '';
-- Sanitize input parameter values
SET minMatchLen = IF(minMatchLen < 1, 1, minMatchLen);
SET maxMatchLen = IF(maxMatchLen < 1 OR maxMatchLen > CHAR_LENGTH(subject),
CHAR_LENGTH(subject), maxMatchLen);
-- Set the pattern to use to match an entire string rather than part of a string
SET usePattern = IF (LEFT(pattern, 1) = '^', pattern, CONCAT('^', pattern));
SET usePattern = IF (RIGHT(pattern, 1) = '$', usePattern, CONCAT(usePattern, '$'));
-- Set start position to 1 if pattern starts with ^ or doesn't end with $.
IF LEFT(pattern, 1) = '^' OR RIGHT(pattern, 1) <> '$' THEN
SET startPos = 1, startInc = 1;
-- Otherwise (i.e. pattern ends with $ but doesn't start with ^): Set start pos
-- to the min or max match length from the end (depending on "greedy" flag).
ELSEIF greedy THEN
SET startPos = CHAR_LENGTH(subject) - maxMatchLen + 1, startInc = 1;
ELSE
SET startPos = CHAR_LENGTH(subject) - minMatchLen + 1, startInc = -1;
END IF;
WHILE startPos >= 1 AND startPos <= CHAR_LENGTH(subject)
AND startPos + minMatchLen - 1 <= CHAR_LENGTH(subject)
AND !(LEFT(pattern, 1) = '^' AND startPos <> 1)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + maxMatchLen - 1 < CHAR_LENGTH(subject)) DO
-- Set start length to maximum if matching greedily or pattern ends with $.
-- Otherwise set starting length to the minimum match length.
IF greedy OR RIGHT(pattern, 1) = '$' THEN
SET len = LEAST(CHAR_LENGTH(subject) - startPos + 1, maxMatchLen), lenInc = -1;
ELSE
SET len = minMatchLen, lenInc = 1;
END IF;
SET prevStartPos = startPos;
lenLoop: WHILE len >= 1 AND len <= maxMatchLen
AND startPos + len - 1 <= CHAR_LENGTH(subject)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + len - 1 <> CHAR_LENGTH(subject)) DO
SET subStr = SUBSTRING(subject, startPos, len);
IF subStr REGEXP usePattern THEN
SET result = IF(startInc = 1,
CONCAT(result, replacement), CONCAT(replacement, result));
SET startPos = startPos + startInc * len;
LEAVE lenLoop;
END IF;
SET len = len + lenInc;
END WHILE;
IF (startPos = prevStartPos) THEN
SET result = IF(startInc = 1, CONCAT(result, SUBSTRING(subject, startPos, 1)),
CONCAT(SUBSTRING(subject, startPos, 1), result));
SET startPos = startPos + startInc;
END IF;
END WHILE;
IF startInc = 1 AND startPos <= CHAR_LENGTH(subject) THEN
SET result = CONCAT(result, RIGHT(subject, CHAR_LENGTH(subject) + 1 - startPos));
ELSEIF startInc = -1 AND startPos >= 1 THEN
SET result = CONCAT(LEFT(subject, startPos), result);
END IF;
ELSE
SET result = subject;
END IF;
RETURN result;
END//
DELIMITER ;
Limitations
- The CREATE ROUTINE privilege is required for creating the function.
- It only supports VARCHAR(21845) rather than the larger TEXT data type.
- This method is of course going to take a while when the subject string is large.
- If ^ and/or $ anchors are used in the pattern, they must be at the very start and very end respectively - e.g. patterns such as (^start|end$) aren't supported.
- There is a "greedy" flag to specify whether the overall matching should be greedy or non-greedy. Combining greedy and lazy matching within a single regular expression (e.g. a.*?b.*) isn't supported.
- It won't allow substitution of backreferences (e.g. \1, \2 etc.) to replace capturing groups. If this functionality is needed, please see the more advanced version below, which attempts to provide a workaround by updating the function to allow a secondary find and replace within each found match (at the expense of increased complexity).
More advanced version
The following more advanced version goes some way towards addressing limitation (6) above by allowing a further "sub-pattern" to further restrict what is actually replaced within each match, as shown in this Rextester online demo. It actually consists of a function and an internally used stored procedure since a recursive call is needed and recursion is only supported by stored procedures, not functions in MySQL. Here's the code:
The following more advanced version goes some way towards addressing limitation (6) above by allowing a further "sub-pattern" to further restrict what is actually replaced within each match, as shown in this Rextester online demo. It actually consists of a function and an internally used stored procedure since a recursive call is needed and recursion is only supported by stored procedures, not functions in MySQL. Here's the code:
-- ------------------------------------------------------------------------------------
-- USAGE
-- ------------------------------------------------------------------------------------
-- SELECT reg_replace(<subject>,
-- <pattern>,
-- <subpattern>,
-- <replacement>,
-- <greedy>,
-- <minMatchLen>,
-- <maxMatchLen>,
-- <minSubMatchLen>,
-- <maxSubMatchLen>);
-- where:
-- <subject> is the string to look in for doing the replacements
-- <pattern> is the regular expression to match against
-- <subpattern> is a regular expression to match against within each
-- portion of text that matches <pattern>
-- <replacement> is the replacement string
-- <greedy> is TRUE for greedy matching or FALSE for non-greedy matching
-- <minMatchLen> specifies the minimum match length
-- <maxMatchLen> specifies the maximum match length
-- <minSubMatchLen> specifies the minimum match length
-- <maxSubMatchLen> specifies the maximum match length
-- (minMatchLen, maxMatchLen, minSubMatchLen and maxSubMatchLen are used to improve
-- efficiency but are optional and can be set to 0 or NULL if not known/required)
-- Example:
-- SELECT reg_replace(txt, '[A-Z0-9]{3}', '[0-9]', '_', TRUE, 3, 3, 1, 1) FROM tbl;
DROP FUNCTION IF EXISTS reg_replace;
DELIMITER //
CREATE FUNCTION reg_replace(subject VARCHAR(21845), pattern VARCHAR(21845),
subpattern VARCHAR(21845), replacement VARCHAR(21845), greedy BOOLEAN,
minMatchLen INT, maxMatchLen INT, minSubMatchLen INT, maxSubMatchLen INT)
RETURNS VARCHAR(21845) DETERMINISTIC BEGIN
DECLARE result VARCHAR(21845);
CALL reg_replace_worker(
subject, pattern, subpattern, replacement, greedy, minMatchLen, maxMatchLen,
minSubMatchLen, maxSubMatchLen, result);
RETURN result;
END;//
DELIMITER ;
DROP PROCEDURE IF EXISTS reg_replace_worker;
DELIMITER //
CREATE PROCEDURE reg_replace_worker(subject VARCHAR(21845), pattern VARCHAR(21845),
subpattern VARCHAR(21845), replacement VARCHAR(21845), greedy BOOLEAN,
minMatchLen INT, maxMatchLen INT, minSubMatchLen INT, maxSubMatchLen INT,
OUT result VARCHAR(21845))
BEGIN
DECLARE subStr, usePattern, useRepl VARCHAR(21845);
DECLARE startPos, prevStartPos, startInc, len, lenInc INT;
SET @@SESSION.max_sp_recursion_depth = 2;
IF subject REGEXP pattern THEN
SET result = '';
-- Sanitize input parameter values
SET minMatchLen = IF(minMatchLen < 1, 1, minMatchLen);
SET maxMatchLen = IF(maxMatchLen < 1 OR maxMatchLen > CHAR_LENGTH(subject),
CHAR_LENGTH(subject), maxMatchLen);
-- Set the pattern to use to match an entire string rather than part of a string
SET usePattern = IF (LEFT(pattern, 1) = '^', pattern, CONCAT('^', pattern));
SET usePattern = IF (RIGHT(pattern, 1) = '$', usePattern, CONCAT(usePattern, '$'));
-- Set start position to 1 if pattern starts with ^ or doesn't end with $.
IF LEFT(pattern, 1) = '^' OR RIGHT(pattern, 1) <> '$' THEN
SET startPos = 1, startInc = 1;
-- Otherwise (i.e. pattern ends with $ but doesn't start with ^): Set start pos
-- to the min or max match length from the end (depending on "greedy" flag).
ELSEIF greedy THEN
SET startPos = CHAR_LENGTH(subject) - maxMatchLen + 1, startInc = 1;
ELSE
SET startPos = CHAR_LENGTH(subject) - minMatchLen + 1, startInc = -1;
END IF;
WHILE startPos >= 1 AND startPos <= CHAR_LENGTH(subject)
AND startPos + minMatchLen - 1 <= CHAR_LENGTH(subject)
AND !(LEFT(pattern, 1) = '^' AND startPos <> 1)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + maxMatchLen - 1 < CHAR_LENGTH(subject)) DO
-- Set start length to maximum if matching greedily or pattern ends with $.
-- Otherwise set starting length to the minimum match length.
IF greedy OR RIGHT(pattern, 1) = '$' THEN
SET len = LEAST(CHAR_LENGTH(subject) - startPos + 1, maxMatchLen), lenInc = -1;
ELSE
SET len = minMatchLen, lenInc = 1;
END IF;
SET prevStartPos = startPos;
lenLoop: WHILE len >= 1 AND len <= maxMatchLen
AND startPos + len - 1 <= CHAR_LENGTH(subject)
AND !(RIGHT(pattern, 1) = '$'
AND startPos + len - 1 <> CHAR_LENGTH(subject)) DO
SET subStr = SUBSTRING(subject, startPos, len);
IF subStr REGEXP usePattern THEN
IF subpattern IS NULL THEN
SET useRepl = replacement;
ELSE
CALL reg_replace_worker(subStr, subpattern, NULL, replacement, greedy,
minSubMatchLen, maxSubMatchLen, NULL, NULL, useRepl);
END IF;
SET result = IF(startInc = 1,
CONCAT(result, useRepl), CONCAT(useRepl, result));
SET startPos = startPos + startInc * len;
LEAVE lenLoop;
END IF;
SET len = len + lenInc;
END WHILE;
IF (startPos = prevStartPos) THEN
SET result = IF(startInc = 1, CONCAT(result, SUBSTRING(subject, startPos, 1)),
CONCAT(SUBSTRING(subject, startPos, 1), result));
SET startPos = startPos + startInc;
END IF;
END WHILE;
IF startInc = 1 AND startPos <= CHAR_LENGTH(subject) THEN
SET result = CONCAT(result, RIGHT(subject, CHAR_LENGTH(subject) + 1 - startPos));
ELSEIF startInc = -1 AND startPos >= 1 THEN
SET result = CONCAT(LEFT(subject, startPos), result);
END IF;
ELSE
SET result = subject;
END IF;
END;//
DELIMITER ;
0 comments:
Post a Comment