Oracle Regular Expressions: Comprehensive Guide and SQL Patterns
Regular expressions describe text patterns using literal characters and metacharacters. In Oracle Database (10g and later), SQL gains native regex support through a set of REGEXP_* functions that implement a POSIX-style dialect with several Perl-like features. This guide summarizes the syntax, options, and common patterns, and demonstrates them with Oracle SQL.
1. Character Matching
-
\d, \D
- \d: digit 0–9 (Oracle synonym: [:digit:])
- \D: not a digit
- Example: \d\d matches 72; does not match ab
-
\w, \W
- \w: word char [A-Za-z0-9_]
- \W: non-word char
- Example: \w+ matches Ab_2; does not match %*$
-
\s, \S
- \s: whitespace (space, tab, newline, vertical tab, form feed)
- \S: non-whitespace
- Example: \s+ matches standard blanks in text
-
. (dot)
- Any single character except newline, unless match parameter 'n' is used
-
[ ... ] character class
- Example: [abc] matches one of a, b, or c
- Ranges: [a-z], [A-Z], [0-9]
-
[^ ... ] negated class
- Example: [^abc] matches any single char other then a, b, or c
-
POSIX character classes (preferred in Oracle in side []):
- [:digit:], [:alpha:], [:alnum:], [:space:], [:lower:], [:upper:], [:punct:], [:xdigit:], [:blank:], [:cntrl:], [:print:], [:graph:]
2. Quantifiers
- ? 0 or 1 time
-
- 0 or more times
-
- 1 or more times
- {n} exactly n times
- {n,} at least n times
- {n,m} between n and m times (inclusive)
Examples:
- x{2} matches "xx"
- x{2,} matches "xx", "xxx", ...
- x{2,4} matches "xx", "xxx", "xxxx"
3. Anchors and Boundaries
- ^ start of string (or start of line when 'm' is set)
- $ end of string (or end of line when 'm' is set)
- \A start of string (ignores 'm')
- \z end of string (ignores 'm')
- \Z end of string or just before final newline
- \b word boundary (between \w and \W)
- \B not a word boundary
Anchors apply to the position in the input, not to characters themselves.
4. Grouping, Backreferences, and Lookarounds
-
(...) capturing group
- Captures the submatch for later reference
- Example: ([A-Z]{3}\d) over "ABC1DEF2XY" matches ABC1 and DEF2; \1 refers to the captured part in replacements
-
(?:...) non-capturing group
- Groups without capturing; useful for alternation and precedence
-
(?...) named capturing group
- Oracle supports named subexpressions in patterns and accessing them via subexpr index; use numeric backreferences (\1, \2, …) in SQL replacements
-
Lookahead
- (?=...) positive lookahead (asserts that the following text matches ...)
- (?!...) negative lookahead
- Example: \S+(?=.NET) over "Java, C#.NET, VB.NET" matches C# and VB
-
Lookbehind
- (?<=...) positive lookbehind (asserts that the preceding text matches ...)
- (?<!...) negative lookbehind
- Example: (?<=New\s)[A-Z][a-z]+ over "New Mexico; New England" matches Mexico and England
Notes:
- Atomic groups (?>...) and conditional subpatterns (?(...)) are not supported by Oracle’s regex engine.
5. Alternation and Character Sets
-
Alternation |
- A|B matches A or B, often combined with grouping: (ing|ed|s)
-
Character sets [ ... ] and ranges [a-z]
- Inside sets, most metacharacters lose special meaning except ^ (first position), - (range), and POSIX classes
6. Escaping and Literals
Escape metacharacters with a backslash when used literallly:
- ., *, +, ?, |, (, ), {, }, [, ], ^, $
Oracle focuses on POSIX classes for categories; use [:space:], [:digit:], etc., instead of control escapes like \t, \n for portability.
7. Match Parameters (REGEXP_* match_parameter)
- 'c' case-sensitive (default)
- 'i' case-insensitive
- 'm' treat input as multi-line (activates ^ and $ per line)
- 'n' dot (.) matches newline
- 'x' ignore unescaped whitespace and allow comments with # inside pattern
Combine flags as a string, e.g., 'im'.
8. Oracle SQL Regex Functions
-
REGEXP_LIKE(source, pattern[, match_parameter])
- Predicate version of LIKE
-
REGEXP_INSTR(source, pattern[, position[, occurrence[, return_option[, match_parameter[, subexpr]]]]])
- Returns character position of a match
- position: 1-based start (default 1)
- occurrence: which match to locate (default 1)
- return_option: 0 = start position, 1 = first char after match
- subexpr: 0 = entire match, >0 = specific capturing group
-
REGEXP_SUBSTR(source, pattern[, position[, occurrence[, match_parameter[, subexpr]]]])
- Returns the matching substring (or captured group when subexpr > 0)
-
REGEXP_REPLACE(source, pattern[, replace_string[, position[, occurrence[, match_parameter]]]])
- Replaces occurrences matching pattern
- Use \1, \2, ... in replace_string for backreferences
-
REGEXP_COUNT(source, pattern[, position[, match_parameter]])
- Returns the number of matches
9. Operator Precedence (high to low)
- Parentheses, lookarounds, character classes [ ]
- Quantifiers *, +, ?, {n}, {n,}, {n,m}
- Anchors and assertions (^, $, \b, \B)
- Alternation |
10. Sample Objects and Data
CREATE TABLE rx_test (val VARCHAR2(60));
INSERT INTO rx_test VALUES ('112233445566778899');
INSERT INTO rx_test VALUES ('22113344 5566778899');
INSERT INTO rx_test VALUES ('33112244 5566778899');
INSERT INTO rx_test VALUES ('44112233 5566 778899');
INSERT INTO rx_test VALUES ('5511 2233 4466778899');
INSERT INTO rx_test VALUES ('661122334455778899');
INSERT INTO rx_test VALUES ('771122334455668899');
INSERT INTO rx_test VALUES ('881122334455667799');
INSERT INTO rx_test VALUES ('991122334455667788');
INSERT INTO rx_test VALUES ('aabbccddee');
INSERT INTO rx_test VALUES ('bbaaaccddee');
INSERT INTO rx_test VALUES ('ccabbddee');
INSERT INTO rx_test VALUES ('ddaabbccee');
INSERT INTO rx_test VALUES ('eeaabbccdd');
INSERT INTO rx_test VALUES ('ab123');
INSERT INTO rx_test VALUES ('123xy');
INSERT INTO rx_test VALUES ('007ab');
INSERT INTO rx_test VALUES ('abcxy');
INSERT INTO rx_test VALUES ('The final test is is is how to find duplicate words.');
COMMIT;
11. Query Examples
11.1 REGEXP_LIKE
-- strings starting with 1 to 3 'a'
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, '^a{1,3}');
-- 'a' occurs 1 to 3 times anywhere
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, 'a{1,3}');
-- starts with 'a' and ends with 'e'
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, '^a.*e$');
-- starts with lowercase or contains a digit
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, '^[[:lower:]]|[[:digit:]]');
-- starts with a lowercase letter
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, '^[[:lower:]]');
-- contains a non-digit
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, '[^[:digit:]]');
-- begins with a non-digit
SELECT val FROM rx_test WHERE REGEXP_LIKE(val, '^[^[:digit:]]');
11.2 REGEXP_INSTR
-- position of a digit at end of string
SELECT REGEXP_INSTR(val, '[[:digit:]]$') AS pos FROM rx_test;
-- position where trailing run of digits begins
SELECT REGEXP_INSTR(val, '[[:digit:]]+$') AS pos FROM rx_test;
-- where does "$" followed by digits occur
SELECT REGEXP_INSTR('The price is $400.', '\$[[:digit:]]+') AS pos FROM dual;
-- first non-lowercase
SELECT REGEXP_INSTR('onetwothree', '[^[:lower:]]') AS pos FROM dual;
-- first non-comma (none: returns 0)
SELECT REGEXP_INSTR(',,,,,', '[^,]') AS pos FROM dual;
-- first run of non-commas (same as above with a quantifier)
SELECT REGEXP_INSTR(',,,,,', '[^,]+') AS pos FROM dual;
Subexpression index usage:
-- return position of submatch for capturing group 1
SELECT REGEXP_INSTR('cab', 'c(a)b', 1, 1, 0, NULL, 1) AS grp1_pos FROM dual;
11.3 REGEXP_SUBSTR
-- first lowercase run
SELECT REGEXP_SUBSTR(val, '[a-z]+') AS fragment FROM rx_test;
-- first digit run
SELECT REGEXP_SUBSTR(val, '[0-9]+') AS digits FROM rx_test;
-- from start through first 'b'
SELECT REGEXP_SUBSTR('aababcde', '^a.*?b') AS slice FROM dual;
11.4 REGEXP_REPLACE
-- collapse 2+ spaces to a single space
SELECT REGEXP_REPLACE('Joe Smith', '( ){2,}', ' ') AS normalized FROM dual;
-- reorder three whitespace-separated tokens: one two three -> three, two, one
SELECT REGEXP_REPLACE('aa bb cc', '([^ ]+) ([^ ]+) (.+)', '\3, \2, \1') AS reordered FROM dual;
Case-insensitive filtering:
CREATE TABLE rx_case (id CHAR(1), txt VARCHAR2(20));
INSERT INTO rx_case VALUES ('A', 'AAAAA');
INSERT INTO rx_case VALUES ('a', 'aaaaa');
INSERT INTO rx_case VALUES ('B', 'BBBBB');
INSERT INTO rx_case VALUES ('b', 'bbbbb');
SELECT * FROM rx_case WHERE REGEXP_LIKE(id, 'b', 'i');
12. Function Library Overview and Reference
12.1 REGEXP_COUNT
Syntax: REGEXP_COUNT(source, pattern[, position[, match_parameter]])
-- count occurrences of 'e' (case-insensitive)
SELECT REGEXP_COUNT(testcol, 'e', 1, 'i') AS result FROM (
SELECT 'Stephen' AS testcol FROM dual UNION ALL
SELECT 'Steven' FROM dual
);
-- count occurrences of "2a" ignoring case
SELECT REGEXP_COUNT(testcol, '2a', 1, 'i') AS result
FROM (
SELECT '12a45' testcol FROM dual UNION ALL
SELECT '222aa' FROM dual
);
12.2 REGEXP_INSTR examples
-- case-insensitive: find 'o' followed by 3 letters, first occurrence; return start
SELECT REGEXP_INSTR('500 Oracle Pkwy, Redwood Shores, CA', '[o][[:alpha:]]{3}', 1, 1, 0, 'i') AS pos FROM dual;
-- same, return the position after the match
SELECT REGEXP_INSTR('500 Oracle Pkwy, Redwood Shores, CA', '[o][[:alpha:]]{3}', 1, 1, 1, 'i') AS pos FROM dual;
-- second occurrence, start position
SELECT REGEXP_INSTR('500 Oracle Pkwy, Redwood Shores, CA', '[o][[:alpha:]]{3}', 1, 2, 0, 'i') AS pos FROM dual;
-- second occurrence, end position
SELECT REGEXP_INSTR('500 Oracle Pkwy, Redwood Shores, CA', '[o][[:alpha:]]{3}', 1, 2, 1, 'i') AS pos FROM dual;
-- locate try/trying/tried/tries
SELECT REGEXP_INSTR('We are trying to make the subject easier.', 'tr(y(ing)?|ied|ies)') AS pos FROM dual;
-- subexpression return: position of 'b' in 'ab'
SELECT REGEXP_INSTR('1ab2', 'a(b)', 1, 1, 0, NULL, 1) AS pos FROM dual;
12.3 REGEXP_LIKE examples
-- alphanumeric presence and lengths
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:alnum:]]');
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:alnum:]]{3}');
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:alnum:]]{5}');
-- alphabetic only
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '^[[:alpha:]]+$');
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '^[[:alpha:]]{3}$');
-- control characters
INSERT INTO test VALUES ('xy' || CHR(10) || 'z');
COMMIT;
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:cntrl:]]');
-- digits
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:digit:]]');
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '^[[:digit:]]{5}$');
-- lowercase, uppercase
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:lower:]]{2,}');
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:upper:]]{2,}');
-- printable length
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '^[[:print:]]{6}$');
-- punctuation
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:punct:]]');
-- whitespace occurrences
SELECT * FROM test WHERE REGEXP_LIKE(testcol, '[[:space:]]{2,}');
-- starts with 'ab*' (a followed by one or more b's)
SELECT testcol FROM test WHERE REGEXP_LIKE(testcol, '^ab+');
-- 'a' is the third character
SELECT testcol FROM test WHERE REGEXP_LIKE(testcol, '^..a');
-- two consecutive 'a' or 'z' (case-insensitive)
SELECT testcol FROM test WHERE REGEXP_LIKE(testcol, '([az])\1', 'i');
-- begins with 'Ste', ends with 'en', contains 'v' or 'ph'
SELECT testcol FROM test WHERE REGEXP_LIKE(testcol, '^Ste(v|ph)en$');
-- use in a CHECK constraint
CREATE TABLE mytest (
c1 VARCHAR2(20),
CONSTRAINT chk_letters_only CHECK (REGEXP_LIKE(c1, '^[[:alpha:]]+$'))
);
-- SSN pattern
CREATE TABLE ssn_test (ssn_col VARCHAR2(20));
INSERT INTO ssn_test VALUES ('111-22-3333');
INSERT INTO ssn_test VALUES ('111=22-3333');
INSERT INTO ssn_test VALUES ('111-A2-3333');
INSERT INTO ssn_test VALUES ('111-22-33339');
INSERT INTO ssn_test VALUES ('111-2-23333');
INSERT INTO ssn_test VALUES ('987-65-4321');
COMMIT;
SELECT ssn_col FROM ssn_test
WHERE REGEXP_LIKE(ssn_col, '^[0-9]{3}-[0-9]{2}-[0-9]{4}$');
12.4 REGEXP_REPLACE examples
-- format XXX.XXX.XXXX to (XXX) XXX-XXXX
WITH t AS (
SELECT '111.222.3333' AS testcol FROM dual UNION ALL
SELECT '222.333.4444' FROM dual
)
SELECT testcol,
REGEXP_REPLACE(testcol,
'([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})',
'(\1) \2-\3') AS reformatted
FROM t;
-- space after each character
SELECT testcol,
REGEXP_REPLACE(testcol, '(.)', '\1 ' ) AS spaced
FROM (
SELECT 'Steven' AS testcol FROM dual UNION ALL
SELECT 'Stephen' FROM dual
)
WHERE testcol LIKE 'S%';
-- collapse multiple spaces into one
SELECT REGEXP_REPLACE('500 Oracle Parkway, Redwood Shores, CA', '( ){2,}', ' ') AS result
FROM dual;
-- insert a space between lower->upper transition
SELECT REGEXP_REPLACE('GeorgeMcGovern', '([[:lower:]])([[:upper:]])', '\1 \2') AS fixed
FROM dual;
-- replace literal period
SELECT REGEXP_REPLACE('We are trying to make the subject easier.', '\.', ' for you.') AS demo
FROM dual;
-- trim trailing non-word chars and normalize
CREATE TABLE t_clean (testcol VARCHAR2(10));
INSERT INTO t_clean VALUES ('1');
INSERT INTO t_clean VALUES ('2 ');
INSERT INTO t_clean VALUES ('3 new ');
SELECT LENGTH(testcol) AS len,
testcol AS origval,
REGEXP_REPLACE(testcol, '\W+$', ' ') AS newval,
LENGTH(REGEXP_REPLACE(testcol, '\W+$', ' ')) AS newlen
FROM t_clean;
12.5 REGEXP_SUBSTR examples
-- comma, then non-commas, then comma
SELECT REGEXP_SUBSTR('500 Oracle Parkway, Redwood Shores, CA', ',[^,]+,') AS result
FROM dual;
-- extract URL-like token
SELECT REGEXP_SUBSTR(
'Go to http://www.oracle.com/products and click on database',
'http://([[:alnum:]]+\.?){3,4}/?'
) AS result
FROM dual;
-- extract try/trying/tried/tries
SELECT REGEXP_SUBSTR('We are trying to make the subject easier.', 'tr(y(ing)?|ied|ies)') AS token
FROM dual;
-- 3rd field using ':' as delimiter
SELECT REGEXP_SUBSTR('system/pwd@orabase:1521:sidval', '[^:]+', 1, 3) AS third
FROM dual;
-- vertical bar-delimited field
CREATE TABLE rx_split (txt VARCHAR2(50));
INSERT INTO rx_split VALUES ('One|Two|Three|Four|Five');
SELECT REGEXP_SUBSTR(txt, '[^|]+', 1, 3) AS third FROM rx_split;
-- equivalence class: base letter and its accents (within [ ])
SELECT REGEXP_SUBSTR('iSelfSchooling NOT ISelfSchooling', '[[=i=]]SelfSchooling') AS eqv
FROM dual;
Parsing demo with REGEXP_SUBSTR:
SET SERVEROUTPUT ON;
DECLARE
first_part VARCHAR2(10);
second_range VARCHAR2(10);
s VARCHAR2(40) := '1:3,4:6,8:10,3:4,7:6,11:12';
BEGIN
first_part := REGEXP_SUBSTR(s, '[^:]+', 1, 1);
second_range := REGEXP_SUBSTR(s, '[^,]+', 3, 1);
DBMS_OUTPUT.PUT_LINE(first_part || ' ' || second_range);
END;
/
13. Backreferences in Oracle
- Use \1, \2, ... in the pattern or replacement to refer to capturing groups.
- Example: reorder "alpha beta gamma" to "gamma, beta, alpha"
SELECT REGEXP_REPLACE('alpha beta gamma', '(\S+) (\S+) (\S+)', '\3, \2, \1') AS out
FROM dual;
14. Special Constructs
- Start/end anchors: ^, $
- Grouping: (...), non-capturing (?:...)
- Alternation: |
- Ranges and sets: [ ], [^ ]
- Equivalence classes inside sets: [[=a=]] (matches base lettter "a" and its accented variants)
Oracle implements POSIX bracket expressions, named and numbered capturing, lookahead/lookbehind, and backreferences. Atomic groups and conditional subptaterns are not implemented.