MySQL RegEx function
Often, while querieng MySQL, we use LIKE clause, basically to match some pattern with any column data.
For example, consider following SQL statement with LIKE clause:
1 |
SELECT * FROM `store` WHERE NAME LIKE "%Bike%"; |
which will give all the records from “store” table where there will be “Bike” anywhere in the ‘name’ field.
Similarly, MySQL provides another function called REGEX which is very handy and easy to use and enable us to match any specific pattern with the data in table.
REGEX is also called RLIKE. RLIKE is another synonym for REGEX.
So the above query can be written in REGEX as follows:
1 |
SELECT * FROM `store` WHERE NAME REGEXP 'Bike'; |
above will give same result as previous query.
To know more about REGEX function, one needs to have familiarity with Regular Expressions.
Following table can help to understand highlevel pattern which we can use in our REGEX query.
Pattern | Meaning |
---|---|
^ | Beginning of string |
$ | End of string |
. | Any single character |
[…] | Any character listed between the square brackets |
[^…] | Any character not listed between the square brackets |
x1|x2|x3 | matches any of the patterns x1 OR x2 OR x3 |
* | Zero or more instances of preceding element |
+ | One or more instances of preceding element |
{n} | n instances of preceding element |
{m,n} | m through n instances of preceding element |
Some examples:
1 2 3 4 5 6 |
SELECT * FROM `store` WHERE NAME LIKE "%Bike"; # Can be written as SELECT * FROM `store` WHERE NAME REGEXP "Bike$"; SELECT * FROM `store` WHERE NAME LIKE "Bike%"; # Can be written as SELECT * FROM `store` WHERE NAME REGEXP "^Bike"; |
REGEX over LIKE:
So, you might be thinking why an another function when we have LIKE doing same as REGEX; the answer is REGEX provides more flexiblity to match any pattern as compared to LIKE.
Consider following example:
If I want to find all the records having either bike or shop or mart in its ‘name’ field, then with LIKE and REGEX it will be as follows:
1 2 3 4 5 6 7 |
# with LIKE clause SELECT * FROM `store` WHERE NAME LIKE "%Bike%" OR NAME LIKE "%shop%" OR NAME LIKE "%mart%"; # with REGEX, can be written as simply SELECT * FROM `store` WHERE NAME REGEXP "Bike|shop|mart"; |
so if you can see we don’t need to write multiple LIKE clause, simply a pipe ‘|’ character will do everything for us in case of REGEX.
This is a single example, now consider little bit more complicated.
Suppose you want to find all the names of stores where its name contains either a vowel character or which starts with combination of “Ra” character,
think if you have to do this using LIKE Clause seems difficult, but using REGEX this can be written as follows:
1 |
SELECT * FROM `store` WHERE NAME REGEXP "^[aeiou]|^re"; |
Negation:
The way we use NOT LIKE clause for negation, similar fashion we can use NOT REGEX in case of REGEX, as follows:
1 2 3 4 |
SELECT * FROM `store` WHERE NAME NOT LIKE "%Bike%"; # Can be written as SELECT * FROM `store` WHERE NAME NOT REGEXP 'bike'; |
Drawbacks:
We have seen REGEX is very powerful and handy, but it has some drawbacks:
1: REGEX operator work in byte-wise fashion, so they are not multibyte safe and may produce unexpected results with multibyte character sets.
2: REGEX is slow, as it compares character by character (by their byte values, technically), with the given pattern it is damn slow, if you
use REGEX in a column having large text, then your query may take longer time to execute. To over come this issue, you may use a workaround, which is
use of LIKE clause along with REGEX function; for example, suppose you have a giant table called “blog_text”, having blog posts about various technologies as well as non technical stuffs. And you want to retrive all the blogs where “description” will have “technology” keyword along with some other keywords like “MySQL” or “NoSQL” or “Hadoop”, then to optimize the query a little you can do as follows:
1 |
select * from blog_text where description like "%technology%" and description REGEX "MySQL|NoSQL|Hadoop"; |
There is lot more you can do using this powerful function called REGEX, follow some reference links to know more:
http://dev.mysql.com/doc/refman/5.7/en/regexp.html