Regular Expressions: The Scary Tool That Every Developer Should Know

Regular Expressions are an amazing tool for every developer to have as part of their toolbox, but at times, they can feel cryptic, hard to understand and even like the tools of more advanced developers due to the knowledge required to use them.

While part of that is true, I believe Regular Expressions are a skill every developer can pick up. And while a cheat sheet is not going to solve all your problems in that regard, it will help you get started!

In this quick cheat sheet I’m going to cover some of my favorite features of regular expressions and hopefully show you how useful they are.

By the way, before we get started, in case you don’t know about it, there is a great site where you can test regular expressions without having to write any code: www.regex101.com. The site lets you test your expressions against different types of input and highlights the results, showing you exactly which parts match and which parts don’t.

Start and End of a string

Starting with the basics: you can use different indicators as part of your expressions to make sure that whatever you match is part of either the start or the end of the string.

In other words, if you’re looking for the word this inside a string such as “this is it, this is what you were looking for, this is it”, an expression like this would match all appearances of the word:
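Here’s a minimal sketch of what that could look like in JavaScript. The variable names are my own choice for illustration, and I’m assuming the g (global) flag so that match returns every occurrence instead of stopping at the first one:

```javascript
// The sample sentence from the text above.
const sentence = "this is it, this is what you were looking for, this is it";

// The g (global) flag asks for every occurrence, not just the first one.
const everyThis = sentence.match(/this/g);

console.log(everyThis);        // [ 'this', 'this', 'this' ]
console.log(everyThis.length); // 3
```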

This code would match all three instances of this. However, if you only wanted to match the first one, because it’s at the start of the string, you can use the ^ character. In the same way, if you wanted to match the last it of the string, you could use the $ character to indicate you’re looking for a match at the end of the string. Let me show you:
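The following is a small sketch of both anchors, assuming the same sample sentence as before. The variable names are only illustrative:

```javascript
const text = "this is it, this is what you were looking for, this is it";

// ^ anchors the match to the start of the string.
const firstThis = text.match(/^this/);
console.log(firstThis.index); // 0

// $ anchors the match to the end of the string.
const lastIt = text.match(/it$/);
console.log(lastIt.index === text.length - 2); // true

// An anchored expression matches at most once here, even with the g flag.
const anchoredCount = text.match(/^this/g).length;
console.log(anchoredCount); // 1
```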

Notice how I’m using the ^ and $ in the right places. They can’t just be placed anywhere, but if you place ^ at the start of the expression and $ at the end, the engine will understand what you’re trying to do and correctly look for your string in the right spot.

Quantifiers

Quantifiers are a powerful tool: they let you specify the number of times a section of your expression can match. This gives you the ability to mark sections of your expression as optional, or to require parts to be repeated several times (even with no upper limit).

For example, if you wanted to match an ISO formatted date string, you could use something like this:

[0-9]{4}-[0-9]{2}-[0-9]{2}
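As a quick check, here’s a sketch of that expression being tested from JavaScript. The sample strings are made up for illustration, and the last one is there to show that the expression only validates the shape of the string, not whether the date actually exists:

```javascript
const isoDate = /[0-9]{4}-[0-9]{2}-[0-9]{2}/;

console.log(isoDate.test("2021-03-15")); // true
console.log(isoDate.test("15/03/2021")); // false
console.log(isoDate.test("21-3-15"));    // false

// The expression only checks the shape, not the calendar.
console.log(isoDate.test("2021-99-99")); // true
```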

The quantifiers between {} tell the regexp engine how many digits to match in each case. You can also be less specific, like this:

{x,}  matches at least x times (could be more)
{x,y} matches between x and y times
*     matches zero or more times (so the section is optional, but can also repeat)
+     matches 1 or more times, the same as doing {1,}
?     matches zero or one time, the same as doing {0,1}
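To see those quantifiers in action, here’s a small sketch. The sample expressions and strings are my own, picked only to illustrate each case, and I’m anchoring them with ^ and $ so the whole string has to fit the pattern:

```javascript
// {x,}: at least x times
const atLeastTwo = /^a{2,}$/;
console.log(atLeastTwo.test("a"));    // false
console.log(atLeastTwo.test("aaaa")); // true

// {x,y}: between x and y times
const twoToFour = /^[0-9]{2,4}$/;
console.log(twoToFour.test("123"));   // true
console.log(twoToFour.test("12345")); // false

// *: zero or more times
const zeroOrMore = /^go*gle$/;
console.log(zeroOrMore.test("ggle"));    // true
console.log(zeroOrMore.test("gooogle")); // true

// +: one or more times
const oneOrMore = /^go+gle$/;
console.log(oneOrMore.test("ggle"));   // false
console.log(oneOrMore.test("google")); // true
```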

The OR operator

Another very interesting bit of logic you can add to your regular expressions in order to make them even more flexible is the logical OR operator.

With it, you can make it so sections of your expressions can match one of several alternatives, for example:
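Here’s a minimal sketch of such an expression, assuming we want to validate hex colors written with either three or six digits. The expression itself and the variable names are my assumption of what a check like this could look like; the sample inputs are the ones shown in the output further down:

```javascript
// Either "#" plus three hex digits OR "#" plus six hex digits.
// The i flag makes the match case-insensitive.
const hexColor = /^#[0-9a-f]{3}$|^#[0-9a-f]{6}$/i;

const samples = ["#fff", "#FEFEFE", "#999ccc", "fefefe", "#i0i0i0"];

const lines = samples.map((color) => {
    // match returns an array or null; !! turns that into true or false.
    return `Checking ${color} = ${!!color.match(hexColor)}`;
});

lines.forEach((line) => console.log(line));
```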

Notice the | at the middle of the expression. We’re essentially writing a single RegExp that will accommodate both versions of valid hex colors. Quickly, the output from that code is:

Checking #fff = true
Checking #FEFEFE = true
Checking #999ccc = true
Checking fefefe = false
Checking #i0i0i0 = false

As an added bonus, the match method actually returns an array of matches, or null if none are found. Here, however, I’m turning that result into a “true” or “false” string. The !! prefixing the call to match casts the result to a boolean, since JavaScript treats null as falsy and an array as truthy, and a simple cast to string then turns that boolean into the text “true” or “false”.

Groups

Groups are fantastic tools to help you deal with sections of a matched expression. If you’re just trying to check if there is a match like in the above example, then groups don’t really add a lot to the mix.

However, if you’re instead trying to replace a complex portion of a string, or even just grab a section of the match for your own logic outside of the expression, then groups are a great tool to have.
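As a rough sketch of both uses, here’s a made-up example that reorders the parts of an ISO date. The groups are the sections wrapped in ( ), the sample strings and variable names are only for illustration, and $1, $2 and $3 in the replacement string refer to the groups in order:

```javascript
// Three groups capture the year, month and day of an ISO date.
const isoParts = /([0-9]{4})-([0-9]{2})-([0-9]{2})/;

// In the replacement string, $1, $2 and $3 refer to those groups in order.
const reordered = "Released on 2021-03-15".replace(isoParts, "$3/$2/$1");
console.log(reordered); // Released on 15/03/2021

// The same groups are available on the match array for your own logic.
const parts = "2021-03-15".match(isoParts);
console.log(parts[1]); // 2021
console.log(parts[2]); // 03
console.log(parts[3]); // 15
```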

Groups are easy to spot inside a regexp because they’re defined by parentheses. For example, let’s say you want to capture the name of all HTML tags inside a string (i.e. getting “body” from <body> or even “html” from </html>). For this, you’ll need to add the < and > characters to the expression, because you want to make sure you’re only matching HTML tags, but you only want to capture the name inside:

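// Capture the tag name; the leading "/" and the trailing " /" are optional.
let nameExp = /<\/?([a-z0-9]+) *\/?>/gi

let htmlCode = "<html><body><H1>This is big!</h1></body></html>"

// With the g flag, each exec call resumes after the previous match
// and returns null once there are no more matches.
let match = null
while ((match = nameExp.exec(htmlCode)) !== null) {
    console.log(`Tag found: ${match[1]}`)
}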

The output of this code is:

Tag found: html
Tag found: body
Tag found: H1
Tag found: h1
Tag found: body
Tag found: html

Notice the group inside the expression, capturing only alphanumeric characters (because we’re specifying a range from a to z and 0 to 9). We then allow any number of spaces after the name, although they’re not captured because they sit outside the group. We also have an optional / character before the group and another one right before the closing > (notice how I added a ? after each one).
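To see those optional parts at work, here’s a small sketch that runs the same expression over a made-up snippet containing a closing tag and a self-closing tag. I’m using matchAll here as an alternative to the exec loop, purely to keep the example short; the variable names are illustrative:

```javascript
const tagExp = /<\/?([a-z0-9]+) *\/?>/gi;
const snippet = "<p>First line<br />second line</p>";

// matchAll yields one match array per tag; index 1 holds the captured name.
const names = [...snippet.matchAll(tagExp)].map((m) => m[1]);

console.log(names); // [ 'p', 'br', 'p' ]
```

The space and the / inside <br /> are matched by the expression, but since they sit outside the parentheses, only the name ends up in the result.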

Finally, thanks to the flag I used at the end of the expression, we can match both uppercase and lowercase tags and any combination of both (I used the i flag to ignore the case on the match).

The book

If you found any of these tips and tricks useful and would like to know more, I wrote a full booklet (around 80 pages, so calling it a book would be a stretch) that covers everything you’ll ever want to know about regular expressions in JavaScript.

Inside of it, you’ll find all the technical information you’ll want or need as a reference, but the key aspect of this book is that I’ve added a set of carefully designed exercises that’ll help you understand how to think in Regular Expressions. That means that you’ll make the cognitive switch required to apply this knowledge to your own problems. The book is designed to make you think and reason in terms of Regular Expressions, not just give you all the information and leave you alone to deal with it.

And because I’m self-publishing, I would love for you to check it out and let me know what you think! You can find the eBook version on the Kindle Store and the paperback right on Amazon. If you want to know more about the book, you can click here for more details.

Conclusion

Books aside, regular expressions are amazing tools to have available, and there are some use cases that look like they were designed for them. This is why I always try to push this practice to both expert and novice developers: they should all be able to use them and get the most out of them.

If you’re new to regular expressions, leave a comment down below and let me know if you found any of these tips useful or if you have any questions about them. I’d love to help out!

And if you’ve dealt with them before, leave a comment stating the most complex problem you solved with them. That’s always fun to read!

Have fun and keep on coding!
