Automatically add <kbd>-tags with a Single Regex

Introduction

Until about a year ago, I had the same approach to regular expressions as most developers: Doing trial and error until the damn thing was working.

Then came KeyCombiner and with it some very specific challenges around parsing text that contains keyboard shortcuts. One of these challenges is writing blog posts about KeyCombiner:

These posts contain a lot of keyboard shortcuts that should be highlighted via <kbd> tags. By now, these tags are supported by most static site generators and blogging platforms, such as DEV.to. However, typing them in manually is extremely tedious. Initially, I felt quite smart when I added a user snippet in VSCode:

"kbd": {
    "scope": "markdown",
    "prefix": "kbd",
    "body": [
        "<kbd>$0</kbd>"
    ],
    "description": "Insert HTML kbd tags"
},

This lets me type kbd, press ctrl+space and hit enter to put my cursor in-between an opening and a closing <kbd> tag. However, it does not help with copy-pasted text, which is an essential use case for me. I write the first draft of my posts in Notion, as described in my post about personal knowledge management. Even if this were not an issue, it’s still quite a lot of keys to hit for just adding two tags. Another solution would be to write a script in some interpreted language and do all kinds of fancy things to this text. The problem is, this script will likely end up like many single-purpose scripts do: Abandoned and buggy.

So, I needed a better, more timeless solution: Regular Expressions.
Fear not, I won’t try to teach you general regex syntax. Instead, I will show how to use it for this particular use case. I will explain in detail how and why this works, but you can also just copy-paste the final regex and apply it for yourself. One of the best tools for putting the regex to work is Visual Studio Code, but any editor that supports regex-based replace operations will do.

How-To

Suppose we have the following text:

Modifier keys, such as ctrl, shift, alt, and cmd do nothing by themselves but can be used in combination with other keys. To paste without the original formatting, use ctrl+shift+v. Debugging is controlled via f5, f6, f10, and f11.

What we want to have is this:

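Modifier keys, such as <kbd>ctrl</kbd>, <kbd>shift</kbd>, <kbd>alt</kbd>, and <kbd>cmd</kbd> do nothing by themselves but can be used in combination with other keys. To paste without the original formatting, use <kbd>ctrl</kbd>+<kbd>shift</kbd>+<kbd>v</kbd>. Debugging is controlled via <kbd>f5</kbd>, <kbd>f6</kbd>, <kbd>f10</kbd>, and <kbd>f11</kbd>.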

The next sections will describe how we get there.

Modifiers and F-keys

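We can already go pretty far with this regex that uses or operators (|) to match a predefined set of strings. Alternatives are tried from left to right, so f10, f11, and f12 are listed before f1; otherwise, f10 would only be matched as f1:

(f10|f11|f12|f1|f2|f3|f4|f5|f6|f7|f8|f9|backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta)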

Putting the expression in brackets () means that it is a capture group. Capture groups are an essential concept when working with regular expressions. One of their greatest features is that we get to reference them by number in the replacement string:

<kbd>$1</kbd>

This means we will replace each match of the first capture group with itself ($1), surrounded by the desired tags.

The above text will become:

Modifier keys, such as <kbd>ctrl</kbd>, <kbd>shift</kbd>, <kbd>alt</kbd>, and <kbd>cmd</kbd> do nothing by themselves but can be used in combination with other keys. To paste without the original formatting, use <kbd>ctrl</kbd>+<kbd>shift</kbd>+v. Debugging is controlled via <kbd>f5</kbd>, <kbd>f6</kbd>, <kbd>f10</kbd>, and <kbd>f11</kbd>.
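If you would rather try this step outside of an editor, here is a minimal Python sketch of the same replacement. It is only an illustration: the name tag_keys is made up for this example, Python writes the back-reference as \1 instead of $1, and the sketch lists f10 to f12 before f1 because alternatives are tried from left to right.

```python
import re

# Alternation of key names; f10-f12 come first so that "f10" is not cut short to "f1".
KEYS = re.compile(
    r"(f10|f11|f12|f1|f2|f3|f4|f5|f6|f7|f8|f9|backspace|tab|enter|shift|ctrl"
    r"|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta)"
)


def tag_keys(text: str) -> str:
    """Wrap every match of the first capture group in <kbd> tags."""
    # \1 is Python's spelling of the $1 reference used in VS Code.
    return KEYS.sub(r"<kbd>\1</kbd>", text)


print(tag_keys("use ctrl+shift+v or f10"))
# use <kbd>ctrl</kbd>+<kbd>shift</kbd>+v or <kbd>f10</kbd>
```

As in the text above, the v is still untagged at this point.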

This is a good start, but we are not quite there yet. Non-modifier keys that are part of key combinations do not yet have the proper tags. We will deal with them in the next section. But before that, we can shorten our expression a bit by replacing the f-keys with a single sub-expression: f([2-9]|1[0-2]?). This matches an f followed by either a digit from 2 to 9, or a 1 that is optionally followed by 0, 1, or 2. The second branch covers f1 as well as f10 to f12. It is equivalent to typing out the f-keys, but a little more fun. So we end up with this expression:

(f([2-9]|1[0-2]?)|backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta)
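A quick way to convince yourself that the short form covers exactly f1 to f12 is to test it against a range of candidates. The following Python sketch does that; is_f_key is a hypothetical helper name, and fullmatch is used so that a candidate only counts if the whole string matches.

```python
import re

# The shortened f-key sub-expression from the text.
F_KEY = re.compile(r"f([2-9]|1[0-2]?)")


def is_f_key(name: str) -> bool:
    """Return True if the whole string is matched by the f-key sub-expression."""
    return F_KEY.fullmatch(name) is not None


# Candidates f0 .. f29; only f1 .. f12 should pass.
candidates = [f"f{n}" for n in range(30)]
accepted = [name for name in candidates if is_f_key(name)]
print(accepted)
```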

Regular keys in combinations

Matching regular keys is a bit more tricky. We cannot just match every v character in our text. One approach is to match any single alphanumeric character that is preceded by a + sign. To do this, we need to use something called Lookarounds or, more specifically, a Lookbehind: (?<=\+). The nice thing about lookarounds is that they do not extend the matched character sequence. Instead, they specify what must (or must not) come before or after the matched part, which is very useful when applying replacement operations. We add it as the last alternative so that named keys such as shift are tried first; otherwise, only the s in +shift would be matched. This results in the following:

(f([2-9]|1[0-2]?)|backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta|(?<=\+)[\w\d]{1})

Using this expression, we end up with the desired text already:

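Modifier keys, such as <kbd>ctrl</kbd>, <kbd>shift</kbd>, <kbd>alt</kbd>, and <kbd>cmd</kbd> do nothing by themselves but can be used in combination with other keys. To paste without the original formatting, use <kbd>ctrl</kbd>+<kbd>shift</kbd>+<kbd>v</kbd>. Debugging is controlled via <kbd>f5</kbd>, <kbd>f6</kbd>, <kbd>f10</kbd>, and <kbd>f11</kbd>.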
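To see what the lookbehind does in isolation, here is a small Python sketch. The names COMBO_KEY, WITH_COMBOS, and tag_keys are made up for this example. Note that the sketch places the single-character alternative last, so that named keys such as shift are tried before a lone character.

```python
import re

# A single word character that is directly preceded by "+".
# The "+" itself is only checked, it is not part of the match.
COMBO_KEY = re.compile(r"(?<=\+)[\w\d]{1}")

match = COMBO_KEY.search("ctrl+v")
print(match.group(), match.start())  # v 5

# Named keys first, the lookbehind alternative last.
WITH_COMBOS = re.compile(
    r"(f([2-9]|1[0-2]?)|backspace|tab|enter|shift|ctrl|cmd|alt|capslock"
    r"|pageup|pagedown|ins|del|option|meta|(?<=\+)[\w\d]{1})"
)


def tag_keys(text: str) -> str:
    """Wrap named keys and single keys after "+" in <kbd> tags."""
    return WITH_COMBOS.sub(r"<kbd>\1</kbd>", text)


print(tag_keys("use ctrl+shift+v"))
# use <kbd>ctrl</kbd>+<kbd>shift</kbd>+<kbd>v</kbd>
```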

However, we shouldn’t celebrate too early. There are still some issues and special cases to take care of:

  1. What happens if a modifier appears within another word, such as ins in insert?
  2. What happens if we apply this to text that already has some <kbd> tags?

The answer to both questions is clear: Terrible things will happen. We will have to go a bit deeper.

Avoiding False Positives

Now that we already know the Lookaround concept, this is relatively easy. We will ensure that the expression only matches occurrences that are directly preceded and directly followed by one of these characters: [\+ .,\n].

(?<=[\+ .,\n])(backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta|f([2-9]|1[0-2]?)|(?<=\+)[\w\d]{1})(?=[\+ .,\n])
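The difference is easiest to see side by side. The Python sketch below runs the unguarded and the guarded expression over the same sentence; NAIVE, GUARDED, and the sample sentence are made up for this illustration.

```python
import re

# The alternatives shared by both expressions.
KEY_NAMES = (
    r"backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown"
    r"|ins|del|option|meta|f([2-9]|1[0-2]?)|(?<=\+)[\w\d]{1}"
)

# Without and with the surrounding lookarounds.
NAIVE = re.compile("(" + KEY_NAMES + ")")
GUARDED = re.compile(r"(?<=[\+ .,\n])(" + KEY_NAMES + r")(?=[\+ .,\n])")

text = "Press ins to insert, or alt+tab."

naive_result = NAIVE.sub(r"<kbd>\1</kbd>", text)
guarded_result = GUARDED.sub(r"<kbd>\1</kbd>", text)

print(naive_result)
# Press <kbd>ins</kbd> to <kbd>ins</kbd>ert, or <kbd>alt</kbd>+<kbd>tab</kbd>.
print(guarded_result)
# Press <kbd>ins</kbd> to insert, or <kbd>alt</kbd>+<kbd>tab</kbd>.
```

One side effect worth knowing: because both lookarounds require an actual character, a key at the very start or very end of the text (for example, a text that consists only of ctrl+c) is left untouched.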

Achieving Idempotence

An idempotent operation is one that can be applied more than once without changing the outcome. To achieve idempotence, we will make use of another advantage of our beloved lookarounds: They can be stacked one after another. All we have to do is make sure that our matches are not already preceded by a <kbd> tag, using a negative Lookbehind:

(?<!<kbd>)(?<=[\+ .,\n])(backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta|f([2-9]|1[0-2]?)|(?<=\+)[\w\d]{1})(?=[\+ .,\n])

We could add a negative Lookahead for the closing tag, too, but it is not needed.
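Idempotence is easy to check by simply running the replacement twice. Here is a minimal Python sketch of such a check; add_kbd_tags and the sample sentence are made up for this example.

```python
import re

# The expression from above, including the negative lookbehind for <kbd>.
KBD = re.compile(
    r"(?<!<kbd>)(?<=[\+ .,\n])"
    r"(backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown"
    r"|ins|del|option|meta|f([2-9]|1[0-2]?)|(?<=\+)[\w\d]{1})"
    r"(?=[\+ .,\n])"
)


def add_kbd_tags(text: str) -> str:
    """Surround all recognized keys with <kbd> tags."""
    return KBD.sub(r"<kbd>\1</kbd>", text)


text = "To paste, use ctrl+shift+v. Debug via f5, f10, and f11."
once = add_kbd_tags(text)
twice = add_kbd_tags(once)

print(once)
print(once == twice)  # True: the second run changes nothing
```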

Final regex

That’s it. We have a regular expression that we can use in VSCode and other editors to automatically surround all occurrences of keys with <kbd> tags. To use it, search for this expression with the Use Regular Expression (alt+r) toggle enabled:

(?<!<kbd>)(?<=[\+ .,\n])(backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown|ins|del|option|meta|f([2-9]|1[0-2]?)|(?<=\+)[\w\d]{1})(?=[\+ .,\n])

Then, replace all occurrences with this string:

<kbd>$1</kbd>

If you want to go one step further, you could even apply this in a build pipeline, e.g., after pushing to GitHub Pages or GitLab Pages. Personally, I like to do it myself in VSCode. These are very satisfying 30 seconds of work before publishing a new blog post.
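For the pipeline route, a rough sketch in Python could look like the following. Everything here is an assumption made for illustration: the function names tag_file and tag_tree, the idea that posts are *.md files below one directory, and the choice to rewrite files in place. It also does not skip code blocks, so keys inside code samples would be tagged as well.

```python
import re
from pathlib import Path

# The final expression from above.
KBD = re.compile(
    r"(?<!<kbd>)(?<=[\+ .,\n])"
    r"(backspace|tab|enter|shift|ctrl|cmd|alt|capslock|pageup|pagedown"
    r"|ins|del|option|meta|f([2-9]|1[0-2]?)|(?<=\+)[\w\d]{1})"
    r"(?=[\+ .,\n])"
)


def tag_file(path: Path) -> bool:
    """Tag keys in one file in place; return True if the file was changed."""
    original = path.read_text(encoding="utf-8")
    tagged = KBD.sub(r"<kbd>\1</kbd>", original)
    if tagged == original:
        return False
    path.write_text(tagged, encoding="utf-8")
    return True


def tag_tree(root: Path) -> list[Path]:
    """Tag all Markdown files below root; return the files that changed."""
    return [path for path in sorted(root.rglob("*.md")) if tag_file(path)]
```

Because the replacement is idempotent, running tag_tree on every build is harmless: files that are already tagged are left alone.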

Conclusion

Hopefully, I could help a bit to improve the bad reputation of regular expressions. But I will let you in on a little secret. Just like everyone else, I have a hard time remembering the syntax. The good thing is, I have a superpower to overcome this: KeyCombiner’s instant shortcut lookup. Whenever I press super+c, I have instant access to the shortcuts of the active application, all other shortcuts that I like to use, and even regex syntax:

Instant shortcut lookup of KeyCombiner’s free to use desktop app.

