Difference between revisions of "Regex"

From Mudlet
(initial creation)
 
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
Mudlet uses Perl '''Reg'''ular '''Ex'''pressions in a number of important ways.  Regex is used to match alias commands or to match incoming lines with triggers. In triggers we capture this data by surrounding the expressions in parentheses.  Regex uses special characters to match text which is probably best shown through some examples.
+
Mudlet uses Perl '''Reg'''ular '''Ex'''pressions in a number of important ways.  Regex is used to [[Manual:Introduction#Regex_Patterns_in_Aliases|match alias commands]] or to [[Manual:Trigger_Engine#Perl_Regex|match incoming lines with triggers]]. In triggers we can also capture this incoming data by surrounding the expressions in parentheses.  Regex uses special characters to match text which is probably best shown through some examples.
 +
 
 +
{{note}} Regex is very powerful and goes beyond the basics shown on this page.
 +
 
 +
{{note}} Use a regex tester to test your code: [http://regex101.com regex101.com] is highly recommended by Mudlet users.
  
  
Line 58: Line 62:
 
matches when a single word <code>quest</code> is on its own line.
 
matches when a single word <code>quest</code> is on its own line.
  
== Cheat Sheet ==
+
== Match Everything ==
 +
 
 +
  * match zero or more of the preceding item
 +
  + match one or more of the preceding item
 +
  ? optionally match the preceding item (zero or one time)
 +
 
 +
  ^(.*)$  match everything on a line, even if blank
 +
  ^(.+)$  match everything on a line, but not a blank line
 +
 
 +
== Examples ==
 +
 
 +
  You see 30 gold.
 +
  You see 1 silver.
 +
 
 +
  ^You see (\d+) (gold|silver)\.$
 +
 
 +
matches both lines.  Capture group 1 would be 30 or 1.  Capture group 2 would be gold or silver.
 +
 
 +
  A warrior rests here.
 +
  A mage stands here preparing to cast fireball.
 +
 
 +
  ^A (\w+) (rests|stands) here(.+)
 +
 
 +
matches both lines.  Capture group 1 would be <code>warrior</code> or <code>mage</code>.  Capture group 2 would be <code>rests</code> or <code>stands</code>.  Capture group 3 would be <code>.</code> or <code>preparing to cast fireball.</code>
 +
 
 +
== Trigger Examples ==
 +
 
 +
=== Matching one unknown word === <!--T:94-->
 +
 
 +
<!--T:95-->
 +
You can also set up a trigger to gather the weapons, gold or whatever skeletons could carry around with them. Since we do not know what the loot is exactly just yet, we will need to set up a trigger to match the line, identify the loot and take whatever it is that was dropped.
 +
 
 +
<!--T:96-->
 +
Examples for messages received could be:
 +
 
 +
  <!--T:97-->
 +
The skeleton drops ring.
 +
  The skeleton drops gold.
 +
  The skeleton drops scimitar.
 +
  The skeleton drops wooden key.
 +
 
 +
<!--T:98-->
 +
"The skeleton drops " (including the last space character) is the generic segment of the line, but the loot itself varies. Thus, we need to tell the client to take whatever the skeleton dropped. We do this by setting up a so-called regular expression:
 +
 
 +
<!--T:99-->
 +
# In the data field titled "1" write the following '''perl regex''' type pattern: <syntaxhighlight lang="lua">^The skeleton drops (.+)\.$</syntaxhighlight>
 +
# Make sure to change the dropdown menu for this line to "perl regex" as well
 +
# In the big script box below write the following lua code: <syntaxhighlight lang="lua">send("take " .. matches[2])</syntaxhighlight>
 +
 
 +
<!--T:210-->
 +
[[File:Trigger intro 2.png|1000px|center]]
 +
 
 +
<!--T:100-->
 +
The regular expression (.+) matches any characters that the client receives between "The skeleton drops " (NB: notice the blank/space character at the end) and the full stop that ends the sentence. The variable matches[2] holds the text captured by the first capture group. In this example, that is the dropped loot, which we now take automatically. This text may contain more than one word, as in the fourth example shown above.
 +
 
 +
<!--T:101-->
 +
In case you are wondering, matches[1] contains the entire text matched by the pattern (here the whole line, because the pattern is anchored with ^ and $), whereas matches[2] contains only the first capture group. More on this in section two of the manual. The symbols ^ and $ indicate the start and end of a whole line.
 +
 
 +
=== Matching multiple unknowns === <!--T:102-->
 +
 
 +
<!--T:103-->
 +
Now, we don't want to take only loot from skeletons but from many different sources.
 +
 
 +
<!--T:104-->
 +
Examples could be:
 +
 
 +
  <!--T:105-->
 +
The skeleton drops ring.
 +
  The giant drops gold.
 +
  The king drops scimitar.
 +
  The box drops wooden key.
 +
 
 +
<!--T:106-->
 +
So let’s make a trigger that would gather the loot from anybody:
 +
 
 +
<!--T:107-->
 +
# In data field "1" write: <syntaxhighlight lang="lua">^(.+) drops (.+)\.$</syntaxhighlight>
 +
# Select '''perl regex''' type pattern again
 +
# Below write the lua code: <syntaxhighlight lang="lua">send("take " .. matches[3])</syntaxhighlight>
 +
 
 +
<!--T:211-->
 +
[[File:Trigger intro 3.png|1000px|center]]
 +
 
 +
<!--T:108-->
 +
In this case, any time somebody, or something, "drops" something or someone else, the client will pick it up. Note that we used matches[3] instead of matches[2] this time, in order to pick up the second match. If we used matches[2], we’d end up picking up the skeleton’s corpse.
 +
 
 +
== Matching known variants == <!--T:109-->
 +
 
 +
<!--T:110-->
 +
If you’re playing a game in English, you’ll notice that these triggers probably won’t work due to English syntax. Compare:
 +
 
 +
  <!--T:111-->
 +
The skeleton drops ring.
 +
  The skeleton drops a ring.
 +
 
 +
<!--T:112-->
 +
Chances are that you’ll see the latter case a little more often. If we used our old regex, the trigger would produce something like this:
 +
 
 +
  <!--T:113-->
 +
TRIGGERED LINE: The skeleton drops a ring.
 +
  OUR REACTION: take a ring
 +
 
 +
<!--T:114-->
 +
However, most games can’t handle determiners in user input, such as articles (i.e. a, an, the) or quantifiers (e.g. five, some, each). Our triggered reaction therefore won't work; we need to react with just "take ring" instead.
 +
 
 +
<!--T:115-->
 +
To correctly handle lines like this, we could create multiple triggers matching every possible article, but that would quickly become cumbersome. Instead, we make one regular expression that filters out all these words and phrases:
 +
 
 +
<!--T:116-->
 +
# Write a new '''perl regex''' type pattern, listing longer phrases before shorter ones because alternatives are tried from left to right: <syntaxhighlight lang="lua">(.+) drops (a couple of|a few|an|a|the|some|) (.+)\.$</syntaxhighlight>
 +
# With this script: <syntaxhighlight lang="lua">send("take " .. matches[4])</syntaxhighlight>
 +
 
 +
<!--T:212-->
 +
[[File:Trigger intro 4.png|1000px|center]]
 +
 
 +
<!--T:117-->
 +
Note that this time we use the third capture group, which is stored in matches[4].
 +
 
 +
 
 +
=== Basic Regex Characters === <!--T:120-->
 +
 
 +
<!--T:121-->
 +
You already know <code>(.+)</code>, which matches any characters up to the end of the line or up to the next literal text in your regex. But what if you only want to match certain types of characters?
 +
 
 +
 
 +
=== Retrieving numbers from triggers === <!--T:122-->
 +
 
 +
<!--T:123-->
 +
Wildcards from triggers are stored in the matches[] table. The first wildcard goes into matches[2], the second into matches[3], and so on, for however many wildcards you have in your trigger.
 +
 
 +
<!--T:124-->
 +
For example, you’d like to announce how much gold you picked up from a slain monster. The message that you get when you pick up the gold is the following:
  
 +
 +
<!--T:125-->
 +
You pick up 16 gold.
 +
 +
 +
<!--T:126-->
 +
A trigger that matches this pattern could be:
 +
 +
 +
<!--T:127-->
 +
# Perl Regex: <syntaxhighlight lang="lua">^You pick up (\d+) gold\.$</syntaxhighlight>
 +
# Script: <syntaxhighlight lang="lua">echo("I got " .. tonumber(matches[2]) .. " gold!")</syntaxhighlight>
 +
 +
 +
<!--T:128-->
 +
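In your code, the variable matches[2] will contain the amount of gold you picked up, in this case 16. Captures arrive as text, so tonumber() converts the value to a number. The script then uses echo to display the amount on your own screen; to say it to others in the game you would use send() instead. Notice also that (\d+) matches only digits, not letters or spaces.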
 +
 +
 +
=== Retrieving alphanumeric characters === <!--T:129-->
 +
 +
 +
<!--T:130-->
 +
Here’s a more advanced example by Heiko, which makes you talk like Yoda:
 +
 +
 +
<!--T:131-->
 +
# Perl Regex: <syntaxhighlight lang="lua">^say (\w+) *(\w*) .*?(.*)</syntaxhighlight>
 +
# Script: <syntaxhighlight lang="lua">send( "say "..matches[4].." "..matches[2].." "..matches[3] )</syntaxhighlight>
 +
 +
 +
<!--T:132-->
 +
The trigger will recognize that you say something, and save in separate groups the first word, the second word and then the rest of your text. Here the \w wildcards match any letters or numbers but no non-alphanumeric characters. The script then says the rest of the text first, then the first word and finally the second word. Notice this will only affect the text displayed for yourself, but if you want to also adjust the text you are sending to other players, please see the [[#Aliases|chapter about aliases]].
  
 
== Helpful Links ==  
 
== Helpful Links ==  
 +
 +
* Mudlet manual: [[Manual:Trigger_Engine#Perl_Regex | trigger pattern matching with regex]]
  
 
* Regex tutorial: https://regexone.com/
 
* Regex tutorial: https://regexone.com/
 
* Regex tester: https://regex101.com/
 
* Regex tester: https://regex101.com/

Latest revision as of 16:00, 16 July 2024

Mudlet uses Perl Regular Expressions in a number of important ways. Regex is used to match alias commands or to match incoming lines with triggers. In triggers we can also capture this incoming data by surrounding the expressions in parentheses. Regex uses special characters to match text which is probably best shown through some examples.

Note: Regex is very powerful and goes beyond the basics shown on this page.

Note: Use a regex tester to test your code: regex101.com is highly recommended by Mudlet users.


Match a Digit

  \d   match a single digit
  \d+  match one or more digits
 You get 5 gold.
 You get (\d) gold.
 You get 150 gold.
 You get (\d+) gold.

Match an Alphanumeric Character

  \w   match a letter or number
  \w+  match one or more letters or numbers (e.g. a single word)
 You see a dragon.
 You see (\w) dragon.
 You see 1 dragon.
 You see (\w) dragon.
 You see three dragons.
 You see (\w+) dragons.

Match a Letter Only

  [a-z]     match a single lower-case letter
  [A-Z]     match a single upper-case letter
  [a-zA-Z]  match a single upper- or lower-case letter
 You see a dragon.
 You see [a-z] dragon.

Will not match You see 1 dragon.

Match from List

  (mage|warrior|cleric)   match the word mage, warrior or cleric, but nothing else
  Before you stands a mighty warrior.
  Before you stands a mighty mage.
  Before you stands a mighty cleric.
  Before you stands a mighty (mage|warrior|cleric).

Will not match Before you stands a mighty ogre.

Start and End of a Line

  ^  match the start of a line
  $  match the end of a line
  ^quest$

matches when a single word quest is on its own line.

Match Everything

  * match zero or more of the preceding item
  + match one or more of the preceding item
  ? optionally match the preceding item (zero or one time)
  ^(.*)$   match everything on a line, even if blank
  ^(.+)$   match everything on a line, but not a blank line

Examples

 You see 30 gold.
 You see 1 silver.
 ^You see (\d+) (gold|silver)\.$

matches both lines. Capture group 1 would be 30 or 1. Capture group 2 would be gold or silver.

 A warrior rests here.
 A mage stands here preparing to cast fireball.
 ^A (\w+) (rests|stands) here(.+)

matches both lines. Capture group 1 would be warrior or mage. Capture group 2 would be rests or stands. Capture group 3 would be . or preparing to cast fireball.
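Mudlet hands capture groups to your script through a Lua table named matches, where index 1 holds the whole match and capture group n sits at index n + 1. The sketch below fakes that table for the first gold example so that it runs outside Mudlet; the table contents and the helper name capture_group are illustrative assumptions, not part of the Mudlet API.

```lua
-- Stand-in for the table Mudlet would fill after the gold/silver
-- pattern above matched the line "You see 30 gold."
local matches = { "You see 30 gold.", "30", "gold" }

-- Hypothetical helper: capture group n lives at index n + 1,
-- because index 1 is taken by the whole match.
local function capture_group(n)
  return matches[n + 1]
end

print(capture_group(1))  -- the amount, as text
print(capture_group(2))  -- the currency
```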

Trigger Examples

Matching one unknown word

You can also set up a trigger to gather the weapons, gold or whatever skeletons could carry around with them. Since we do not know what the loot is exactly just yet, we will need to set up a trigger to match the line, identify the loot and take whatever it is that was dropped.

Examples for messages received could be:

 The skeleton drops ring.
 The skeleton drops gold.
 The skeleton drops scimitar.
 The skeleton drops wooden key.

"The skeleton drops " (including the last space character) is the generic segment of the line, but the loot itself varies. Thus, we need to tell the client to take whatever the skeleton dropped. We do this by setting up a so-called regular expression:

  1. In the data field titled "1" write the following perl regex type pattern:
    ^The skeleton drops (.+)\.$
  2. Make sure to change the dropdown menu for this line to "perl regex" as well
  3. In the big script box below write the following lua code:
    send("take " .. matches[2])
Trigger intro 2.png

The regular expression (.+) matches any characters that the client receives between "The skeleton drops " (NB: notice the blank/space character at the end) and the full stop that ends the sentence. The variable matches[2] holds the text captured by the first capture group. In this example, that is the dropped loot, which we now take automatically. This text may contain more than one word, as in the fourth example shown above.

In case you are wondering, matches[1] contains the entire text matched by the pattern (here the whole line, because the pattern is anchored with ^ and $), whereas matches[2] contains only the first capture group. More on this in section two of the manual. The symbols ^ and $ indicate the start and end of a whole line.
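To see what the script does with a multi-word capture, here is a minimal sketch that can be run in a plain Lua interpreter. The matches table and the send function are local stand-ins (assumptions for illustration); inside Mudlet both are provided for you.

```lua
-- Stand-ins so the sketch runs outside Mudlet.
local sent = {}
local function send(command)
  sent[#sent + 1] = command   -- record instead of sending to a game
end

-- What the table would hold for "The skeleton drops wooden key."
local matches = { "The skeleton drops wooden key.", "wooden key" }

-- The trigger script from step 3.
send("take " .. matches[2])

print(sent[1])  -- take wooden key
```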

Matching multiple unknowns

Now, we don't want to take only loot from skeletons but from many different sources.

Examples could be:

 The skeleton drops ring.
 The giant drops gold.
 The king drops scimitar.
 The box drops wooden key.

So let’s make a trigger that would gather the loot from anybody:

  1. In data field "1" write:
    ^(.+) drops (.+)\.$
  2. Select perl regex type pattern again
  3. Below write the lua code:
    send("take " .. matches[3])
Trigger intro 3.png

In this case, any time somebody, or something, "drops" something or someone else, the client will pick it up. Note that we used matches[3] instead of matches[2] this time, in order to pick up the second match. If we used matches[2], we’d end up picking up the skeleton’s corpse.
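With two capture groups it can help to give the entries of matches readable names before using them. The following sketch assumes the line "The giant drops gold." was matched; the matches table and send are local stand-ins so that it runs outside Mudlet, and the variable names who and loot are only illustrative.

```lua
-- Stand-ins so the sketch runs outside Mudlet.
local sent = {}
local function send(command)
  sent[#sent + 1] = command
end

-- Index 1: whole match, index 2: first group, index 3: second group.
local matches = { "The giant drops gold.", "The giant", "gold" }

local who  = matches[2]   -- whoever dropped the item
local loot = matches[3]   -- the item itself

send("take " .. loot)

print(who .. " -> " .. sent[1])  -- The giant -> take gold
```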

Matching known variants

If you’re playing a game in English, you’ll notice that these triggers probably won’t work due to English syntax. Compare:

 The skeleton drops ring.
 The skeleton drops a ring.

Chances are that you’ll see the latter case a little more often. If we used our old regex, the trigger would produce something like this:

 TRIGGERED LINE: The skeleton drops a ring.
 OUR REACTION: take a ring

However, most games can’t handle determiners in user input, such as articles (i.e. a, an, the) or quantifiers (e.g. five, some, each). Our triggered reaction therefore won't work; we need to react with just "take ring" instead.

To correctly handle lines like this, we could create multiple triggers matching every possible article, but that would quickly become cumbersome. Instead, we make one regular expression that filters out all these words and phrases:

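  1. Write a new perl regex type pattern:
    (.+) drops (a couple of|a few|an|a|the|some|) (.+)\.$
  2. With this script:
    send("take " .. matches[4])
Trigger intro 4.png

Note that this time we use the third capture group, which is stored in matches[4]. Longer phrases such as "a couple of" are listed before "a" because the regex engine tries alternatives from left to right; with "a" first, "a few coins" would be captured as "few coins".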
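An alternative, offered here as an assumption rather than as this page's method, is to keep the simpler two-group pattern from the previous section and strip the determiner inside the script. The helper name strip_determiner and the word list are hypothetical; the list is ordered longest first so that "a few " is tried before "a ".

```lua
-- Hypothetical list of determiners, longest first, each with its
-- trailing space so that "axe" is not mistaken for "a" + "xe".
local determiners = { "a couple of ", "a few ", "some ", "the ", "an ", "a " }

-- Return the item name with one leading determiner removed, if any.
local function strip_determiner(item)
  for _, word in ipairs(determiners) do
    if item:sub(1, #word) == word then
      return item:sub(#word + 1)
    end
  end
  return item
end

print(strip_determiner("a ring"))        -- ring
print(strip_determiner("a few coins"))   -- coins
print(strip_determiner("wooden key"))    -- wooden key
```

Inside a trigger script this could be used as send("take " .. strip_determiner(matches[3])), which also covers lines that have no determiner at all.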


Basic Regex Characters

You already know (.+), which matches any characters up to the end of the line or up to the next literal text in your regex. But what if you only want to match certain types of characters?


Retrieving numbers from triggers

Wildcards from triggers are stored in the matches[] table. The first wildcard goes into matches[2], the second into matches[3], and so on, for however many wildcards you have in your trigger.

For example, you’d like to announce how much gold you picked up from a slain monster. The message that you get when you pick up the gold is the following:


You pick up 16 gold.


A trigger that matches this pattern could be:


  1. Perl Regex:
    ^You pick up (\d+) gold\.$
  2. Script:
    echo("I got " .. tonumber(matches[2]) .. " gold!")


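In your code, the variable matches[2] will contain the amount of gold you picked up, in this case 16. Captures arrive as text, so tonumber() converts the value to a number. The script then uses echo to display the amount on your own screen; to say it to others in the game you would use send() instead. Notice also that (\d+) matches only digits, not letters or spaces.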
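The conversion with tonumber matters most when you compare or accumulate the captured value, because comparing a string with a number raises an error in Lua. The sketch below keeps a running total; the matches table is a stand-in, and total_gold with its starting value of 100 is a hypothetical variable used only for illustration.

```lua
-- Stand-in for the table Mudlet would fill for "You pick up 16 gold."
local matches = { "You pick up 16 gold.", "16" }

local total_gold = 100                    -- hypothetical running total
local picked_up = tonumber(matches[2])    -- "16" (text) becomes 16 (number)

total_gold = total_gold + picked_up

-- A comparison like matches[2] > 10 would fail, since "16" is a string;
-- the converted number compares without problems.
local message = "I got " .. picked_up .. " gold!"
if picked_up > 10 then
  message = message .. " Total: " .. total_gold
end

print(message)  -- I got 16 gold! Total: 116
```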


Retrieving alphanumeric characters

Here’s a more advanced example by Heiko, which makes you talk like Yoda:


  1. Perl Regex:
    ^say (\w+) *(\w*) .*?(.*)
  2. Script:
    send( "say "..matches[4].." "..matches[2].." "..matches[3] )


The trigger will recognize that you say something, and save in separate groups the first word, the second word and then the rest of your text. Here the \w wildcards match any letters or numbers but no non-alphanumeric characters. The script then says the rest of the text first, then the first word and finally the second word. Notice this will only affect the text displayed for yourself, but if you want to also adjust the text you are sending to other players, please see the chapter about aliases.
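To make the reordering concrete, the sketch below fills in matches by hand for the input "say this is a test sentence", which the pattern above splits into "this", "is" and "a test sentence". The table and the send stand-in are assumptions that let it run outside Mudlet.

```lua
-- Stand-ins so the sketch runs outside Mudlet.
local sent = {}
local function send(command)
  sent[#sent + 1] = command
end

-- Whole match, first word, second word, rest of the text.
local matches = { "say this is a test sentence", "this", "is", "a test sentence" }

-- Same script as in step 2: rest first, then first word, then second word.
send("say " .. matches[4] .. " " .. matches[2] .. " " .. matches[3])

print(sent[1])  -- say a test sentence this is
```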

Helpful Links