Welcome to the PtokaX Wiki!


This wiki is devoted to letting users share information on the great hub software PtokaX. It is not limited to PtokaX, however: it is also meant for sharing and discussing information on different parts of the Lua programming language. The site is NOT intended to be a replacement for the primary LUA Board (aka the PtokaX Portal), the secondary LUA Board, or the PtokaX resources.

If you’re not familiar with wikis, then in short: they are user-contributed sites whose pages can be edited by (almost) anyone. As such, the content tends to be up to date and (hopefully) correct. They are better than forums, since relevant pieces of information are on the same page. Do not hesitate to share your knowledge, and we all hope you can learn and teach a lot here! — bastya_elvtars



Using the string library in LUA

Probably the most powerful library of Lua is the one that deals with strings.

:!: This will undergo major changes in LUA 5.1, which is currently in alpha stage.

Introduction

Strings are chains of characters. The string library has some functions related to pattern matching, and some related to other kinds of string manipulation (e.g. string.sub). Numeric values are converted to strings automatically wherever a string is expected, without having to call tostring on them first. This is called coercion. Strings can contain any character a computer knows. However, the backslash (\) has to be used in a different way, as it acts as an escape character. Here are its uses:

Sequence Meaning
\a Bell
\b Backspace
\f Form feed
\n Newline (use this to save files)
\r Carriage return (use \r\n to send messages in DC with newlines!)
\t Horizontal tab
\v Vertical tab
\\ Backslash
\" Double quote
\' Single quote
\[ Left square bracket
\] Right square bracket

In long strings, which are delimited by [[ and ]], these escape sequences are not interpreted, so they don't have to be used. In these long strings you can write normal text, like:

str=[[A line.
Another, with a double quote "
And this is: \"
Also this: \\ \r \n \t ... hmm :)
]]
print(str)

To try this out, run it in the standalone Lua interpreter (5.0.2).
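As a small illustration of coercion and of the escape sequences listed above, here is a minimal sketch. The variable names (n, joined, len, msg) are made up for this example only.

```lua
-- Coercion: a number is converted to a string where a string is expected.
n = 10
joined = "Users online: " .. n    -- concatenation turns 10 into "10"
len = string.len(12345)           -- the number is treated as "12345"

-- Escape sequences inside a quoted string.
msg = "First line\r\nSecond line\twith a tab and a \\ backslash"

print(joined)   -- Users online: 10
print(len)      -- 5
print(msg)
```

The last print shows two lines, with a real tab and a single backslash in the second one.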

Introduction to Pattern Matching in Lua by plop

Note that I have touched this document in many places, with permission. — bastya_elvtars

The most powerful functions in the string library are string.find, string.gsub (Global Substitution), and string.gfind (Global Find). They all are based on patterns. They can be used in 2 ways. First, the 'standard' way: you would like to match a specific word inside a sentence, like the “or” in “To be or not to be…”. The second one uses character classes, when you have a wider range of characters. See the table below!

Character classes

Character class Meaning
. Any character
%a Letters
%c Control characters
%d Decimal digits
%l Lowercase letters
%p Punctuation characters
%s Space characters
%u Uppercase characters
%w Alphanumeric characters
%x Hexadecimal digits
%z The character with representation 0
%bc1c2 Used to match a string between c1 and c2. If captured (see below), the two limiting characters are included in the returned string. So if you apply (%b<>) on "A HTML line.<br />" then it will return "<br />".

An uppercase version of any of the above represents the opposite of the class. For instance, %A represents all non-letter characters. Except for %b, every class matches exactly one character. If you want more, you can follow the class with one of these:

  • + 1 or more repetitions (returns nil if not found)
  • * 0 or more repetitions (returns ”” on 0 repetitions). It matches the longest possible sequence (see below.)
  • - also 0 or more repetitions (returns ”” on 0 repetitions). Matches the shortest possible sequence (see below).
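  • ? Optional character (0 or 1 occurrence). Note that a literal - has to be escaped as %- in the pattern below, because an unescaped - is itself the "shortest repetition" operator.
    ":%-?%)"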

    matches a smiley with an optional nose.
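To see the character classes and repetition operators working together, here is a minimal sketch. The sample line and the variable names are invented for illustration. Captures (the round brackets) and string.find are explained further below.

```lua
line = "User42 joined at 18:05 :-)"

-- %a+ is one or more letters, %d+ is one or more digits
_, _, name, number = string.find(line, "(%a+)(%d+)")
print(name, number)   -- User and 42

-- two digits, a colon, two digits
_, _, clock = string.find(line, "(%d%d:%d%d)")
print(clock)          -- 18:05

-- smiley with an optional nose; the - and the ) are escaped with %
_, _, smiley = string.find(line, "(:%-?%))")
print(smiley)         -- :-)

-- %b<> matches a balanced pair of < and >, delimiters included
_, _, tag = string.find("A HTML line.<br />", "(%b<>)")
print(tag)            -- <br />
```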

What is the difference between * and - exactly? Consider this code:

teststring="/a/b/c/"
 
print("We wanna find text between 2 slashes:","/.../")
 
_,_,str=string.find(teststring,"/(.*)/")
print("Using "..teststring.." as string, and * we get:",str)
_,_,str=string.find(teststring,"/(.-)/")
print("Using "..teststring.." as string, and - we get:",str)

This will output:

 We wanna find text between 2 slashes:	/.../
 Using /a/b/c/ as string, and * we get:	a/b/c
 Using /a/b/c/ as string, and - we get:	a

You can read more on string.find later.

Some characters, called magic characters, have special meanings when used in a pattern. The magic characters are:

(   )   .   %   +   -   *   ?   [   ^   $ 

The character % works as an escape for those magic characters. So, %. matches a dot, and

%%

matches the % itself. You can use the escape % not only for the magic characters, but for any non-alphanumeric character. When in doubt, play safe and put an escape.

:!: You can find more on the magic characters in the next section.
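The "play safe" advice can be automated. The sketch below escapes every non-alphanumeric character in a string so that it can be used as a literal pattern. EscapeMagic is a hypothetical helper name chosen for this example; it is not part of Lua or PtokaX. It uses string.gsub, which is described later in this document.

```lua
-- EscapeMagic is a hypothetical helper, not a built-in function.
-- It puts a % in front of every non-alphanumeric character.
function EscapeMagic(s)
  local escaped = string.gsub(s, "(%W)", "%%%1")
  return escaped
end

needle = "1.5%"
pattern = EscapeMagic(needle)
print(pattern)                                   -- 1%.5%%
print(string.find("costs 1.5% more", pattern))   -- 7 and 10
```

The extra local variable is there because string.gsub returns two values, and only the escaped string should be returned.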

Now, let's see the functions used in pattern matching.

string.find (s, pattern [, init [, plain]])

Looks for the first match of pattern in the string s. If it finds one, then find returns the indices of s where this occurrence starts and ends; otherwise, it returns nil. If the pattern specifies captures (see below), the captured strings are returned as extra results. An example:

teststring="/a/b/c/"
s,e,str=string.find(teststring,"/.*/")
print(s,e,str)

This outputs:

 1	7	nil

But if you change ”/.*/” to ”/(.*)/” you will get:

 1	7	a/b/c

1 and 7 mean that the pattern it found starts at character #1 in the string and ends at character #7.

The round brackets represent a capture: the part of the match that they surround is returned by string.find as an extra result after the two indices, so you also 'grab' what you find. There can be many captures, and each one adds another return value.
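A minimal sketch with two captures follows. The chat-style line and the variable names are assumptions made for this example.

```lua
line = "<plop> hello everybody"

-- first capture: the nick between < and >; second capture: the rest
s, e, nick, message = string.find(line, "<(%S+)>%s+(.+)")

print(s, e)       -- 1 and 22
print(nick)       -- plop
print(message)    -- hello everybody
```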

A third, optional numerical argument init specifies where to start the search; its default value is 1. It may be negative, in which case the position is counted from the end of the string.

A value of true as a fourth, optional argument plain turns off the pattern matching facilities, so the function does a plain “find substring” operation, with no characters in pattern being considered “magic”. Note that if plain is given, then init must be given too.
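The effect of init and plain can be sketched like this (the test string is made up for the example):

```lua
text = "a.b.c"

-- plain search: the dot is an ordinary character here
print(string.find(text, ".", 1, true))   -- 2 and 2

-- same search, but starting at position 3
print(string.find(text, ".", 3, true))   -- 4 and 4

-- negative init: -1 means start at the last character
print(string.find(text, "c", -1))        -- 5 and 5

-- without plain, the dot matches any character
print(string.find(text, "."))            -- 1 and 1
```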

Meanings of magic characters

Time to explain a bit more about the so-called magic characters. Here they all are again.

(   )   .   %   +   -   *   ?   [   ^   $ 

^

This one can do two things, but only one of them is explained here for now. At the start of a pattern like "^%b<>%s+(%S+)" it means that the match has to start right at the beginning of the string. If %b<> isn't found there, string.find returns nil.

$

Works the opposite way to ^. At the end of a pattern like "%b<>%s+(%S+)$" it means that the match has to end right at the end of the string. Just like with ^, string.find returns nil if the %S+ part of the pattern does not reach the end of the string.
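Used together, the two anchors force the pattern to cover the whole string, which is handy for validation. A minimal sketch, in which IsNumeric is a hypothetical helper name and not part of Lua or PtokaX:

```lua
-- IsNumeric is a hypothetical helper, not a built-in function.
-- It returns true only if s consists of digits from start to end.
function IsNumeric(s)
  return string.find(s, "^%d+$") ~= nil
end

print(IsNumeric("12345"))    -- true
print(IsNumeric("123abc"))   -- false
print(IsNumeric(""))         -- false
```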

Let's show how ^ and $ behave on their own with an example script.

str = "why is ptokax is way cooler then yhub? because it can do scripting"
 
print(" ")
print("the full string were using is: \""..str.."\"")
print(" ")
print("1st searching for the word \"why\"")
s,e,tmp =string.find(str, "(why)")
print("the pattern \"(why)\" returns: "..tmp)
s,e,tmp =string.find(str, "(why)$")
print("the pattern \"(why)$\" returns: "..(tmp or "not found"))
s,e,tmp =string.find(str, "^(why)")
print("the pattern \"^(why)\" returns: "..(tmp or "not found"))
print(" ")
print("now lets do the same patterns on the word \"cooler\"")
s,e,tmp =string.find(str, "(cooler)")
print("the pattern \"(cooler)\" returns: "..tmp)
s,e,tmp =string.find(str, "(cooler)$")
print("the pattern \"(cooler)$\" returns: "..(tmp or "not found"))
s,e,tmp =string.find(str, "^(cooler)")
print("the pattern \"^(cooler)\" returns: "..(tmp or "not found"))
print(" ")
print("now lets do the same patterns on the word \"scripting\"")
s,e,tmp =string.find(str, "(scripting)")
print("the pattern \"(scripting)\" returns: "..tmp)
s,e,tmp =string.find(str, "(scripting)$")
print("the pattern \"(scripting)$\" returns: "..(tmp or "not found"))
s,e,tmp =string.find(str, "^(scripting)")
print("the pattern \"^(scripting)\" returns: "..(tmp or "not found"))

%

This is not new either; you have already seen it. It is the so-called escape.

We used it in things like %S. If we want to find one of the magic characters themselves, we are going to need this one too. For example, if we want to find the % itself, we have to escape it from being magic. It sounds more complex than it is: %% is all it takes. Now, can you guess what is needed to find the ? character? Again, simply put a % in front of it: %? and we're done. Once again, a simple example script.

str = "why is ptokax is way cooler then yhub? because it can do scripting"
 
print(" ")
print("the full string were using is: \""..str.."\"")
print(" ")
print("now lets find the ? in the string, and to make it easy, the word before it")
s,e,tmp = string.find(str, "(yhub%?)")
print(tmp)

[ and ]

These can do real magic. We know that %d represents digits and %a represents letters. But what if we don't want to match the whole class? Then [ is going to be your best friend, but just like (, it has a brother that it always needs: the ]. You can make your own sets and ranges with this baby. Let's start with an example script to show the basics. This time we use string.gsub (replace what is found) instead of string.find, first without the magic [] and then with it, so we can clearly see the difference. But first, we need to learn string.gsub.

string.gsub (s, pat, repl [, n])

Returns a copy of s in which all occurrences of the pattern pat have been replaced by a replacement string specified by repl. string.gsub also returns, as a second value, the total number of substitutions made:

_,letterO=string.gsub("To be or not to be","o","")
print("Letter \"o\" was found "..letterO.." times.")

If repl is a string, then its value is used for replacement. Any sequence in repl of the form %n, with n between 1 and 9, stands for the value of the n-th captured substring (see below). So %2 represents the second capture and %4 represents the fourth.

x = string.gsub("hello world", "(%w+)", "%1 %1")
print(x)
 
x = string.gsub("hello world from Lua", "(%w+)%s*(%w+)", "%2 %1")
print(x)

If repl is a function, then this function is called every time a match occurs, with all captured substrings passed as arguments, in order; if the pattern specifies no captures, then the whole match is passed as a sole argument. If the value returned by this function is a string, then it is used as the replacement string; otherwise, the replacement string is the empty string. For instance:

print("Original string is: \"To be or not to be\".")
tbl={}
print("Sorting the words")
 
 
string.gsub("To be or not to be","(%S+)",function(word)
  table.insert(tbl,word) -- collect every word
end)
table.sort(tbl) -- sort once, after all words have been collected
 
print(unpack(tbl))
 
tbl={}
cnt=0
 
print("Printing the word numbers")
 
str=string.gsub("To be or not to be","(%S+)",function(word)
  cnt=cnt+1
  return cnt
end)
 
print(str)
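A function as repl is also a convenient way to look replacements up in a table. The sketch below fills in a small template. The names values, template and result, and the $name placeholder syntax, are assumptions made for this example. The function always returns a string, so unknown placeholders are left as they were.

```lua
values = { nick = "plop", hub = "PtokaX" }
template = "Hello $nick, welcome to $hub! Your share: $share"

-- %$ is an escaped dollar sign; (%w+) captures the placeholder name
result = string.gsub(template, "%$(%w+)", function(key)
  -- fall back to the original text when the key is unknown
  return values[key] or ("$" .. key)
end)

print(result)   -- Hello plop, welcome to PtokaX! Your share: $share
```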

The optional last parameter n limits the maximum number of substitutions to occur. For instance, when n is 1 only the first occurrence of pat is replaced.

x = string.gsub("hello world", "(%w+)", "%1 %1", 1)
print(x)

Now back to the square brackets. Between the square brackets you can list individual characters and character classes, and any one of them will match, as can be seen in this example:

str = "why is ptokax is way cooler then yhub? because it can do scripting"
 
print(" ")
print("the full string were using is: \""..str.."\"")
print(" ")
print("now lets replace the word yhub with the letter \"x\", so we don't have to see it")
print("first without the magic [ ]")
tmp = string.gsub(str, "yhub", "x")
print(tmp)
print("now lets replace each of the letters that make up the word yhub with the letter \"x\"")
print("now with the magic [ ]")
tmp = string.gsub(str, "[yhub]", "x")
print(tmp)

You can see the difference: the first one matches the whole word only, while the second one matches the individual characters wherever they occur. But we can do more with it. I promised you that I would return to the ^, and now is the time. Just like %S is the opposite of %s, a ^ placed right after the [ reverses the set, so it matches every character that is NOT listed. Again an example script.

str = "why is ptokax is way cooler then yhub? because it can do scripting"
print(" ")
print("the full string were using is: \""..str.."\"")
print(" ")
print("now lets replace everything except spaces and the letters which make up the word yhub with the letter \"x\"")
print("now with the magic [ ]")
tmp = string.gsub(str, "[^yhub%s]", "x")
print(tmp)
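Sets can also contain ranges written with a -, such as [0-9] or [a-f]. A minimal sketch, with sample strings and variable names invented for illustration:

```lua
nick = "[OP]Some_User-01"

-- [^%w_] matches anything that is not alphanumeric or an underscore
cleaned = string.gsub(nick, "[^%w_]", "")
print(cleaned)    -- OPSome_User01

-- [0-9] is a range; it matches the same characters as %d
masked = string.gsub("room 101", "[0-9]", "#")
print(masked)     -- room ###

-- several ranges in one set: hexadecimal digits written out by hand
_, _, hex = string.find("color=#FF8800;", "#([0-9a-fA-F]+)")
print(hex)        -- FF8800
```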

string.gfind (s, pat)

This was introduced in Lua 5. Returns an iterator function that, each time it is called, returns the next captures from pattern pat over string s. If pat specifies no captures, then the whole match is produced in each call.

First, we calculate the decimal value of an IP.

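function CalcIP(ip)
  local decip,count=0,0
  for octet in string.gfind(ip,"%d+") do
    local n=tonumber(octet)
    if n > 255 then
      return "bad IP" -- an octet cannot be larger than 255
    end
    decip=decip*256+n -- the first octet is the most significant one
    count=count+1
  end
  if count ~= 4 then
    return "bad IP" -- an IP has exactly 4 octets
  end
  return decip
end
 
print(CalcIP("127.0.0.1")) -- 2130706433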

Now we collect all pairs key=value from the given string into a table:

t = {}
s = "from=world, to=Lua"
 
for k, v in string.gfind(s, "(%w+)=(%w+)") do
  t[k] = v
end

We can use

for v1,v2,v3,v4 in string.gfind("abcd","(.)(.)(.)(.)") do something() end

if we specify 4 captures.
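A common use of string.gfind without captures is splitting a string into words. In the sketch below, SplitWords is a hypothetical helper name, and the command line is only a made-up example of what a hub script might receive. The first line is an assumption for readers on newer interpreters: from Lua 5.1 on, the same function is called string.gmatch, so the sketch falls back to that name when string.gfind is missing.

```lua
-- Use string.gfind (Lua 5.0) or its later name string.gmatch.
gfind = string.gfind or string.gmatch

-- SplitWords is a hypothetical helper, not a built-in function.
-- It returns a table with all whitespace-separated words of s.
function SplitWords(s)
  local words = {}
  for word in gfind(s, "%S+") do
    table.insert(words, word)
  end
  return words
end

cmd = SplitWords("!ban SomeUser 2 days")
print(cmd[1], cmd[2], cmd[3], cmd[4])   -- !ban, SomeUser, 2 and days
```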

Well, this is all you need to know about pattern matching. Other (including PtokaX-related) examples can be found at the end of this document.
FIXME maybe later.

