Advanced-ish string manip

excessive use of patterns included

by Goate_So


String Library

The string library contains many functions for manipulating strings, such as gsub, sub, gmatch, match, find, lower, upper, byte, char, format, rep, and reverse. I will be covering the gsub, gmatch, match, find, sub, byte, char, and format functions, as the others are pretty self-explanatory. There are two ways to use these functions. The first is to call it as a function of the string library, like so:

local str = "Hey This is a string"
print(string.upper(str))--"HEY THIS IS A STRING"

The second way of doing this would be to use it as a method of a string:

local str = "Hey this is a string"
print(str:upper())--"HEY THIS IS A STRING"

With that said, let's get into the functions

Gsub

The gsub, or global substitution, function replaces all instances of one string or string pattern with another. It returns the new string followed by the number of replacements made.

string.gsub(Original,Substring,Replacement)

ex:

local str = "Hey their"
print(str:gsub("their","there"))--"Hey there",1
--1 in this case being the number of instances "their" was replaced with "there"
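
Since the Substring argument is really a pattern, gsub can replace whole groups of characters at once. Here is a small sketch of that, plus the optional fourth argument that caps the number of replacements. The sample strings and variable names are made up for illustration.

```lua
-- Replace every run of digits with "#"
local masked, count = string.gsub("a1b22c333", "%d+", "#")
print(masked, count) --a#b#c#, 3

-- The optional fourth argument limits how many replacements are made
local limited = string.gsub("one two three", " ", "_", 1)
print(limited) --one_two three
```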

Gmatch

The gmatch function returns a pattern-finding iterator. The iterator searches through the string, returning each instance of the pattern you passed.

local str = "Hey, Find all the non-spac e instances"
for piece in str:gmatch("[^%s]+") do
    print(piece)--"Hey,", "Find", "all", "the", "non-spac", "e", "instances"
end
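
If the pattern contains captures (parts wrapped in parentheses), the iterator returns the captured pieces instead of the whole match. A minimal sketch, assuming a made-up "key=value" string:

```lua
local settings = "width=10, height=20"
local parsed = {}

-- Each iteration yields the two captures: the letters before "=" and the digits after it
for key, value in settings:gmatch("(%a+)=(%d+)") do
    parsed[key] = tonumber(value)
end

print(parsed.width, parsed.height) --10, 20
```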

Sub

The sub function returns a substring (hence the name) of the string between the given start and end positions.

string.sub(Original,Start,End)

An example of it would be :

local str = "Definitely a string"
print(str:sub(1,5))--Defin
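
The End argument is optional, and both positions can be negative to count from the end of the string. A short sketch using the same example string (the variable names are just for illustration):

```lua
local str = "Definitely a string"

local fromStart = str:sub(14)   --End defaults to the end of the string
local fromEnd = str:sub(-6)     --negative positions count back from the end
local middle = str:sub(-6, -4)

print(fromStart) --string
print(fromEnd)   --string
print(middle)    --str
```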

Find/ Match

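The find function finds the first occurrence of a given string pattern, starting the search at the start position if one is given, or at the beginning of the string otherwise. It returns the interval (start and end indices) where the pattern was found, or nil if there is no match. If the optional fourth parameter is true, the pattern is treated as plain text rather than as a pattern. The match function takes the same first three arguments but returns the matched text itself instead of the interval.

string.find(Original,pattern,start,plain)

An example of the find function would be:

local str = "Hello There"
print(str:find("%a+"))--1,5
print(str:find("%a+",1,true))--nil, because the plain search looks for the literal text "%a+"
print(str:find("%a+",6))--7,11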
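
For comparison, here is a rough sketch of match on the same string, along with find's plain-text mode. The variable names are only for illustration.

```lua
local str = "Hello There"

-- match returns the matched text rather than its position
local word = str:match("%a+")
print(word) --Hello

-- with captures (parentheses), match returns each captured piece
local first, second = str:match("(%a+) (%a+)")
print(first, second) --Hello, There

-- passing true as find's fourth argument searches for plain text,
-- so "." means a literal period instead of "any character"
local plainStart, plainEnd = ("a.b"):find(".", 1, true)
print(plainStart, plainEnd) --2, 2
```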

Byte and Char

The byte and char functions are inverses of each other: if you pass a character like "a" to the byte function, it returns that character's numeric code, 97, and if you then pass that number to the char function, it returns the original character, which in this case is "a".

local char = "t"
local byte = string.byte(char)
local newChar = string.char(byte)

print(char,byte,newChar)--t,116,t
print(char == newChar)--true
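
Both functions can also work on several characters at once. A small sketch (the strings and names here are made up):

```lua
-- byte accepts a start and end position and returns one code per character
local b1, b2, b3 = string.byte("abc", 1, 3)
print(b1, b2, b3) --97, 98, 99

-- char accepts any number of codes and joins the characters into one string
local word = string.char(72, 105)
print(word) --Hi
```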

Format

The format function creates a formatted string from the format and arguments provided. It is very similar to the printf function in C and C++.

print(string.format("%s,%q","Hello", "RandomDudo")) --Hello,"RandomDudo"
--%s is a string and %q is a quoted string (for format only)
print(string.format("%d,%i,%u",-1,-1,-1)) -- -1,-1,18446744073709551615
--%d and %i are signed integers while %u is an unsigned integer

The reason the unsigned integer for -1 comes out so large is that unsigned integers cannot represent negative numbers, so -1 wraps around to the maximum value of a 64-bit unsigned integer. A few other notable specifiers for string.format are:

%a and %A for hexadecimals with binary exponents

%c for converting numbers to characters (like string.char)

%g for floats, using the shorter of the %f and %e forms

%f for floats

%o for octals

%x and %X for hexadecimals

%e and %E for scientific notation (exponents)
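
A quick sketch of a few of these specifiers in action. The values are arbitrary, and the width and precision modifiers (%5d, %.2f) work the same way as in C's printf.

```lua
local hex = string.format("%x %X", 255, 255)      --lowercase and uppercase hexadecimal
local octal = string.format("%o", 8)
local letter = string.format("%c", 97)            --same result as string.char(97)
local fixed = string.format("%.2f", 3.14159)      --two digits after the decimal point
local padded = string.format("%5d|%-5d|", 42, 42) --right- and left-aligned in 5 columns

print(hex)    --ff FF
print(octal)  --10
print(letter) --a
print(fixed)  --3.14
print(padded) --   42|42   |
```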

String patterns

I think the thing most people, including me, struggle with at first is understanding string patterns.

Chapter 20.2 of Programming in Lua (PiL) and the lua-users wiki are great tools for learning this if you are a beginner. However, I will be explaining the gist of string patterns and "magic characters" in this section. With that said, here are some of the most important string patterns:

. - any character

%a - letters

%d - digits

%w - alphanumeric characters (digits + letters)

%s - whitespace

%l - lowercase letters

%u - uppercase letters

%p - punctuation / special characters (!,.,?,#,$,...)

%x - hexadecimal digits

Another interesting thing about these patterns is that the uppercase version of each represents the complement of the lowercase version. For example, %A represents non-letter characters and %S represents non-space characters.
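
To make the classes and their complements concrete, here is a minimal sketch that strips characters out of a made-up string with gsub:

```lua
local str = "Room 42, Floor 7!"

local digits = str:gsub("%D", "")  --remove every non-digit
local letters = str:gsub("%A", "") --remove every non-letter
local noPunct = str:gsub("%p", "") --remove punctuation

print(digits)  --427
print(letters) --RoomFloor
print(noPunct) --Room 42 Floor 7
```

Assigning the result to a single variable keeps only the new string and discards the replacement count that gsub also returns.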

Magic Characters

There are officially 10 so-called "magic characters" in Lua (counting each bracket pair as one): %, (), ^, *, [], +, -, ?, $, and . (the period).

The % character escapes the other magic characters, cancelling out their special meaning.

The ^ character, placed at the start of a character set, represents the complement of that set.

The [] characters form a character set. Say you want to make a search function that only finds vowels: simply putting the list of vowels inside the [] characters would be a good way to do so.

local Vowels = "[AEIOUaeiou]"--list of all vowels
local str = "Hey I'm Alive" --random string

local NoVowel = string.gsub(str,Vowels,"")

print(NoVowel) --Hy 'm lv

Or say you want to filter for non-period characters. The . character is a string pattern unto itself, so it has to be escaped. All three of the showcased magic characters can be used here:

local pattern = "[^%.]+" --one or more non-period characters
local str = "workspace.model.Humanoid" -- example string

for sub in str:gmatch(pattern) do
    print(sub) --workspace, model, Humanoid
end
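
A few of the other magic characters can be sketched the same way. The strings below are made up, and this is only a quick illustration rather than a full tour: outside of a character set, ^ and $ anchor a match to the start and end of the string, and ? makes the previous character optional.

```lua
-- % escapes magic characters so that "$" and "." are matched literally
local price = ("Cost: $5.00"):match("%$%d+%.%d+")
print(price) --$5.00

-- ^ at the start of a pattern only matches at the beginning of the string
local atStart = ("hello world"):find("^world")
print(atStart) --nil

-- $ at the end of a pattern only matches at the end of the string
local endStart, endFinish = ("hello world"):find("world$")
print(endStart, endFinish) --7, 11

-- ? makes the character before it optional
local british = ("colour"):match("colou?r")
local american = ("color"):match("colou?r")
print(british, american) --colour, color
```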

Congratulations, you finished this tutorial on string manipulation!

Challenges:

What is life without challenges? See if you can complete any of these given challenges:

1: make a printf function with string.format()

2: create a character set equivalent to the alphanumeric (%w) character set

3: substitute all consonant (non-vowel) letters with an empty string: ""

Hope you enjoyed!
