Golang regexp Examples [In-Depth Tutorial]


Author: Tuan Nguyen
Reviewer: Deepak Prasad

In the Python section, we introduced some examples of working with regular expressions (regex). In Golang, the built-in regexp package implements regular expression search. This article walks through examples of matching, finding, replacing, and splitting strings with it.

regexp: Package regexp implements regular expression search.

The syntax of the regular expressions accepted is the same general syntax used by Perl, Python, and other languages. More precisely, it is the syntax accepted by RE2 and described at https://golang.org/s/re2syntax, except for \C. For an overview of the syntax, run:

go doc regexp/syntax

 

Performing Regex Match using Golang regexp

The regexp package provides 3 package-level functions that check whether the input contains any match of the pattern. They return true if a match is found anywhere in the input and false otherwise; the error is non-nil only when the pattern itself cannot be parsed:

func Match(pattern string, b []byte) (matched bool, err error)
func MatchReader(pattern string, r io.RuneReader) (matched bool, err error)
func MatchString(pattern string, s string) (matched bool, err error)

 

Regex Match() function

func Match(pattern string, b []byte) (matched bool, err error): Match reports whether the byte slice b contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.

Example of using the Match() function in Golang:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	matched, err := regexp.Match(`golinux`, []byte(`This is golinuxcloud string. welcome! to@ my(a pages!morrrring`))
	fmt.Println(matched, err)
	matched, err = regexp.Match(`pages!*`, []byte(`This is golinuxcloud string. welcome! to@ my(a pages!morrrring`))
	fmt.Println(matched, err)
	matched, err = regexp.Match(`ay(b`, []byte(`This is golinuxcloud string. welcome! to@ my(a pages!morrrring`))
	fmt.Println(matched, err)
}

Output:

true <nil>
true <nil>
false error parsing regexp: missing closing ): `ay(b`

 

Regex MatchString() function

MatchString() behaves like Match(), except that it takes the input as a string instead of a byte slice. Here is an example of using the MatchString() function:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	matched, err := regexp.MatchString(`golinux`, `This is golinuxcloud string. welcome! to@ my(a pages!morrrring`)
	fmt.Println(matched, err)
	matched, err = regexp.MatchString(`monday`, `This is golinuxcloud string. welcome! to@ my(a pages!morrrring`)
	fmt.Println(matched, err)
}

Output:

true <nil>
false <nil>
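
MatchReader is listed above but not demonstrated. The following is a minimal sketch, assuming the input comes from a strings.Reader (which implements io.RuneReader); the helper name containsDigits and the sample inputs are made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// containsDigits is a hypothetical helper: it reports whether the text
// contains at least one digit, reading the input through an io.RuneReader.
func containsDigits(text string) (bool, error) {
	// strings.NewReader returns a *strings.Reader, which implements io.RuneReader.
	return regexp.MatchReader(`[0-9]+`, strings.NewReader(text))
}

func main() {
	fmt.Println(containsDigits("golinuxcloud 2024")) // true <nil>
	fmt.Println(containsDigits("golinuxcloud"))      // false <nil>
}
```

The same approach should work with any other type that implements io.RuneReader, such as a bufio.Reader wrapped around a file.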

 

A complex example of MatchString() with Compile and Regex interface

Here is an example of using compile to check if a string contains numeric characters only:

package main

import (
	"fmt"
	"regexp"
)

func main() {
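	// pattern: ^ and $ anchor both ends, so the whole string must be digits
	r, err := regexp.Compile(`^[0-9]+$`)
	if err != nil {
		fmt.Println("invalid pattern:", err)
		return
	}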

	fmt.Println(r.MatchString("books"))
	fmt.Println(r.MatchString("ewq028"))
	fmt.Println(r.MatchString("09852"))
	fmt.Println(r.MatchString("2398410055"))
}

Output:

false
false
true
true
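
Whether a pattern checks for "digits only" depends on anchoring both ends. As a small sketch with illustrative variable names and input: `^[0-9]+` only requires the string to start with digits, while `^[0-9]+$` requires every character to be a digit.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Matches any string that begins with one or more digits.
	startsWithDigits = regexp.MustCompile(`^[0-9]+`)
	// Matches only strings made up entirely of digits.
	onlyDigits = regexp.MustCompile(`^[0-9]+$`)
)

func main() {
	input := "028ewq" // illustrative input: digits followed by letters
	fmt.Println(startsWithDigits.MatchString(input)) // true
	fmt.Println(onlyDigits.MatchString(input))       // false
}
```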

 

Find a sub-string in a string using regex

FindStringIndex() function to return the start and end index of the first match

FindStringIndex returns only the first (leftmost) match. The match is represented as a two-element int slice: the first value is the byte index where the match begins, and the second is the byte index just past its end, so the matched text is s[loc[0]:loc[1]]. A nil result means there is no match. Here is a simple example of using the FindStringIndex() function to find the first run of numeric characters in a string:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	//pattern
	re := regexp.MustCompile("[0-9]+")

	fmt.Println(re.FindStringIndex("books"))
	fmt.Println(re.FindStringIndex("sss0254books082"))
}

Output:

[]
[3 7]    //0254
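
The returned pair can be used directly to slice the original string. A minimal sketch, where firstNumber is a hypothetical helper rather than part of the regexp package:

```go
package main

import (
	"fmt"
	"regexp"
)

var digits = regexp.MustCompile(`[0-9]+`)

// firstNumber returns the first run of digits in s and whether one was found.
func firstNumber(s string) (string, bool) {
	loc := digits.FindStringIndex(s)
	if loc == nil {
		return "", false // no match
	}
	// loc[0] is the start index, loc[1] is the index just past the end.
	return s[loc[0]:loc[1]], true
}

func main() {
	fmt.Println(firstNumber("sss0254books082")) // 0254 true
	fmt.Println(firstNumber("books"))
}
```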

 

FindString() function to return string holding the text of the leftmost match

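Here is an example of using FindString() in Golang. If there is no match, the return value is an empty string, so the first call below prints an empty line: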

package main

import (
	"fmt"
	"regexp"
)

func main() {
	//pattern
	re := regexp.MustCompile("[0-9]+")

	fmt.Println(re.FindString("books"))
	fmt.Println(re.FindString("sss0254books082"))
}

Output:

0254
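
An empty string from FindString can mean either "no match" or "matched an empty string". The sketch below, using two illustrative patterns, shows one way to tell the cases apart with FindStringIndex:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// * allows zero digits, so this pattern can match the empty string.
	optionalDigits = regexp.MustCompile(`[0-9]*`)
	// + requires at least one digit.
	requiredDigits = regexp.MustCompile(`[0-9]+`)
)

func main() {
	// Both FindString calls return "", but the index results differ:
	// an empty match has a non-nil location, no match yields nil.
	fmt.Printf("%q %v\n", optionalDigits.FindString("books"), optionalDigits.FindStringIndex("books")) // "" [0 0]
	fmt.Printf("%q %v\n", requiredDigits.FindString("books"), requiredDigits.FindStringIndex("books")) // "" []
}
```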

 

FindAllString() to return all successive matches of the expression

FindAllString() is the 'All' version of FindString(); it returns a slice of all successive matches of the expression. The second parameter n sets the maximum number of matches to return: a negative value such as -1 returns all of them, and 0 returns none. Here is an example of finding all numeric substrings in a string:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	//pattern
	re := regexp.MustCompile("[0-9]+")

	fmt.Println(re.FindAllString("books", -1))
	fmt.Println(re.FindAllString("sss0254books082", 0))
	fmt.Println(re.FindAllString("sss0254books082", 1))
	fmt.Println(re.FindAllString("sss0254books082", 2))
	fmt.Println(re.FindAllString("sss0254books082eqwew2698", -1))
}

Output:

[]
[]
[0254]
[0254 082]
[0254 082 2698]

 

Some more Find functions with golang regexp package

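The regexp package provides many more methods for finding and locating substrings that match a pattern. In the method names, 'All' means all successive matches are returned, 'String' means the input is a string rather than a byte slice, 'Submatch' means capture groups are included, and 'Index' means positions are returned instead of text: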

func (re *Regexp) Find(b []byte) []byte
func (re *Regexp) FindAll(b []byte, n int) [][]byte
func (re *Regexp) FindAllIndex(b []byte, n int) [][]int
func (re *Regexp) FindAllString(s string, n int) []string
func (re *Regexp) FindAllStringIndex(s string, n int) [][]int
func (re *Regexp) FindAllStringSubmatch(s string, n int) [][]string
func (re *Regexp) FindAllStringSubmatchIndex(s string, n int) [][]int
func (re *Regexp) FindAllSubmatch(b []byte, n int) [][][]byte
func (re *Regexp) FindAllSubmatchIndex(b []byte, n int) [][]int
func (re *Regexp) FindIndex(b []byte) (loc []int)
func (re *Regexp) FindReaderIndex(r io.RuneReader) (loc []int)
func (re *Regexp) FindReaderSubmatchIndex(r io.RuneReader) []int
func (re *Regexp) FindString(s string) string
func (re *Regexp) FindStringIndex(s string) (loc []int)
func (re *Regexp) FindStringSubmatch(s string) []string
func (re *Regexp) FindStringSubmatchIndex(s string) []int
func (re *Regexp) FindSubmatch(b []byte) [][]byte
func (re *Regexp) FindSubmatchIndex(b []byte) []int
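
The Submatch variants are the ones to reach for when a pattern has capture groups. A minimal sketch using FindStringSubmatch, assuming a made-up key=value input format and a hypothetical helper named parsePair:

```go
package main

import (
	"fmt"
	"regexp"
)

// Two capture groups: a lowercase key and a numeric value.
var pairPattern = regexp.MustCompile(`^([a-z]+)=([0-9]+)$`)

// parsePair splits "key=value" into its parts using the capture groups.
func parsePair(s string) (key, value string, ok bool) {
	m := pairPattern.FindStringSubmatch(s)
	if m == nil {
		return "", "", false // the pattern did not match
	}
	// m[0] is the whole match; m[1] and m[2] are the capture groups.
	return m[1], m[2], true
}

func main() {
	fmt.Println(parsePair("port=8080")) // port 8080 true
	fmt.Println(parsePair("port=abc"))
}
```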

 

Compiling and reusing regular expression patterns

Compiling a pattern once produces a Regexp object that can be reused for any number of matches and gives access to the full set of methods, instead of parsing the pattern again on every call as the package-level Match functions do. The examples above already use Compile() and MustCompile() for this. The regexp package has 4 functions to compile patterns:

func Compile(expr string) (*Regexp, error)
func CompilePOSIX(expr string) (*Regexp, error)
func MustCompile(str string) *Regexp
func MustCompilePOSIX(str string) *Regexp

Compile parses a regular expression and returns, if successful, a Regexp object that can be used to match against the text. MustCompile is like Compile but panics if the expression cannot be parsed. CompilePOSIX is like Compile but restricts the regular expression to POSIX ERE (egrep) syntax and changes the match semantics to leftmost-longest.
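
As a rough illustration of these differences, the sketch below checks a malformed pattern with Compile and compares the default leftmost-first semantics with the leftmost-longest semantics of the POSIX variant. The pattern `a|ab` and the helper isValid are chosen only for demonstration:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Default semantics: the first alternative that matches wins.
	perlStyle = regexp.MustCompile(`a|ab`)
	// POSIX semantics: the longest leftmost match wins.
	posixStyle = regexp.MustCompilePOSIX(`a|ab`)
)

// isValid reports whether expr compiles; Compile returns an error
// for a bad pattern, whereas MustCompile would panic.
func isValid(expr string) bool {
	_, err := regexp.Compile(expr)
	return err == nil
}

func main() {
	fmt.Println(isValid(`ay(b`))            // false
	fmt.Println(perlStyle.FindString("ab"))  // a
	fmt.Println(posixStyle.FindString("ab")) // ab
}
```

A common convention is to use MustCompile for fixed patterns declared at package level, and Compile for patterns that come from user input or configuration.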

Here is an example of using the MustCompile() function to check that a phone number, a fax number, and an id contain numeric characters only:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// pattern: ^ and $ anchor both ends, so the whole string must be digits
	re := regexp.MustCompile(`^[0-9]+$`)

	phoneNumber := "02692698"
	id := "DA02158"
	faxNumber := "956381"

	fmt.Println("Phone Number valid:", re.MatchString(phoneNumber))
	fmt.Println("Id valid:", re.MatchString(id))
	fmt.Println("faxNumber valid:", re.MatchString(faxNumber))
}

Output:

Phone Number valid: true
Id valid: false
faxNumber valid: true

 

Using a regular expression to replace strings

The regexp package provides several functions for replacing matched text:

func (re *Regexp) ReplaceAll(src, repl []byte) []byte
func (re *Regexp) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte
func (re *Regexp) ReplaceAllLiteral(src, repl []byte) []byte
func (re *Regexp) ReplaceAllLiteralString(src, repl string) string
func (re *Regexp) ReplaceAllString(src, repl string) string
func (re *Regexp) ReplaceAllStringFunc(src string, repl func(string) string) string

 

ReplaceAllString function

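ReplaceAllString returns a copy of src, replacing matches of the Regexp with the replacement string repl. Inside repl, $ signs are interpreted as in Expand, so for instance $1 represents the text of the first submatch. Here is an example of using the ReplaceAllString() function to replace numeric substrings: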

package main

import (
	"fmt"
	"regexp"
)

func main() {
	//pattern
	re := regexp.MustCompile("[0-9]+")

	str := "this0258is45a0369number968"

	// replace every run of digits with @@
	fmt.Println(re.ReplaceAllString(str, "@@"))

}

Output:

this@@is@@a@@number@@
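
The replacement string may also refer to capture groups. The sketch below, with an illustrative pattern and two hypothetical helpers, contrasts ReplaceAllString, which expands $1 and $2, with ReplaceAllLiteralString, which inserts the replacement text as-is:

```go
package main

import (
	"fmt"
	"regexp"
)

// Two capture groups: a lowercase name and a number.
var pair = regexp.MustCompile(`([a-z]+)=([0-9]+)`)

// swap expands $1 and $2 to the captured groups, reversing each pair.
func swap(s string) string {
	return pair.ReplaceAllString(s, "$2=$1")
}

// literal inserts the replacement text verbatim, without expansion.
func literal(s string) string {
	return pair.ReplaceAllLiteralString(s, "$2=$1")
}

func main() {
	fmt.Println(swap("x=1, y=22"))    // 1=x, 22=y
	fmt.Println(literal("x=1, y=22")) // $2=$1, $2=$1
}
```

ReplaceAllLiteralString is the safer choice when the replacement text comes from user input and may contain $ characters.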

 

ReplaceAllStringFunc function

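ReplaceAllStringFunc returns a copy of src in which all matches of the Regexp have been replaced by the return value of function repl applied to the matched substring. Here is an example that passes strings.ToUpper as repl to capitalize the consonants in a string. The pattern [^aeiou] matches every character that is not a lowercase vowel, including spaces and punctuation, but ToUpper leaves those unchanged: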

package main

import (
	"fmt"
	"regexp"
	"strings"
)

func main() {
	re := regexp.MustCompile(`[^aeiou]`)
	fmt.Println(re.ReplaceAllStringFunc("go linux cloud members!", strings.ToUpper))
}

Output:

Go LiNuX CLouD MeMBeRS!
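
The repl function can also be a custom closure that computes the replacement from each match. A minimal sketch, assuming we want to double every number in a string; doubleNumbers is a hypothetical helper:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

var number = regexp.MustCompile(`[0-9]+`)

// doubleNumbers replaces every run of digits with twice its value.
func doubleNumbers(s string) string {
	return number.ReplaceAllStringFunc(s, func(m string) string {
		n, err := strconv.Atoi(m)
		if err != nil {
			return m // e.g. a value too large for int: leave it unchanged
		}
		return strconv.Itoa(n * 2)
	})
}

func main() {
	fmt.Println(doubleNumbers("a1 b20 c300")) // a2 b40 c600
}
```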

 

Using a regular expression to split strings using Split() function

Split slices s into substrings separated by the expression and returns a slice of the substrings between those expression matches. The slice returned by this method consists of all the substrings of s not contained in the slice returned by FindAllString.

The second parameter n limits the number of substrings returned: n > 0 returns at most n substrings, with the last one being the unsplit remainder; n == 0 returns nil; n < 0 returns all substrings.

In order to output each split substring, we created a simple for loop:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	pattern := regexp.MustCompile("Go|to")
	welcomeMessage := "Hello Go Linux Cloud members welcome to our page! To be honest you are awesome"
	result := pattern.Split(welcomeMessage, -1)
	printOutput(result)

	fmt.Println("----")
	result2 := pattern.Split(welcomeMessage, 2)
	printOutput(result2)
}

func printOutput(split []string) {
	for _, s := range split {
		if s != "" {
			fmt.Println("substring:", s)
		}
	}
}

Output:

substring: Hello 
substring:  Linux Cloud members welcome 
substring:  our page! To be honest you are awesome
----
substring: Hello 
substring:  Linux Cloud members welcome to our page! To be honest you are awesome
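
To see the effect of the count parameter in isolation, here is a short sketch that splits a comma-separated string with different values of n. The separator pattern and input are illustrative:

```go
package main

import (
	"fmt"
	"regexp"
)

// A comma followed by any amount of whitespace.
var sep = regexp.MustCompile(`,\s*`)

func main() {
	s := "a, b,c,  d"
	fmt.Printf("%q\n", sep.Split(s, -1)) // ["a" "b" "c" "d"]
	fmt.Printf("%q\n", sep.Split(s, 2))  // ["a" "b,c,  d"]
	fmt.Printf("%q\n", sep.Split(s, 0))  // []
}
```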

 

Summary

In this post, we looked at the Go regexp package and the functions it provides for matching, finding, replacing, and splitting strings, as well as for compiling and reusing regular expression patterns. For additional details on regular expressions in Golang, consult the Go regexp documentation.

 

References

https://en.wikipedia.org/wiki/Regular_expression
https://pkg.go.dev/regexp

 
