Regular Expression Function

The following Regular Expression Match function is taken from the GREP routine in user Rave's MMUNIX _ UNIX_like commands, which in turn uses Rob Pike's algorithm. It is reasonably fast and brings a powerful feature to MMBasic. It is not a full regular expression engine, but it supports the following:

 .     matches any single character
 [abc] matches a or b or c
 ?     matches zero or one of the previous character 
 *     matches zero or more of the previous character e.g. A* is any number of A's
 +     matches one or more of the previous character 
 ^     matches the beginning of the text e.g. ^A means the string must start with A
 $     matches the end of the text e.g. A$ means the string must end with A
 \     matches the following character literally e.g. \? matches ?, \\ matches \ etc.
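
As a rough illustration of the operators listed above, the sketch below runs a few anchored patterns through RegExp. ShowMatch is a hypothetical helper that is not part of the original routine, and the sample strings are invented for the example. It assumes the RegExp function and its supporting subs from this page are included in the same program.

```basic
' Hypothetical helper (not part of the original routine):
' prints whether text$ matches pattern$ according to RegExp
Sub ShowMatch(pattern$, text$)
  If RegExp(pattern$, text$) Then
    Print pattern$; " matches "; text$
  Else
    Print pattern$; " does not match "; text$
  EndIf
End Sub

ShowMatch "^A.*Z$", "ABCZ"            ' . and *: starts with A, ends with Z
ShowMatch "^[hc]at$", "cat"           ' character class: hat or cat
ShowMatch "^colou?r$", "color"        ' ?: the u is optional
ShowMatch "^[0123456789]+$", "2024"   ' +: one or more digits
ShowMatch "^3\.14$", "3x14"           ' \. must be a literal dot
```

Every pattern here is anchored with ^ and $ so that the whole string is tested rather than a substring.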

I extracted the RegExp function so that it could be used as a general function to compare if a string “looks like”…

Notes:

  • It is case sensitive.
  • The set of expressions is not complete, but it covers most common requirements.
  • Recursive matching is not supported, e.g. [hc]?at matching “at”, “hat”, and “cat” will not work correctly.
  • RegExp returns a non-zero value on a match (not necessarily 1) and 0 when there is no match.
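
Two of the notes above (case sensitivity and the non-zero return value) can be worked around with a thin wrapper. The following is only a sketch: RegExpNoCase is a hypothetical name that is not part of the original routine, and it assumes RegExp from this page is available. It upper-cases both arguments before matching, which is safe for the pattern because none of the supported metacharacters are letters, and it normalises the result to 1 or 0.

```basic
' Hypothetical wrapper (not part of the original routine):
' case-insensitive match that returns exactly 1 or 0
Function RegExpNoCase(regex$, text$) As Integer
  RegExpNoCase = RegExp(UCase$(regex$), UCase$(text$)) <> 0
End Function

If RegExpNoCase("^hello", "Hello world") Then Print "Looks like a greeting"
```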

Syntax:

=RegExp(RegularExpressionString$,MyString$)

Example: simple match for a string that “looks like” it might be a time. RegExp is used here to determine the “look and feel” of the tested string - the actual values remain to be checked. See IsTime for a complete solution.
Print RegExp("^[012][0123456789]:[012345][0123456789]$","10:55")
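
The pattern above only checks the shape of the string, so it also accepts values such as 29:00. As a minimal sketch of the remaining value check, and not a replacement for IsTime, the hypothetical function TimeShapeOK below first tests the shape with RegExp and then rejects hours above 23. The function name and the 24-hour assumption are illustrative only.

```basic
' Hypothetical sketch (not the IsTime routine): shape check plus hour range
Function TimeShapeOK(t$) As Integer
  TimeShapeOK = 0
  ' Shape: two digits, a colon, two digits, with minutes limited to 00-59
  If RegExp("^[012][0123456789]:[012345][0123456789]$", t$) = 0 Then Exit Function
  ' The shape still allows hours 24 to 29, so check the value (24-hour clock assumed)
  If Val(Left$(t$, 2)) > 23 Then Exit Function
  TimeShapeOK = 1
End Function

Print TimeShapeOK("10:55")
Print TimeShapeOK("29:00")
```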

Dependencies:

None

	Function RegExp(regex$,text$) ' case sensitive
		Local Integer s,n,m,f
		n=Len(regex$):m=Len(text$)
		RegExp=0
		If Mid$(regex$,1,1)="^" Then
			MatchHere regex$,text$,2,1,n,m,f
			RegExp=f
		Else
			For s=1 To m
				MatchHere regex$,text$,1,(s),n,m,f ' (s) passes by value so MatchHere cannot alter the loop index
				If f Then RegExp=s: Exit For
			Next
		EndIf
	End Function

	Sub MatchHere(regex$,text$,r As Integer,s As Integer,n As Integer,m As Integer,f As Integer)
		Local Integer i
		Do While r <= n
			If Mid$(regex$,r)="$" Then f=s>m: Exit Sub
			If Mid$(regex$,r,1)="[" Then
				i=Instr(r+2,regex$,"]")
				If i Then i=i+1 Else i=r+1
			ElseIf Mid$(regex$,r,1)="\" Then
				i=r+2
			Else
				i=r+1
			EndIf
			If Mid$(regex$,i,1)="*" Then
				MatchStar regex$,text$,r,i,s,n,m,f
				Exit Sub
			EndIf
			If Mid$(regex$,i,1)="?" Then
				MatchOptional regex$,text$,r,i,s,n,m,f
				Exit Sub
			EndIf
			If s>m Then f=0: Exit Sub
			i=r
			MatchChar regex$,text$,r,s,f
			If Not f Then Exit Sub
			If Mid$(regex$,r+1,1)="+" Then
				MatchStar regex$,text$,i,r+1,s+1,n,m,f
				Exit Sub
			EndIf
			r=r+1:s=s+1
		Loop
		f=1
	End Sub

	Sub MatchStar(regex$,text$,r As Integer,i As Integer,s As Integer,n As Integer,m As Integer,f As Integer)
		Do
			MatchHere regex$,text$,i+1,(s),n,m,f
			If f or s>m Then Exit Sub
			MatchChar regex$,text$,(r),s,f
			If Not f Then Exit Sub
			s=s+1
		Loop
	End Sub

	Sub MatchOptional(regex$,text$,r As Integer,i As Integer,s As Integer,n As Integer,m As Integer,f As Integer)
		MatchHere regex$,text$,i+1,(s),n,m,f
		If f or s>m Then Exit Sub
		MatchChar regex$,text$,(r),s,f
		If Not f Then Exit Sub
		MatchHere regex$,text$,i+1,s+1,n,m,f
	End Sub

	Sub MatchChar(regex$,text$,r As Integer,s As Integer,f As Integer)
		Local Integer i,j
		f=Mid$(regex$,r,1)="."
		If f Then Exit Sub
		If Mid$(regex$,r,1)="[" Then
			i=Instr(r+2,regex$,"]")
			If i Then
				j=Instr(r+1,regex$,Mid$(text$,s,1))
				f=j>0 And j<i:r=i
				Exit Sub
			EndIf
		ElseIf Mid$(regex$,r,1)=Chr$(92) Then
			r=r+1
		EndIf
		f=Mid$(regex$,r,1)=Mid$(text$,s,1)
	End Sub
mmbasic/regular_expression_function.txt · Last modified: 2024/02/07 14:15 by gerry