[Return to Main Page]

Pattern Matcher by Paul Guertin
[Up to Source Code Repository]


Pattern Matcher

String pattern matcher in 6502 assembly.
By Paul Guertin (pg@sff.net), 30 August 2000.

This routine matches a string against a pattern and returns with the carry bit set if they match, or clear if they don't. The two characters ? and * have a special meaning when they appear in the pattern. All other characters match themselves.

? matches any one character. For example, F?? matches FOO but not FU, and ?? matches all two-character strings.

* matches any string, including the empty string. For example, F* matches all strings starting with F. *O*O* matches all strings with at least two Os. Finally, ?* matches all non-empty strings.

Both the pattern and the string must be NUL-terminated (that it, followed with a 00 byte) and at most 255 characters long (excluding the NUL).

; Input:  A NUL-terminated, <255-length pattern at address PATTERN.
;         A NUL-terminated, <255-length string pointed to by STR.
;
; Output: Carry bit = 1 if the string matches the pattern, = 0 if not.
;
; Notes:  Clobbers A, X, Y. Each * in the pattern uses 4 bytes of stack.
;

MATCH1  EQU "?"         ; Matches exactly 1 character
MATCHN  EQU "*"         ; Matches any string (including "")
PATTERN EQU $2000       ; Address of pattern
STR     EQU $6          ; Pointer to string to match

PATTERNMATCH:
        LDX #$00        ; X is an index in the pattern
        LDY #$FF        ; Y is an index in the string
NEXT    LDA PATTERN,X   ; Look at next pattern character
        CMP #MATCHN     ; Is it a star?
        BEQ STAR        ; Yes, do the complicated stuff
        INY             ; No, let's look at the string
        CMP #MATCH1     ; Is the pattern caracter a ques?
        BNE REG         ; No, it's a regular character
        LDA (STR),Y     ; Yes, so it will match anything
        BEQ FAIL        ;  except the end of string
REG     CMP (STR),Y     ; Are both characters the same?
        BNE FAIL        ; No, so no match
        INX             ; Yes, keep checking
        CMP #0          ; Are we at end of string?
        BNE NEXT        ; Not yet, loop
FOUND   RTS             ; Success, return with C=1

STAR    INX             ; Skip star in pattern
        CMP PATTERN,X   ; String of stars equals one star
        BEQ STAR        ;  so skip them also
STLOOP  TXA             ; We first try to match with * = ""
        PHA             ;  and grow it by 1 character every
        TYA             ;  time we loop
        PHA             ; Save X and Y on stack
        JSR NEXT        ; Recursive call
        PLA             ; Restore X and Y
        TAY
        PLA
        TAX
        BCS FOUND       ; We found a match, return with C=1
        INY             ; No match yet, try to grow * string
        LDA (STR),Y     ; Are we at the end of string?
        BNE STLOOP      ; Not yet, add a character
FAIL    CLC             ; Yes, no match found, return with C=0
        RTS
Last page update: September 8, 2000.