Regex with a variable number of match groups

I would like to be able to write patterns to recognize filenames in a list:

import re

NOTES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]

filelist1 = ["piano c3.wav", "piano c#3.wav", "piano d4.wav"]
pattern1 = "piano %notename.wav"

filelist2 = ["72__54.wav", "60__127.wav", "48__61.wav"]
pattern2 = "%midinote__%velocity.wav"

The keywords:

  • %midinote and %velocity should be integers
  • %notename should be a note name from the list NOTES followed by an octave digit (e.g. c#3)

The following code works and parses the filenames, but only if the 3 keywords are present in the pattern, in the order %midinote, %velocity, %notename:

pattern1 = pattern1.replace("%midinote", r"(\d+)").replace("%velocity", r"(\d+)").replace("%notename", r"([A-Ga-g]#?[0-9])")
for fname in filelist1:
    m = re.match(pattern1, fname)
    if m:
        midinote = int(m.groups()[0])
        velocity = int(m.groups()[1])
        notename = m.groups()[2]
        notenametomidi = NOTES.index(notename[:-1].lower()) + (int(notename[-1])+2) * 12
        print fname, midinote, velocity, notename, notenametomidi
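
For reference, the conversion on the notenametomidi line can be pulled out into a small helper. This is only a sketch: the function name notename_to_midi is made up here, and the +2 octave offset is simply the one used in the loop above (it maps c3 to 60; other octave-numbering conventions exist).

```python
NOTES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]

def notename_to_midi(notename):
    """Convert a name such as "c#3" to a MIDI number (hypothetical helper)."""
    # Everything but the last character is the pitch; the last one is the octave.
    pitch = notename[:-1].lower()
    octave = int(notename[-1])
    # Same arithmetic as in the loop: semitone index plus 12 per octave.
    # NOTES.index raises ValueError for a pitch that is not in NOTES.
    return NOTES.index(pitch) + (octave + 2) * 12

for name in ["c3", "c#3", "d4"]:
    print(name, notename_to_midi(name))
```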

But if a pattern:

  • has only 1 or 2 keywords

  • or has the 3 keywords but in a different order from the one defined above,

then the code fails.

How can I write a regex that handles a variable number of match groups, in any order?
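
One possible direction, sketched here rather than tested against the real file lists, is to use named groups instead of positional ones, so that the code looks up each keyword by name and no longer depends on how many keywords there are or in which order they appear. The names KEYWORDS, compile_pattern and parse below are illustrative assumptions, not part of the original code. The sketch also escapes the literal parts of the pattern, so that the "." in ".wav" only matches a dot.

```python
import re

NOTES = ["c", "c#", "d", "d#", "e", "f", "f#", "g", "g#", "a", "a#", "b"]

# Hypothetical mapping from each keyword to a named-group regex fragment.
KEYWORDS = {
    "%midinote": r"(?P<midinote>\d+)",
    "%velocity": r"(?P<velocity>\d+)",
    "%notename": r"(?P<notename>[A-Ga-g]#?[0-9])",
}

def compile_pattern(pattern):
    """Escape the literal text, then swap each keyword for its named group."""
    regex = re.escape(pattern)
    for keyword, fragment in KEYWORDS.items():
        # Escape the keyword too, so the lookup matches the escaped pattern
        # regardless of how this Python version escapes "%".
        regex = regex.replace(re.escape(keyword), fragment)
    return re.compile(regex + "$")

def parse(pattern, fname):
    """Return a dict of the keywords found in fname, or None if no match."""
    m = compile_pattern(pattern).match(fname)
    if m is None:
        return None
    # groupdict() only contains the groups that exist in this pattern.
    fields = m.groupdict()
    for key in ("midinote", "velocity"):
        if key in fields:
            fields[key] = int(fields[key])
    if "notename" in fields:
        name = fields["notename"]
        # Same conversion as in the question; NOTES.index raises ValueError
        # for a name the regex accepts but NOTES lacks (e.g. "e#").
        fields["notenametomidi"] = NOTES.index(name[:-1].lower()) + (int(name[-1]) + 2) * 12
    return fields

for fname in ["piano c3.wav", "piano c#3.wav", "piano d4.wav"]:
    print(fname, parse("piano %notename.wav", fname))
for fname in ["72__54.wav", "60__127.wav", "48__61.wav"]:
    print(fname, parse("%midinote__%velocity.wav", fname))
```

A limitation of this sketch: a keyword may appear only once per pattern, because the re module rejects a regex that defines the same group name twice.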

