Python re: pattern not matching as expected

I have a simple regular expression. My script parses a number of source code files, searches their lines for a pattern, and extracts the content enclosed in double quotes for use in a gettext .po file.

Here is my regex:

gettext_subject = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall

Here is a sample file:

exports.onAppointment = (appt, user, lang, isNew) ->
  if not user then return Promise.reject "Appointment has no user."
  moment.locale(lang)
  start = moment(appt.when)
  cal = new ICal()
  console.log appt.when
  cal.addEvent
    start: start.toDate()
    end: moment(start).add(2,"hours").toDate()
    summary: "Continental showroom visit"
  mail =
    to: user.emailId
    subject: if isNew then "New appointment" else "Appointment updated"
    alternatives: [
        contentType: "text/calendar",
        contents: new Buffer(cal.toString()),
        contentEncoding: "7bit"
      ]
  template =
    name: "booking"
    lang: lang
    locals:
      name: "#{user.firstName} #{user.lastName}"
      datetime: moment(appt.when).format("dddd Do MMMM [at] HH:mm A")
      cancelurl: config.server.baseUrl + "/appointment/cancel/#{appt._id}"
  emailClient.send2 mail, template

This simpler pattern works correctly:

gettext_subject = re.compile(r"""subject: \"(.*?)\"""").findall

Testing the bracketed pattern from the command line also returns the right answer:

$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> gettext = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall
>>> pattern = """subject: "blah blah blah"\nsummary: "summary text"\nsubject: "second subject line"\nsummary: if isNew then "New appointment" else "Appointment updated"\n"""
>>> print gettext(pattern)
['blah blah blah', 'summary text', 'second subject line', 'New appointment', 'Appointment updated']
>>> 

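One thing worth checking, offered as a sketch rather than a confirmed diagnosis: square brackets in a regex define a character class, so `[subject: |summary: ]` matches a single character from that set (s, u, b, a space, a colon, a pipe and so on), not the words themselves. The interactive test above may only look right because every quoted string in it happens to follow a space. The `sample` text below is made up for illustration and adds a `name:` line like the one in the sample file.

```python
import re

# The pattern from the question: the brackets form a character class, so it
# matches ONE character out of "subject: |summary: " followed by a quoted string.
char_class = re.compile(r'[subject: |summary: ]"(.*?)"').findall

# Made-up input for illustration; the first line is neither a subject nor a summary.
sample = (
    'name: "booking"\n'
    'summary: "Continental showroom visit"\n'
    'subject: if isNew then "New appointment" else "Appointment updated"\n'
)

# Every quoted string preceded by a space is captured, including "booking".
print(char_class(sample))
```

The printed list contains 'booking' alongside the three intended strings, which suggests the pattern is matching on the preceding space rather than on the key name.
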
But when I run it through my script, it does not work. Here is the code:

import os
import sys
import re
from operator import itemgetter

walk_dir = ["app", "email", "views"]
#t=(" ")
gettext_messages = re.compile(r"""\"(.*)\"""", re.MULTILINE).findall
gettext_re = re.compile(r"""[=|#|{]t\(\"(.*?)\"""").findall
gettext_subject = re.compile(r"""[subject: |summary: ]\"(.*?)\"""").findall

gettext = []
for x in walk_dir:
    curr_dir = "../node-blade-boiler-template/" + x
    for root, dirs, files in os.walk(curr_dir, topdown=False):
        if ".git" in dirs:
            dirs.remove(".git")
        if "node-modules" in dirs:
            dirs.remove("node-modules")
        if "models" in dirs:
            dirs.remove("models")

        for filename in files:
            file_path = os.path.join(root, filename)
            #print('\n- file %s (full path: %s)' % (filename, file_path))
            with open(file_path, 'rb') as f:
                f_content = f.read()
                if 'messages.coffee' == filename:
                    #pass
                    msgids = gettext_messages(f_content)
                elif 'map.coffee' == filename:
                    pass
                elif 'emailtrigger.coffee' == filename:
                    #print f_content
                    if 'subject: ' in f_content:
                        print gettext_subject(f_content)
                        msgids = gettext_subject(f_content)

                else:
                    msgids = gettext_re(f_content)
                for msgid in msgids:
                    msgid = '"' + msgid + '"'
                    #print msgid
                    dic = {
                    'path' : file_path,
                    'msgid' : "%s" % msgid
                    }
                    gettext.append(dic)
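
A side note on the directory walk, offered as a sketch: according to the `os.walk` documentation, editing `dirs` in place only prunes the traversal when `topdown=True`. With `topdown=False` the subdirectories have already been visited by the time the list is edited, so the `dirs.remove(...)` calls above would not stop `.git` or `models` from being scanned. The helper name `walk_pruned` and the `SKIP_DIRS` set are hypothetical; the directory names are copied from the script.

```python
import os
import tempfile

# Hypothetical skip list, copied from the script. npm's folder is normally
# spelled "node_modules"; adjust if that is what was meant.
SKIP_DIRS = {".git", "node-modules", "models"}

def walk_pruned(top):
    """Yield file paths under top without descending into SKIP_DIRS."""
    for root, dirs, files in os.walk(top, topdown=True):
        # In-place slice assignment; os.walk only honours it when topdown=True.
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for filename in files:
            yield os.path.join(root, filename)

# Tiny throwaway tree to show the effect.
demo = tempfile.mkdtemp()
for sub, name in ((".git", "config"), ("app", "a.coffee")):
    os.makedirs(os.path.join(demo, sub))
    open(os.path.join(demo, sub, name), "w").close()

found = sorted(os.path.relpath(p, demo) for p in walk_pruned(demo))
print(found)  # only the file under app/ is listed
```
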

Any advice is much appreciated.
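
One possible approach, a minimal sketch and not the only fix: first select the lines whose key is `subject` or `summary`, then pull every double-quoted string from the rest of that line. This also covers the conditional `subject:` line, which a plain `(?:subject|summary): "(.*?)"` alternation would skip because the quote does not directly follow the key. The names `KEY_LINE`, `QUOTED`, and `extract_subjects` are hypothetical, and the sketch assumes each message sits on a single line with no escaped quotes.

```python
import re

# Hypothetical helper patterns (not from the original script).
# KEY_LINE captures the remainder of any line whose key is subject or summary.
KEY_LINE = re.compile(r'^[ \t]*(?:subject|summary):(.*)$', re.MULTILINE)
# QUOTED captures the text between a pair of double quotes.
QUOTED = re.compile(r'"(.*?)"')

def extract_subjects(text):
    """Return the quoted strings found on subject:/summary: lines only."""
    found = []
    for rest in KEY_LINE.findall(text):
        found.extend(QUOTED.findall(rest))
    return found

# Made-up input modelled on the sample file.
sample = (
    'name: "booking"\n'
    '    summary: "Continental showroom visit"\n'
    '    subject: if isNew then "New appointment" else "Appointment updated"\n'
    '    contentEncoding: "7bit"\n'
)

print(extract_subjects(sample))
```

In the script, `msgids = extract_subjects(f_content)` could stand in for the `gettext_subject(f_content)` call; the `name:` and `contentEncoding:` strings are left out because their lines do not start with either key.
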

