Continuing pattern matches to identify SQL clauses

I am attempting to write a function in R that uses regular expressions to identify table names from SQL code.

I am using capturing groups to identify any words/alpha-num expressions that are found after from or join, but if my understanding is correct, capturing groups only allow the last captured group to be accessible.

##sample SQL code:
mySql <- 'select all from table1 join table2 join table3 join new_table3'

gsub('.*(from|join)s*([[:alnum:]]+_*[[:alnum:]]+).*', '1 : 2', mySql)

##result is only the last table in SQL code:
##"join : new_table3"

But what I want are all the table names to be returned, like so:

from : table1
join : table2
join : table3
join : new_table3

Or is this even a valid use of regex, as I think it is?

Source: regex

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.