regex

Replacing HTML code in python

I’m using regular expressions to parse a website’s source code and display a news headline in a Tkinter window. I have been told that parsing HTML with regex isn’t the best idea, but unfortunately I do not have time to change that now. I can’t seem to replace the HTML codes for special characters such as the apostrophe. I currently have the following:…
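
A minimal sketch of one way to decode HTML character references, assuming the headline has already been extracted as a string. The sample headline is hypothetical (the question's own code is cut off above); `html.unescape` is part of the Python 3.4+ standard library.

```python
import html

# Hypothetical headline text; the question's real input is not shown.
raw_headline = "Britain&#39;s &quot;big freeze&quot; &amp; what comes next"

# html.unescape converts named and numeric character references
# (&amp;, &quot;, &#39;, ...) into the characters they stand for.
headline = html.unescape(raw_headline)
print(headline)
```

This sidesteps writing one regex substitution per entity, which tends to miss numeric forms such as `&#39;`.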

String containing hashtag to reactjs element

Hi all, I am using CoffeeScript and ReactJS for my site. Development was smooth until I had to take a string containing hashtags and convert each hashtag into a React element. React element: @HashtagLink = React.createClass displayName: 'hashtag link' render: -> React.DOM.a null, href: '/hashtag/' + @props.hashtag @props.hashtag JS function: @hashTagToLink = (string) -> string.replace(/#(\S*)/g, '<a href="http://emilenriquez.com">$1</a>') Sample string:…

How do I find the 1st character after the nth separation space

I have a text file composed of lines such as this example: Oct 10 21:56:21 2015 QST Aldrin completed quest ‘has proven their patience & kindness in their path to becoming a ranger.’ Suppose that I always want to land on the character ‘c’ of the word ‘completed’ above, using Vim. Any ideas? Thank you!

Trying to get a regex to recognize and extract words from both camelCase and CamelCase

I’ve got this halfway working. This works great: 'MyOwnVar'.match(/([a-z]*)([A-Z][a-z]+)/g) Result: ["My", "Own", "Var"] The goal is to pull out the individual words. But if I pass a camelCase name to it: 'myOwnVar'.match(/([a-z]*)([A-Z][a-z]+)/g) I get: ["myOwn", "Var"] I can’t figure out what I’m doing wrong. As far as I can tell, two sets of () should store the matching results in…
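
With the `g` flag, `match` returns whole matches rather than capture groups, and `[a-z]*` consumes the leading "my" before `[A-Z][a-z]+` takes "Own", so both land in one match. A minimal sketch of a pattern that treats the leading capital as optional, written in Python here (the pattern itself should carry over to JavaScript unchanged); `split_camel` is a hypothetical helper name, and acronyms or digits are assumed not to occur.

```python
import re

def split_camel(name):
    # An optional capital followed by one or more lowercase letters
    # matches "my", "Own" and "Var" as separate words.
    return re.findall(r"[A-Z]?[a-z]+", name)

print(split_camel("myOwnVar"))
print(split_camel("MyOwnVar"))
```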

Handling single and double quotes using SED

Let’s say I have a file test.txt which contains the following: This or ‘is nothing or’ or “that or” if it is or Now I would like to replace every or that is not inside quotes (either single or double quotes). I want to achieve this using sed. So my input would look like: This or ‘is…
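
The question asks for sed; as an illustration of the underlying idea only, here is a sketch in Python of the "match quoted spans first, then the bare word" technique. It assumes plain ASCII quotes and no escaped or nested quotes; the replacement word "and" and the name `replace_outside_quotes` are hypothetical, since the desired output is cut off above.

```python
import re

# Quoted spans come first in the alternation, so they are consumed whole
# and any "or" inside them is never seen by the last alternative.
PATTERN = re.compile(r"'[^']*'|\"[^\"]*\"|\bor\b")

def replace_outside_quotes(text, replacement):
    def swap(match):
        token = match.group(0)
        # Quoted spans include their quote characters, so only a bare
        # "or" can equal the token exactly.
        return replacement if token == "or" else token
    return PATTERN.sub(swap, text)

line = "This or 'is nothing or' or \"that or\" if it is or"
print(replace_outside_quotes(line, "and"))
```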

How to replace non-character surrounded string in python

I have a variable holding a collection of arguments (newArgs), and I want to replace any occurrence of a certain string inside this variable. I can get into my if condition using the re.search command: if ( re.search(item, newArgs) ): the ‘if’ branch won’t do anything if the search found the exact item inside newArgs; when it enters the ‘else’…
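
A minimal sketch of replacing a string only where it is not surrounded by word characters, which is one reading of the title (the question body is cut off, so this interpretation is an assumption). `replace_standalone` is a hypothetical helper; `re.escape` guards against regex metacharacters in `item`.

```python
import re

def replace_standalone(item, replacement, text):
    # Lookarounds require that no word character touches the item
    # on either side, without consuming the neighbours.
    pattern = r"(?<!\w)" + re.escape(item) + r"(?!\w)"
    # A function replacement keeps backslashes in `replacement` literal.
    return re.sub(pattern, lambda m: replacement, text)

print(replace_standalone("lib", "LIB", "lib libfoo lib.so mylib"))
```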

Substitute Emoji with its description or name

I’m working on getting all emojis from a text retrieved from an API. What I’d like to do is substitute each emoji with its description or name. I’m working on Python 3.4 and my current approach is accessing the character’s Unicode name with unicodedata like this: nname = unicodedata.name(my_unicode) And I’m substituting with re.sub: re.sub('[\U0001F602-\U0001F64F]', 'new string', str(orig_string)) I’ve tried re.search…
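
A minimal sketch of combining the two pieces by passing a function to `re.sub`, so each matched character is looked up individually. It assumes only the Emoticons block (U+1F600 to U+1F64F) matters; the `:NAME:` output format and the name `describe_emoji` are illustrative choices, not part of the question.

```python
import re
import unicodedata

# Character class covering the Unicode Emoticons block only.
EMOTICONS = re.compile("[\U0001F600-\U0001F64F]")

def describe_emoji(text):
    def name_of(match):
        ch = match.group(0)
        # The second argument is a fallback for code points without a name.
        return ":" + unicodedata.name(ch, "UNKNOWN") + ":"
    return EMOTICONS.sub(name_of, text)

print(describe_emoji("so funny \U0001F602"))
```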

How can I index group elements in re.sub?

How can I index a character in a matched group from within the re.sub replacement? I am trying to do this: import re string_test = "this.day" string_test = re.sub("^(.*?)\.(.*?)", "\g<1>[0].\g<2>", string_test) My result should be: "t.day". Is it possible to somehow index the \g<1> group within re.sub? Obviously \g<1>[0] doesn’t work.
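
A replacement template cannot index into a group, but `re.sub` also accepts a function that receives the match object, where ordinary string indexing works. A minimal sketch, assuming the dot in the pattern is meant literally; `abbreviate_first_part` is a hypothetical name.

```python
import re

def abbreviate_first_part(text):
    # m.group(1)[:1] keeps the first character of group 1 and stays
    # safe when that group is empty.
    return re.sub(
        r"^(.*?)\.(.*)$",
        lambda m: m.group(1)[:1] + "." + m.group(2),
        text,
    )

print(abbreviate_first_part("this.day"))
```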

Pyparsing ignore except

I have a file with pythonStyleComments in its lines, for example: def foo(): # declare # Simple function a = 0 # TODO: add random return a I want to add .ignore(pythonStyleComments) to my pyparsing grammar, but still handle any meta comments (such as TODO: ). I know all the meta words, so how can I exclude these comments from being ignored? Maybe…