Regular expressions

r

#1

I have a column with different item names, and I want to remove all the items starting with "SP",
for example SP icecream, SP tea, etc.

How can I remove them?

Thanks


#2

You can use the 'gsub' function to replace text inside a string.
For example:
# replace "da" with "sh" in the string "florida"

gsub("da", "sh", "florida", ignore.case = TRUE) # ignore.case = TRUE makes the match case-insensitive
[1] "florish"

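For your example, where you want to drop the matching items rather than replace text:

column_name[!grepl("^SP", column_name, ignore.case = TRUE)]
# ignore.case is optional. This returns all the values in the column except those starting with "SP".
# grepl is used instead of a negative grep index because column_name[-grep(...)] returns an empty vector when nothing matches.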


#3

If you are using Python, you can use pandas to select the rows and remove them:

# find rows whose value starts with "SP" (na=False treats missing values as non-matches)
condition = data['column_name'].str.contains(r'^SP', na=False)

# keep only the rows that don't satisfy the condition
cleaned_data = data[~condition]
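
To try this end to end, here is a small self-contained sketch. The DataFrame contents and the column name "item" are made up for illustration and are not from the original question; swap in your own data. It also shows a case-insensitive variant in case values like "sp water" should be removed too.

```python
import pandas as pd

# Hypothetical sample data; the column name "item" is an assumption.
data = pd.DataFrame(
    {"item": ["SP icecream", "SP tea", "coffee", "CRISP chips", None, "sp water"]}
)

# "^" anchors the pattern to the start of the string, so "CRISP chips"
# (which contains "SP" but does not start with it) is not matched.
# na=False treats missing values as non-matches, keeping the mask boolean.
starts_with_sp = data["item"].str.contains(r"^SP", na=False)

# Keep the rows that do NOT start with "SP" (case-sensitive).
cleaned_data = data[~starts_with_sp]

# Case-insensitive variant: also drops values starting with "sp", "Sp", etc.
starts_with_sp_ci = data["item"].str.contains(r"^SP", case=False, na=False)
cleaned_data_ci = data[~starts_with_sp_ci]

print(cleaned_data)
print(cleaned_data_ci)
```

Note that the row with the missing value is kept in both results, since a missing name does not start with "SP". If you want those rows gone as well, calling dropna on the result is one option.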