How to select records containing certain names in a column in R

r
data_wrangling

#1

Hello,

I am trying to keep only the records which contain a particular promotion name from a list of promotion names in R.

top_10_promo_name <- c("Price Winners",
                       "Two Day Sale",
                       "Weekend Markdown",
                       "Go For It",
                       "Save-It Sale",
                       "One Day Sale",
                       "Shelf Clearing Days",
                       "Big Time Discounts",
                       "Super Duper Savers",
                       "Two for One")

sales_promo_10 <- sales_promo[as.character(sales_promo$promotion_name) %in% top_10_promo_name,]

However, even though there is no error, all the promotion names are still getting selected.

I know this is a very basic question, but I have not been able to figure it out. Can someone please help me with this?


#2

Hi @pagal_guy,

Your code is perfectly fine. If you look at your output, i.e. the sales_promo_10 dataset, you will find that it has been subsetted to only the top 10 promo names. It is your method of checking the output that is actually wrong.

Fact 1: Subsetting a dataset in R does not change the levels of a factor column. Say promotion_name has 51 levels and you subset on the top 10. This does not change the levels of that column; it is still a factor with 51 levels, and the levels you filtered out simply have no rows left.
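To see this behaviour on a small scale, here is a minimal sketch. The data frame below is a made-up stand-in for sales_promo (the sales column, the "No Promotion" value and the `keep` vector are invented for illustration), so treat it as a demonstration rather than your actual data:

```r
# Toy stand-in for sales_promo; all values are invented for illustration.
sales_promo <- data.frame(
  promotion_name = factor(c("Two Day Sale", "Go For It", "No Promotion", "Go For It")),
  sales = c(10, 20, 30, 40)
)
keep <- c("Two Day Sale", "Go For It")

# Same subsetting pattern as in the question.
subset_df <- sales_promo[as.character(sales_promo$promotion_name) %in% keep, ]

# The rows are filtered as expected: only the promotions in `keep` remain.
nrow(subset_df)
unique(as.character(subset_df$promotion_name))

# But the factor still carries every original level, including "No Promotion".
nlevels(subset_df$promotion_name)
table(subset_df$promotion_name)  # the filtered-out level shows up with count 0
```

If you checked the result with something like levels(), str() or table(), the unused levels would still be listed, which can make it look as if nothing was filtered.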

So two points -

  • Your code is fine. Export this dataset or run View(sales_promo_10) and you will find that it has actually been subsetted.

  • If you want promotion_name to become a factor with 10 levels, you have to re-level the column and define the levels yourself.
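For the second point, a rough sketch of two ways to re-level in base R. The small factor here is again a made-up stand-in for your data, and the names `promo`, `keep`, `dropped` and `releveled` are only for illustration:

```r
# Toy stand-in for the subsetted data; values are invented for illustration.
promo <- factor(c("Two Day Sale", "Go For It", "No Promotion"))
keep <- c("Two Day Sale", "Go For It")
sales_promo_10 <- data.frame(promotion_name = promo[promo %in% keep])

# After subsetting, the unused level is still attached to the factor.
nlevels(sales_promo_10$promotion_name)

# Option 1: drop whatever levels no longer occur in the data.
dropped <- droplevels(sales_promo_10$promotion_name)
levels(dropped)

# Option 2: rebuild the factor and define the levels (and their order) yourself.
releveled <- factor(as.character(sales_promo_10$promotion_name), levels = keep)
levels(releveled)
```

droplevels() keeps the remaining levels in their existing order, while rebuilding with factor(..., levels = ) lets you choose the order, for example to match top_10_promo_name.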

Hope this helps.

Regards,
Aayush