Web scraping in R with top 100 movies on IMDB post

r
data_wrangling
data_science

#1

The original post suggested asking here in the discussion forum because of its age.

First, the IMDB movie page (the page being used below) is a little different now than it was at the time of the post, so I can’t do everything as in the original post. At this point, I’m only trying to scrape the ranking, title, and IMDB rating.

Judging by the head of the data pulled, the ranking and title were scraped successfully.
There are two different issues with the IMDB rating. The first is likely solved by applying the method you used for the metascores to the IMDB ratings. OK, no problem.
The second issue I can’t figure out: based on the head of the data pulled, I am getting “The Godfather, Part II” and “Jennifer Aniston”, and I don’t see how. The CSS selector used was “strong”.

I will keep playing with it, but any help you can offer is appreciated.

Thank you.


#2

Hi @herbacidal

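Please use “td strong” instead of “strong” as the CSS selector to extract ratings from the page. The bare “strong” selector matches every bold element on the page, while “td strong” only matches those inside table cells.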
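
To illustrate the difference between the two selectors, here is a minimal sketch using rvest. The HTML snippet is a made-up stand-in, not the real IMDB markup (which may be structured differently), and the variable names are only for illustration.

```r
library(rvest)

# Hypothetical stand-in for the page; the real IMDB markup may differ.
page <- read_html('
<html><body>
  <p>Featured: <strong>Jennifer Aniston</strong></p>
  <p>Trending: <strong>The Godfather, Part II</strong></p>
  <table>
    <tr><td>1. The Godfather</td><td><strong>9.2</strong></td></tr>
    <tr><td>2. The Godfather, Part II</td><td><strong>9.0</strong></td></tr>
  </table>
</body></html>')

# "strong" matches every <strong> element, including ones outside the table.
all_strong <- html_text(html_nodes(page, "strong"))

# "td strong" matches only <strong> elements nested inside a <td>.
td_strong <- html_text(html_nodes(page, "td strong"))

# Convert the scraped text to numbers once only ratings remain.
ratings <- as.numeric(td_strong)
```

In this toy page, `all_strong` picks up the two unrelated bold names as well as the ratings, while `td_strong` contains only the two rating strings.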


#3

Well, that fixed the original error, but the results still aren’t what I wanted: there are a huge number of results, with large spaces in between. I’m still trying to solve that. Thank you for solving the first issue.
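
One possible way to deal with the extra results and spacing, as a rough sketch only: trim the whitespace from each scraped string and keep just the entries that look like a rating. The `raw` vector below is invented sample data standing in for whatever the selector returns, and the "looks like a number" rule is an assumption about what a rating is. Passing `trim = TRUE` to rvest's `html_text()` would handle the trimming step at scrape time.

```r
# Invented sample of scraped text: ratings padded with whitespace,
# mixed with empty strings and unrelated matches.
raw <- c("\n        9.2\n    ", "", "   ", "\n 9.0 ", "Votes")

# Strip leading and trailing whitespace (spaces, tabs, newlines).
cleaned <- trimws(raw)

# Keep only entries that are a plain number such as "9.2" or "9".
is_rating <- grepl("^[0-9]+(\\.[0-9]+)?$", cleaned)

# Convert the surviving strings to numeric ratings.
ratings <- as.numeric(cleaned[is_rating])
```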