I have been working on an assignment where I have to get the names of movies from the IMDb site.
This can be done using web scraping, but I am confused about how to start and which libraries to use.
Can anyone help me by outlining the basic steps to get some data from a website?
There are several ways to go about this. The easiest is probably to use import.io; see their website for details. Basically, they take web data and convert it into tables, which can easily be read into pandas DataFrames.
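As a rough sketch of that last step, assuming the extracted table has been exported as CSV: the column names and rows below are made up for illustration, and an in-memory string stands in for the exported file.

```python
import io

import pandas as pd

# Hypothetical CSV export; a real export would have its own columns.
CSV_EXPORT = """title,year
The Shawshank Redemption,1994
The Godfather,1972
"""

# read_csv accepts a file path or any file-like object.
movies = pd.read_csv(io.StringIO(CSV_EXPORT))

# Pull the movie names out of the (assumed) "title" column.
names = movies["title"].tolist()
print(names)
```

With a real export you would pass the file path to `pd.read_csv` instead of the `StringIO` object.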
Alternatively, you can look at libraries like BeautifulSoup and Scrapy. The tutorials on their sites are a good way to learn these libraries.
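To give an idea of the basic steps with BeautifulSoup (get the HTML, parse it, select the elements you want), here is a minimal sketch. The HTML snippet, the `title` class and the `extract_titles` helper are all hypothetical; the real IMDb markup is different, so you would need to inspect the page and adjust the selector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
# The real IMDb page structure is different and changes over time.
SAMPLE_HTML = """
<table>
  <tr><td class="title"><a href="/title/1/">The Shawshank Redemption</a></td></tr>
  <tr><td class="title"><a href="/title/2/">The Godfather</a></td></tr>
</table>
"""


def extract_titles(html):
    """Return the text of every link inside a <td class="title"> cell."""
    # Step 1: parse the HTML with Python's built-in parser.
    soup = BeautifulSoup(html, "html.parser")
    # Step 2: select the elements with a CSS selector and take their text.
    return [link.get_text(strip=True) for link in soup.select("td.title a")]


titles = extract_titles(SAMPLE_HTML)
print(titles)
```

For a live page you would first download the HTML, for example with `requests.get(url).text`, and pass that string to `extract_titles` in place of `SAMPLE_HTML`.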