How to extract data from an HTML file using Python?



I have been working through Google's basic Python exercises. I tried to read an HTML file using pd.read_html().
But since the data is not in a proper table format, only the ranks come out in order; string content such as the names comes out as NaN or NaT.

Data can be accessed from the following link:



You can extract information from an HTML file using the urllib library. It lets you read the HTML source code, after which you can apply regular expressions to extract the required information.
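
As a rough sketch of that approach: the snippet below reads HTML with urllib.request and pulls values out with the re module. The row layout (one rank cell followed by two name cells), the names `read_html`, `extract_rows`, and `ROW_PATTERN`, and the sample data are all assumptions made up for illustration, so the pattern will most likely need adjusting to match your actual file.

```python
import re
from urllib.request import urlopen


def read_html(url):
    """Fetch a page and return its HTML source as text.

    `url` can be an http(s):// address or a file:// URL for a local file.
    """
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Assumed (hypothetical) row layout:
#   <tr><td>RANK</td><td>NAME</td><td>NAME</td></tr>
ROW_PATTERN = re.compile(r"<td>(\d+)</td>\s*<td>(\w+)</td>\s*<td>(\w+)</td>")


def extract_rows(html):
    """Return a list of (rank, first_name, second_name) tuples."""
    return [(int(rank), a, b) for rank, a, b in ROW_PATTERN.findall(html)]


# Made-up sample standing in for the real file, so this runs offline.
sample = """
<table>
<tr><td>1</td><td>Alice</td><td>Bob</td></tr>
<tr><td>2</td><td>Carol</td><td>Dave</td></tr>
</table>
"""

rows = extract_rows(sample)
print(rows)
```

For a real page you would call `extract_rows(read_html(url))` with your own URL. If the file is already on disk, reading it with `open(path).read()` and passing the text to `extract_rows` works just as well.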

You can read more detail about urllib here: