How to get the text content of an HTML page while scraping with BeautifulSoup in Python

python

#1

Hi,

I want to extract the values that sit between the HTML tags and write them to a new output file.

I am pasting my HTML:

""div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:160px; top:172px; width:124px; height:17px;"
span style="font-family: ABCDEE+Calibri; font-size:5px"Average Monthly Use:

                                        brAverage price per kWh by TDSP territory:

                                            br
                                            /span
                                        /div
                                        div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:311px; top:171px; width:28px; height:18px;"
                                            span style="font-family: ABCDEE+Calibri,Bold; font-size:5px"500 kWh

                                                br
                                                /span
                                                span style="font-family: ABCDEE+Calibri; font-size:5px"12.1

                                                    br
                                                    /span
                                                /div
                                                div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:382px; top:171px; width:32px; height:18px;"
                                                    span style="font-family: ABCDEE+Calibri,Bold; font-size:5px"1000 kWh

                                                        br
                                                        /span
                                                        span style="font-family: ABCDEE+Calibri; font-size:5px"5.1

                                                            br
                                                            /span
                                                        /div
                                                        div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:457px; top:171px; width:32px; height:18px;"
                                                            span style="font-family: ABCDEE+Calibri,Bold; font-size:5px"2000 kWh

                                                                br
                                                                /span
                                                                span style="font-family: ABCDEE+Calibri; font-size:5px"7.6

                                                                    br
                                                                    /span
                                                                /div

Sorry, I have removed the <> from the tags.

Please help me get the values from the tags.

Thanks in advance.

Neel


#2

@Neel,

You can use regular expressions to read specific content from an HTML file. A regular expression describes a sequence of characters, which lets you find and replace patterns in a string or file. Regular expressions are supported by most programming languages, including Python, Perl, R, and Java. For more detail on regular expressions, you can read the article here.
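
As a rough illustration of the idea, here is a minimal sketch using Python's built-in re module. It assumes the angle brackets are put back into the snippet from post #1, and the sample markup is shortened to the three kWh blocks. The names SAMPLE_HTML, extract_span_texts, pair_usage_with_price and write_pairs are made up for this example and do not come from any library. Regular expressions are fragile on HTML in general, so treat this as something that may work for this particular layout rather than a general solution; an HTML parser such as BeautifulSoup is usually more robust.

```python
import csv
import re

# Assumption: the snippet from post #1 with the <> restored, shortened for the example.
SAMPLE_HTML = """
<div style="position:absolute; left:311px; top:171px; width:28px; height:18px;">
  <span style="font-family: ABCDEE+Calibri,Bold; font-size:5px">500 kWh
    <br>
  </span>
  <span style="font-family: ABCDEE+Calibri; font-size:5px">12.1
    <br>
  </span>
</div>
<div style="position:absolute; left:382px; top:171px; width:32px; height:18px;">
  <span style="font-family: ABCDEE+Calibri,Bold; font-size:5px">1000 kWh
    <br>
  </span>
  <span style="font-family: ABCDEE+Calibri; font-size:5px">5.1
    <br>
  </span>
</div>
<div style="position:absolute; left:457px; top:171px; width:32px; height:18px;">
  <span style="font-family: ABCDEE+Calibri,Bold; font-size:5px">2000 kWh
    <br>
  </span>
  <span style="font-family: ABCDEE+Calibri; font-size:5px">7.6
    <br>
  </span>
</div>
"""

# Capture everything between an opening <span ...> and its closing </span>.
SPAN_BODY = re.compile(r"<span[^>]*>(.*?)</span>", re.DOTALL | re.IGNORECASE)
# Match any remaining tag (for example <br>) inside the captured text.
ANY_TAG = re.compile(r"<[^>]+>")


def extract_span_texts(html):
    """Return the visible text of every span, with inner tags and extra whitespace removed."""
    texts = []
    for body in SPAN_BODY.findall(html):
        text = " ".join(ANY_TAG.sub(" ", body).split())
        if text:
            texts.append(text)
    return texts


def pair_usage_with_price(texts):
    """Pair each 'NNN kWh' label with the number that directly follows it."""
    pairs = []
    for label, value in zip(texts, texts[1:]):
        if label.endswith("kWh") and re.fullmatch(r"\d+(\.\d+)?", value):
            pairs.append((label, float(value)))
    return pairs


def write_pairs(pairs, out_file):
    """Write the pairs as CSV rows to an already opened text file."""
    writer = csv.writer(out_file)
    writer.writerow(["usage", "price"])
    writer.writerows(pairs)


if __name__ == "__main__":
    for usage, price in pair_usage_with_price(extract_span_texts(SAMPLE_HTML)):
        print(usage, price)
```

To get the values into a new output file, you could open one with open("output.csv", "w", newline="") and pass it to write_pairs; the file name here is only a placeholder.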

Regards,
Imran


#3

@Imran,

Thanks, Imran. I got my values using regular expressions.

Thanks for the help!

-Neel