How to Extract Text From an HTML Tag in Text Format?


Extracting text from an HTML file is essentially the same as copying a web page's content and pasting it into a notepad. That sounds easy, but doing it by hand is impractical when you need to extract text from millions of HTML files (web pages).

Let's look at how to extract the text inside an HTML tag as plain text.

Extracting text from an HTML tag

HTML provides many elements that give text a specific meaning, such as headings and paragraphs. The following examples show how to extract the text inside such tags as plain text.

Example

In the following example, the script places the HTML string inside a temporary div element and then reads that element's text content to extract the text.

<!DOCTYPE html>
<html>
   <body>
      <script>
         // Parse the HTML string inside a detached div and read back its text.
         function gettext(html){
            var tempDivElement = document.createElement("div");
            tempDivElement.innerHTML = html;
            // textContent is the standard property; innerText is a fallback.
            return tempDivElement.textContent || tempDivElement.innerText || "";
         }
         var sentence = "<div><h1>Welcome to Tutorialspoint</h1></div>";
         document.write(gettext(sentence));
      </script>
   </body>
</html>

When the script runs, the browser parses the string inside the temporary div, the function returns only its text, and the page displays: Welcome to Tutorialspoint
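
As a variation on the same idea, the sketch below parses the string with the browser's DOMParser instead of assigning it to innerHTML, so the markup is parsed into a separate document rather than into an element created by the page. The function name getTextWithParser is hypothetical, and the fallback branch is an assumption added only so the snippet also runs where no DOM is available (for example, Node.js).

```javascript
// Hypothetical helper: extract text by parsing the HTML into a separate document.
function getTextWithParser(html) {
  if (typeof DOMParser === "undefined") {
    // No DOM available (e.g. Node.js): fall back to plain tag stripping.
    return html.replace(/<[^>]+>/g, "");
  }
  // In a browser, parse the string as an HTML document.
  const doc = new DOMParser().parseFromString(html, "text/html");
  // textContent of the body holds the text of every parsed element.
  return doc.body.textContent || "";
}

const sentence = "<div><h1>Welcome to Tutorialspoint</h1></div>";
console.log(getTextWithParser(sentence));
```

For this input, both branches return the same text as the gettext function in the example.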

Example

In the following example, the script uses a regular expression to remove the tags from the HTML string, leaving only the text.

<!DOCTYPE html>
<html>
   <body>
      <script>
         var statement = "<div><h1>TutorialsPoint</h1><p> is the Best E-Learning</p></div>";
         var result = statement.replace(/<[^>]+>/g, '');
         document.write(result);
      </script>
   </body>
</html>

When the script runs, the regular expression /<[^>]+>/g removes every tag from the string, and the page displays the remaining text: TutorialsPoint is the Best E-Learning
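
Removing tags with a regular expression leaves character entities such as &amp;amp; untouched and keeps any extra whitespace. The sketch below extends the same replace call to handle both. The function name stripTags and the short entity table are assumptions made for illustration; the table is deliberately incomplete, and a regular expression is not a full HTML parser.

```javascript
// Hypothetical helper, not part of the original example.
function stripTags(html) {
  // Small, deliberately incomplete table of common entities.
  // &nbsp; is mapped to a regular space here as a simplification.
  const entities = {
    "&lt;": "<",
    "&gt;": ">",
    "&quot;": '"',
    "&#39;": "'",
    "&nbsp;": " ",
    "&amp;": "&"
  };
  return html
    // Drop anything that looks like a tag.
    .replace(/<[^>]+>/g, "")
    // Decode the listed entities in a single pass to avoid double decoding.
    .replace(/&(?:lt|gt|quot|#39|nbsp|amp);/g, (match) => entities[match])
    // Collapse runs of whitespace and trim the ends.
    .replace(/\s+/g, " ")
    .trim();
}

const statement = "<div><h1>TutorialsPoint</h1><p> is the Best E-Learning</p></div>";
console.log(stripTags(statement));
console.log(stripTags("<p>Tom &amp; Jerry</p>"));
```

Tags are removed before entities are decoded so that a decoded &amp;lt; is never mistaken for the start of a tag.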

Updated on: 23-Nov-2023
