jsoup - Set Attributes



Following example will showcase use of method to set attributes of a dom element, bulk updates and add/remove class methods after parsing an HTML String into a Document object.

Syntax

Document document = Jsoup.parse(html);
Element link = document.select("a").first();         
link.attr("href","www.yahoo.com");     
link.addClass("header"); 
link.removeClass("header");    

Where

  • document − document object represents the HTML DOM.

  • Jsoup − main class to parse the given HTML String.

  • html − HTML String.

  • link − Element object represent the html node element representing anchor tag.

  • link.attr() − attr(attribute,value) method set the element attribute the corresponding value.

  • link.addClass() − addClass(class) method add the class under class attribute.

  • link.removeClass() − removeClass(class) method remove the class under class attribute.

Description

Element object represent a dom elment and provides various method to get the attribute of a dom element.

Example

Create the following java program using any editor of your choice in say C:/> jsoup.

JsoupTester.java

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupTester {
   public static void main(String[] args) {
   
      String html = "<html><head><title>Sample Title</title></head>"
         + "<body>"
         + "<p>Sample Content</p>"
         + "<div id='sampleDiv'><a id='googleA' href='www.google.com'>Google</a></div>"
         + "<div class='comments'><a href='www.sample1.com'>Sample1</a>"
         + "<a href='www.sample2.com'>Sample2</a>"
         + "<a href='www.sample3.com'>Sample3</a><div>"
         +"</div>"
         + "<div id='imageDiv' class='header'><img name='google' src='google.png' />"
         + "<img name='yahoo' src='yahoo.jpg' />"
         +"</div>"
         +"</body></html>";
      Document document = Jsoup.parse(html);

      //Example: set attribute
      Element link = document.getElementById("googleA");
      System.out.println("Outer HTML Before Modification :"  + link.outerHtml());
      link.attr("href","www.yahoo.com");      
      System.out.println("Outer HTML After Modification :"  + link.outerHtml());
      System.out.println("---");
      
      //Example: add class
      Element div = document.getElementById("sampleDiv");
      System.out.println("Outer HTML Before Modification :"  + div.outerHtml());
      link.addClass("header");      
      System.out.println("Outer HTML After Modification :"  + div.outerHtml());
      System.out.println("---");
      
      //Example: remove class
      Element div1 = document.getElementById("imageDiv");
      System.out.println("Outer HTML Before Modification :"  + div1.outerHtml());
      div1.removeClass("header");      
      System.out.println("Outer HTML After Modification :"  + div1.outerHtml());
      System.out.println("---");
      
      //Example: bulk update
      Elements links = document.select("div.comments a");
      System.out.println("Outer HTML Before Modification :"  + links.outerHtml());
      links.attr("rel", "nofollow");
      System.out.println("Outer HTML Before Modification :"  + links.outerHtml());
   }
}

Verify the result

Compile the class using javac compiler as follows:

C:\jsoup>javac JsoupTester.java

Now run the JsoupTester to see the result.

C:\jsoup>java JsoupTester

See the result.

Outer HTML Before Modification :<a id="googleA" href="www.google.com">Google</a>
Outer HTML After Modification :<a id="googleA" href="www.yahoo.com">Google</a>
---
Outer HTML Before Modification :<div id="sampleDiv">
 <a id="googleA" href="www.yahoo.com">Google</a>
</div>
Outer HTML After Modification :<div id="sampleDiv">
 <a id="googleA" href="www.yahoo.com" class="header">Google</a>
</div>
---
Outer HTML Before Modification :<div id="imageDiv" class="header">
 <img name="google" src="google.png">
 <img name="yahoo" src="yahoo.jpg">
</div>
Outer HTML After Modification :<div id="imageDiv" class="">
 <img name="google" src="google.png">
 <img name="yahoo" src="yahoo.jpg">
</div>
---
Outer HTML Before Modification :<a href="www.sample1.com">Sample1</a>
<a href="www.sample2.com">Sample2</a>
<a href="www.sample3.com">Sample3</a>
Outer HTML Before Modification :<a href="www.sample1.com" rel="nofollow">Sample1</a>
<a href="www.sample2.com" rel="nofollow">Sample2</a>
<a href="www.sample3.com" rel="nofollow">Sample3</a>
Advertisements