How to write a simple internet/web robot in Java?

A robot sounds like a difficult thing to build. But before we decide that, let’s try writing one first.

I started writing simple internet/web robot applications in Java: a robot that scrapes data from the web, and a robot that adds entries to your website. And of course, you can write any kind of robot you like.

There are a lot of web tools you can freely download from the internet. Since I write my robots in Java, I use HtmlUnit and Selenium RC (Remote Control). These are the only web tools I have tried and tested so far. Each has its own usage and purpose.

In my experience, HtmlUnit works well for gathering and processing data from a website, while Selenium Remote Control works well for automating my own website: logging in and processing data. These are very interesting things I’ve learned so far.

Here is Selenium Remote Control Java source code that opens Google, types “Ziplok Java” in the search box and clicks the search button:

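import com.thoughtworks.selenium.DefaultSelenium;
import com.thoughtworks.selenium.Selenium;

public class GoogleRobotSearch {
 private Selenium sel;

 public GoogleRobotSearch () {
  // A Selenium RC server must already be running on localhost:4444
  sel = new DefaultSelenium("localhost", 4444, "*firefox", "http://www.google.com");
  sel.start();
 }

 public void search() {
  try {
   sel.open("http://www.google.com/webhp?hl=en");
   sel.type("q", "Ziplok Java");       // "q" is the search box
   sel.click("btnG");                  // "btnG" is the search button
   sel.waitForPageToLoad("5000");      // timeout in milliseconds
  } finally {
   // Always close the browser session, even if a step above fails
   sel.stop();
  }
 }

 public static void main (String args[]) {
  GoogleRobotSearch xybot = new GoogleRobotSearch ();
  xybot.search();
 }
}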

And here is the equivalent Java source code written with HtmlUnit:

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlSubmitInput;
import com.gargoylesoftware.htmlunit.html.HtmlTextInput;

public class GoogleRobotSearch {
 private String bUrl;

 public GoogleRobotSearch (String url) throws Exception {
  bUrl = url;
 }

 public void search () throws Exception {
  // Load the start page in HtmlUnit's headless browser
  WebClient wb = new WebClient ();
  HtmlPage p = (HtmlPage) wb.getPage(bUrl);

  // Find the search form ("f"), its text box ("q") and its button ("btnG")
  HtmlForm f = p.getFormByName("f");
  HtmlTextInput text = (HtmlTextInput) f.getInputByName("q");
  HtmlSubmitInput submit = (HtmlSubmitInput) f.getInputByName("btnG");
  text.setValueAttribute("Ziplok Java");

  // Submit the form and print the result page as plain text
  HtmlPage resultPage = (HtmlPage) submit.click();
  System.out.println(resultPage.asText());
 }

 public static void main (String args[]) throws Exception {
  GoogleRobotSearch xyro = new GoogleRobotSearch ("http://www.google.com/");
  xyro.search ();
 }
}
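
The HtmlUnit robot above only prints the text of the result page. A scraping robot usually goes one step further and picks specific values out of the page. Here is a minimal sketch of that step using only the Java standard library: it pulls the href values out of a hard-coded HTML snippet with a regular expression. The class name LinkExtractor, the method extractLinks and the example.com URLs are made up for illustration. A regular expression is a shortcut that only copes with simple, well-formed markup; inside a real HtmlUnit robot, the page's own DOM methods (for example HtmlPage.getAnchors()) would likely be the more robust choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HtmlUnit or Selenium.
public class LinkExtractor {
    // Matches <a ... href="..."> and captures the value between the quotes.
    private static final Pattern HREF =
        Pattern.compile("<a\\s+[^>]*href=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    // Returns every double-quoted href value found in anchor tags, in page order.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        // Stand-in for the HTML a robot would have downloaded.
        String html = "<html><body>"
            + "<a href=\"http://example.com/a\">A</a> "
            + "<a class=\"x\" href=\"http://example.com/b\">B</a>"
            + "</body></html>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

In a real robot, the string passed to extractLinks would be the downloaded page source instead of the hard-coded snippet.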

That’s it! Writing a simple internet/web robot is easy.
