Get all URLs from a web page

In this post I’m sharing a class that extracts all valid URLs from a web page. The class builds on the “URLConnectionReader” example from Sun’s Java tutorial.
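For readers who have not seen that tutorial example, here is a minimal sketch of the same idea: open a URLConnection and read the page line by line. The class name `PageReader` and the UTF-8 charset are my assumptions for illustration, not part of the tutorial or of the class described in this post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper in the spirit of the tutorial's URLConnectionReader.
class PageReader {

    // Downloads the resource at pageUrl and returns its text,
    // with every line terminated by '\n'.
    static String read(String pageUrl) throws IOException {
        URLConnection conn = URI.create(pageUrl).toURL().openConnection();
        StringBuilder page = new StringBuilder();
        // Assumption: the page is UTF-8 encoded. A fuller version would
        // look at the charset in the Content-Type header instead.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        return page.toString();
    }
}
```

Calling `PageReader.read("http://hostname.domain/")` would return the page source as one string, ready for link extraction.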

The class defines two constructors:

  1. With the default constructor, the class returns a vector containing only the text/html URLs found on the page.
  2. The second constructor lets you specify the type of URLs you want from the page. This is helpful when you want all image, video, or other media URLs.
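The two-constructor idea can be sketched roughly as follows. This is a hypothetical stand-in, not the class from this post: the name `TypedUrlFilter`, its methods, and the prefix matching on the Content-Type header are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.util.Locale;

// Hypothetical stand-in for the two-constructor design described above.
class TypedUrlFilter {
    private final String wantedType;

    // Default constructor: keep only text/html URLs.
    TypedUrlFilter() {
        this("text/html");
    }

    // Second constructor: the caller picks the type,
    // e.g. "image/png" or the prefix "image/" for any image.
    TypedUrlFilter(String wantedType) {
        this.wantedType = wantedType.toLowerCase(Locale.ROOT);
    }

    // True when a Content-Type header value starts with the wanted type,
    // so "text/html; charset=UTF-8" still matches "text/html".
    boolean matches(String contentTypeHeader) {
        return contentTypeHeader != null
                && contentTypeHeader.trim().toLowerCase(Locale.ROOT).startsWith(wantedType);
    }

    // Opens a connection to the URL and checks the content type it reports.
    boolean accepts(String url) {
        try {
            URLConnection conn = URI.create(url).toURL().openConnection();
            return matches(conn.getContentType());
        } catch (IOException | IllegalArgumentException e) {
            return false; // unreachable or malformed URLs are skipped
        }
    }
}
```

With this sketch, `new TypedUrlFilter()` keeps HTML pages only, while `new TypedUrlFilter("image/")` keeps any URL whose reported content type is an image.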

The class also handles relative URLs: it returns them with the scheme (http) and host name prefixed.
For example, a relative URL like “/about.php” is returned as “http://hostname.domain/about.php”.
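One way to get this behaviour is `java.net.URI.resolve`, which resolves a link against the address of the page it was found on. The sketch below is only an illustration under that assumption; `RelativeUrlResolver` is a hypothetical name, and the class in this post may do the prefixing differently.

```java
import java.net.URI;

// Hypothetical helper showing relative-to-absolute URL resolution.
class RelativeUrlResolver {

    // Resolves href against the URL of the page it came from.
    // Absolute hrefs come back unchanged; unparsable input yields null.
    static String resolve(String pageUrl, String href) {
        try {
            return URI.create(pageUrl).resolve(href.trim()).toString();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // prints http://hostname.domain/about.php
        System.out.println(resolve("http://hostname.domain/index.php", "/about.php"));
    }
}
```

Resolving against the full page URL rather than just the host also covers links without a leading slash, such as “img/a.png”, which are relative to the page's directory.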

The URLFinder
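As a rough sketch of the extraction step such a class performs (this is not the URLFinder listing itself), the following pulls `href` and `src` attribute values out of an HTML string with a regular expression. The name `LinkExtractor`, the regex, and the choice to skip in-page “#” anchors and duplicates are my assumptions. A regex is a simplification; a real HTML parser would be more robust.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of link extraction; not the original URLFinder.
class LinkExtractor {

    // Matches href="..." or src='...' (case-insensitive, either quote style).
    private static final Pattern LINK = Pattern.compile(
            "(?i)\\b(?:href|src)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)')");

    // Returns the distinct link values in the order they appear in the HTML.
    static List<String> extract(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = LINK.matcher(html);
        while (m.find()) {
            // Group 1 holds double-quoted values, group 2 single-quoted ones.
            String value = (m.group(1) != null ? m.group(1) : m.group(2)).trim();
            // Skip empty values, in-page anchors, and duplicates.
            if (!value.isEmpty() && !value.startsWith("#") && !links.contains(value)) {
                links.add(value);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"/about.php\">About</a> <img SRC='logo.png'> "
                + "<a href=\"#top\">Top</a> <a href=\"/about.php\">Again</a>";
        // prints [/about.php, logo.png]
        System.out.println(extract(html));
    }
}
```

The values returned here are still raw attribute values, so relative ones such as “/about.php” would need to be resolved against the page address before use.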

Usage

This will get you all the URLs from a given web page.
