Israel Science and Technology Directory

Internet Linkcheck

Introduction to linkcheck

A hyperlink is a link attached to a keyword or an image that leads to another location when the text or image is clicked. Most hyperlinks on websites are short-lived and disappear for a variety of reasons listed at TechTarget. Clicking a hyperlink whose target has disappeared produces the well-known 404 error, a message that the page was not found. This phenomenon is known as link-rot.

To avoid link-rot within a website, website managers should regularly check the validity of the links on their sites. Many link checkers have been developed for this purpose on various operating systems. On this page I describe the installation and use of linkcheck on Linux systems. I selected linkcheck because I found it to be the fastest of the link checkers available for Linux and because it is freely available to all. Linkcheck is written in Google's Dart programming language, so the first step in installing linkcheck is to install Dart.

Installing Dart

To install Dart on Debian/Ubuntu systems enter the following in a terminal:

sudo apt-get update && sudo apt-get install dart

Note: If 'dart' is not found, you must first add the Dart repository to your system as described on the page for Dart.

Next, add the directory in which Dart places globally activated programs to the $PATH variable:

export PATH="$PATH:$HOME/.pub-cache/bin"

This command adds the directory $HOME/.pub-cache/bin to the PATH. The $PATH variable contains a colon-separated list of directories that the shell searches for a command. If the directory of a program is listed in $PATH, the user can run the program from any location without specifying its full path. A directory added to PATH by this command remains there only for the current terminal session; it is lost when the terminal is closed.
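Whether the directory is already listed in PATH can be checked before adding it. The following is a minimal sketch; the function name path_contains is my own choice and is not part of Dart or linkcheck.

```shell
# Return success if the directory given as $1 is already listed in $PATH.
# Wrapping both sides in colons avoids partial matches such as /usr/bin2.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/.pub-cache/bin"; then
    echo "pub cache bin directory is on PATH"
else
    echo "pub cache bin directory is NOT on PATH"
fi
```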

To make this change permanent, append the following line to your ~/.bashrc or ~/.zshrc file.

PATH="$PATH":"$HOME/.pub-cache/bin"

The file .bashrc is located in the HOME directory of the user. Because its name starts with a period, it is a hidden file and may not appear in directory listings. If you do not see it in your file manager, press Ctrl-H to show hidden files. The change takes effect in terminals opened after the file is saved.
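The line can also be appended from the terminal instead of a text editor. The sketch below adds a line to a file only if the file does not already contain it, so running it twice does not create duplicates. The function name append_once is hypothetical, and the example call is commented out so that pasting the block changes nothing by itself.

```shell
# Append the line given as $1 to the file given as $2, but only if the file
# does not already contain exactly that line. Creates the file if missing.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example call for bash users (single quotes keep $PATH and $HOME unexpanded):
# append_once 'PATH="$PATH":"$HOME/.pub-cache/bin"' "$HOME/.bashrc"
```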

Installing linkcheck

To install linkcheck using Dart, enter the following command:

dart pub global activate linkcheck

Verify Linkcheck installation

To verify that Linkcheck is correctly installed and accessible, type:

linkcheck --version
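If the command is not found, the usual cause is that the PATH step was skipped or has not yet taken effect in the current terminal. A small sketch for checking both programs at once is shown here; check_cmd is a hypothetical helper name, and it relies only on the standard shell built-in command -v.

```shell
# Report whether the command named in $1 can be found on the current PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# "|| true" keeps the script going so that both results are always printed.
check_cmd dart || true
check_cmd linkcheck || true
```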

Working with linkcheck in localhost

If there is a local copy of your website on your PC, served by a local web server, it is best to check links directly in localhost as shown below:

linkcheck localhost/

By default, linkcheck checks only internal links. To check links to external (remote) sites, add the -e option. To get a report of links that are redirected to a different URL, add --show-redirects as shown below.

linkcheck -e --show-redirects localhost/

You can then examine the report that linkcheck prints in the terminal and update the links that are not working.

Limiting the range of checking

The range of linkcheck checking can be limited to a directory by specifying the directory name after localhost/, for example localhost/Documents/. To prevent linkcheck from examining a file or a folder, list the name of each file or folder, one per line, in a file named skip.txt and place this file in the Home directory of the user. To have linkcheck read this file, enter the following command from the Home directory (or give the full path to skip.txt):

linkcheck -e --show-redirects --skip-file skip.txt localhost/

Note: The list in skip.txt may include regular expressions. Example: The regular expression ^.*\.zip$ tells linkcheck to skip all URLs that end in .zip.
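Before running a long check, it can help to preview which URLs a set of patterns would match. The sketch below does this with grep; it is only an approximation, because grep -E uses POSIX extended regular expressions while linkcheck uses Dart's own regular expressions, so unusual patterns may behave differently. The function name preview_skips, the /private/ pattern and the sample URLs are all made up for illustration. Note that a blank line in the pattern file makes grep match every URL.

```shell
# Print the URLs (read from standard input, one per line) that match at
# least one pattern in the skip file given as $1.
preview_skips() {
    grep -E -f "$1"
}

# Demonstration with a temporary skip file holding two patterns.
skip_file=$(mktemp)
printf '%s\n' '^.*\.zip$' '/private/' > "$skip_file"

printf '%s\n' \
    'http://localhost/files/archive.zip' \
    'http://localhost/index.html' \
    'http://localhost/private/notes.html' | preview_skips "$skip_file"

rm -f "$skip_file"
```

Of the three sample URLs, the first and the third are printed, since they match the .zip pattern and the /private/ pattern respectively.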

Redirecting linkcheck report to a file

By default, linkcheck prints its output to the terminal screen. If your website includes many files in many subdirectories, the output may be too long to analyze directly on screen. The output of linkcheck can be written to a file using the standard Linux method of output redirection. For example, to redirect linkcheck's output to a file named lc-report.txt in your /home/username directory, append the redirection operator > and the file name to the command:

linkcheck -e --show-redirects localhost/ > /home/$USER/lc-report.txt
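A plain redirection overwrites the previous report and shows nothing on screen. As a variation, the sketch below saves each report under a dated name and uses the standard tee command to display the output while saving it. The function name report_path and the lc-report-DATE.txt naming scheme are my own assumptions, not something linkcheck requires. The check is run only if linkcheck is installed.

```shell
# Build a report file name in the HOME directory that includes today's
# date, in the form lc-report-YYYY-MM-DD.txt.
report_path() {
    printf '%s/lc-report-%s.txt\n' "$HOME" "$(date +%Y-%m-%d)"
}

# 2>&1 sends error messages to the same place as normal output;
# tee shows the output on screen and writes it to the file as well.
if command -v linkcheck >/dev/null 2>&1; then
    linkcheck -e --show-redirects localhost/ 2>&1 | tee "$(report_path)"
fi
```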

Updating website directory

After checking and editing the links in your localhost directory, copy the updated files to your website using your favorite FTP software.
