Check for Broken Links on Your Website Using a Postman Collection

We are in the process of refreshing the online documentation for the Postman app. As we introduce new documentation pages, we could manually test every link to make sure it’s working, but that’s boring and time-consuming. If you can throw together a few lines of code, then you too can build a handy dandy link checker using a Postman Collection.

Let’s create a collection to automatically crawl all the pages on our website and check every link for a healthy HTTP status code.

We can do this with 2 simple requests, run together as a collection.

  1. Initialize: The first request will kick things off. Under the Tests tab, we will use the setEnvironmentVariable() method to establish some important environment variables to be used in the subsequent request.
    – “links” – an array to contain all the links to be checked
    – “index” – an index to iterate through the “links” array
    – “url” – an initial URL to start our page crawler
  2. Check URL: When you send a GET request to a webpage, the response contains an HTML representation of the page, including all the HTML anchor tags for hyperlinks on the page. We can collect, or scrape, these links and store them in the “links” array to be checked. The request then calls itself again for the next URL in the array, so we keep looping through every page and scraping every page’s links until we’ve crawled the entire site and checked every link.

So we have 2 requests, the first to set environment variables, and the second to crawl our pages and then scrape and check every link. The second request does the heavy lifting and will continue looping through every page until every link has been checked.
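
As a rough sketch of what the Initialize step sets up, the snippet below seeds the three variables. It runs against a small in-memory stand-in so you can try it outside Postman; createEnv and initialize are hypothetical names, the start URL is only an example, and storing the array as a JSON string is an assumption based on environment values being kept as strings.

```javascript
// Stand-in for Postman's environment so the sketch runs outside the app.
// In a Postman script, pm.environment offers set() and get() with this shape.
// Assumption: values are kept as strings, so everything is stringified here.
function createEnv() {
  const store = {};
  return {
    set: (key, value) => { store[key] = String(value); },
    get: (key) => store[key],
  };
}

// Seed the variables that the "Check URL" request reads on every pass.
function initialize(env, startUrl) {
  env.set('links', JSON.stringify([])); // array of links, stored as a JSON string
  env.set('index', 0);                  // position of the next link to check
  env.set('url', startUrl);             // first page to crawl
}

const env = createEnv();
initialize(env, 'https://www.postman.com/docs/');
console.log(env.get('url'), env.get('links'), env.get('index'));
```

Inside Postman you would skip createEnv and pass pm.environment to initialize, or call postman.setEnvironmentVariable() for each value as described above.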


This process illustrates 2 important capabilities of the Postman app: HTML scraping, and branching and looping.

HTML scraping

Finding all the links on a page requires scraping HTML. The Postman Sandbox supports Cheerio as a library for scraping HTML elements. Read more about using the Postman Sandbox and other libraries and utilities supported in the pre-request and test scripts sections.
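
Here is a minimal sketch of the scraping step. The helper name scrapeLinks is hypothetical; it takes a loaded Cheerio document and returns every non-empty href it finds. How you obtain Cheerio depends on your Postman version (older sandboxes exposed it as a global, newer ones load it with require), so the wiring is shown in comments as an assumption rather than as the collection's actual script.

```javascript
// Collect the href attribute of every anchor tag in a loaded Cheerio document.
// $ is the function returned by cheerio.load(html).
function scrapeLinks($) {
  const hrefs = [];
  $('a').each(function (i, el) {
    const href = $(el).attr('href');
    if (href) {
      hrefs.push(href); // skip anchors with a missing or empty href
    }
  });
  return hrefs;
}

// Assumed wiring inside a Postman test script:
//   const cheerio = require('cheerio');
//   const $ = cheerio.load(pm.response.text());
//   console.log(scrapeLinks($));
```

The hrefs come back exactly as written in the page, so relative paths and non-HTTP schemes such as mailto: still need handling before they can be requested.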

Branching and looping

The setNextRequest() method accepts a request name or id within the same collection as a parameter. Use this method to establish a workflow sequence and designate which request in the same collection to run next, instead of defaulting to the linear execution. Read more about building workflows.

In this example, we will call the same request, again and again, until all the links have been checked.
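
The bookkeeping behind that loop can be sketched in plain JavaScript. The function advance and the shape of its state object are hypothetical, not taken from the sample collection: it merges newly scraped links into the queue and picks the next URL, returning null once nothing is left.

```javascript
// One pass of the loop: merge newly scraped links, then pick what to request next.
// state: { links: string[], index: number }, where index is the next unchecked link.
function advance(state, scraped) {
  const links = state.links.slice();
  for (const link of scraped) {
    if (!links.includes(link)) {
      links.push(link); // skip links that are already queued
    }
  }
  if (state.index >= links.length) {
    return { links, index: state.index, url: null }; // nothing left to check
  }
  return { links, index: state.index + 1, url: links[state.index] };
}

// Simulate three passes without Postman.
let state = { links: [], index: 0 };
state = advance(state, ['/a', '/b']); // queues both links, next url is '/a'
state = advance(state, ['/b']);       // '/b' is already queued, next url is '/b'
state = advance(state, []);           // queue exhausted, url is null
console.log(state);

// Assumed wiring inside the "Check URL" test script:
//   if (next.url) { postman.setNextRequest('Check URL'); }
//   else          { postman.setNextRequest(null); } // null stops the run
```

Keeping the decision in a pure function like this makes it easy to test the stopping condition before wiring it to setNextRequest().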


Quickstart

Click the Run in Postman button to import the sample collection and environment template into your Postman app, and check out the collection documentation for more details. You should now see the collection in the sidebar to the left and the environment selected in the dropdown in the top right.

  1. Update environment: Click the Quick Look icon in the top right to view and edit the environment variables. This is where you can update the values to check links on your own website. In many cases, your root_url will be the same as your start_url. However, in this example, we will use https://www.postman.com/ as our root_url, and start checking links on https://www.postman.com/docs/.
  2. Open Postman Console: This step is optional. If you want to see a stream of requests and view any logged statements, go to the application menu, and select View > Show Postman Console to open the console in a separate window. Do this before you send any requests or run the collection.
  3. Run collection: Click the right angle bracket (>) to expand the collection details view. Click the Run button to open the collection runner in a separate window.
    Verify that your collection and environment are selected in the respective dropdowns, and click Start Run to begin running your collection.
    You should now see your tests running and passing, crawling all the links until there are no more links to check.
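
The split between root_url and start_url suggests one way to keep the crawl on your own site: check every link, but only scrape pages that live under root_url. The sketch below is an assumption about how the two variables could be used, not a description of the sample collection's script. resolveLink and shouldScrape are hypothetical helpers, and the code relies on the standard URL class found in Node.js and browsers; whether it is available in your Postman version is something to verify.

```javascript
// Hypothetical helper: resolve an href against the page it was found on.
// Returns null for hrefs that cannot be requested over HTTP or HTTPS.
function resolveLink(href, pageUrl) {
  let resolved;
  try {
    resolved = new URL(href, pageUrl);
  } catch (err) {
    return null; // malformed href
  }
  if (resolved.protocol !== 'http:' && resolved.protocol !== 'https:') {
    return null; // mailto:, tel:, javascript: and similar schemes
  }
  resolved.hash = ''; // a fragment points at the same page
  return resolved.href;
}

// Hypothetical rule: scrape a page for more links only if it sits under root_url.
function shouldScrape(pageUrl, rootUrl) {
  return pageUrl.startsWith(rootUrl);
}

const root = 'https://www.postman.com/';
console.log(resolveLink('/docs/', root));
console.log(shouldScrape('https://www.postman.com/docs/', root));
```

With a rule like this, a link that leaves your site is still requested once to confirm it is healthy, but its page is never scraped for further links.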

This example of traversing links on a page is similar to how you can use Postman with Hypermedia APIs. Rather than requiring you to know the specification up front, a Hypermedia API response tells you which links to follow next, and you can use environment variables and conditional logic to loop through the data in a nonlinear fashion.

 

12 thoughts on “Check for Broken Links on Your Website Using a Postman Collection”

    Getting error when trying to run collection:
    cheerio is not defined
    Postman for Chrome
    Version 5.0.2
    linux / x86-64
    Chrome 59.0.3071.115

      Give it a try on the Postman native app.

    Hi Joyce,

    I imported this sample collection, and when I run it, it displays this error: “there was an error in evaluating the test script: cheerio is not defined postman”. Please help me resolve this problem.

    Many thanks,

      This might not work if you’re using the Postman Chrome app. Give it a try on the Postman native app.

        Many thanks Joyce!

    Hi Joyce, Thanks for sharing the Link Checker, it can be really useful. I’m able to run the script and can see results in the console but I have some links that include Null and some that go outside of my application. These links fail. I would like to be able to ignore some of the links that are captured. Is that possible?
    Thanks
    Tim

    Great post Joyce! Thank you so much. My only catch now is getting past pesky OAuth! Very grateful for your post, made it easy to get a link checker started.

    Andrew

    The “Run in Postman” link seems to be broken. Despite that, a very helpful article. 🙂

      Thank you! I “checked the link”, and it works for me. Both the Run in Postman button and the link to collection documentation.

    Hi Joyce,
    This is just perfect!
    Looking forward to some improvements to your collection. Is it possible to add a new array dimension so we can see where a particular broken link comes from?