Check for Broken Links on Your Website Using a Postman Collection
We are in the process of refreshing the online documentation for the Postman app. As we introduce new documentation pages, we could manually test every link to make sure it’s working, but that’s boring and time-consuming. If you can throw together a few lines of code, then you too can build a handy dandy link checker using a Postman Collection.
Let’s create a collection to automatically crawl all the pages on our website and check every link for a healthy HTTP status code.
We can do this with 2 simple requests, run together as a collection.
- Initialize: The first request will kick things off. Under the Tests tab, we will use the setEnvironmentVariable() method to establish some important environment variables to be used in the subsequent request.
– “links” – an array to contain all the links to be checked
– “index” – an index to iterate through the “links” array
– “url” – an initial URL to start our page crawler
- Check URL: When you send a GET request to a webpage, an HTML representation of the page will be returned, including all the HTML anchor tags for hyperlinks on the page. We can collect, or scrape, these links, and store them in an array to be checked. And so on, and so forth, we can continue looping through every page and scraping every page’s links until we’ve crawled the entire site and checked every link.
So we have 2 requests, the first to set environment variables, and the second to crawl our pages and then scrape and check every link. The second request does the heavy lifting and will continue looping through every page until every link has been checked.
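As a rough sketch, the Tests script of the Initialize request could look something like the following. The postman object defined at the top is only a stand-in so the snippet runs in plain Node.js; inside Postman the sandbox provides it. The variable names follow the list above, the start URL is the one used in the Quickstart section, and the initial index of 0 is an assumption. Environment variables are treated as strings here, so the array is serialized with JSON.stringify().

```javascript
// Stand-in for the Postman sandbox's `postman` object (an assumption made so
// this sketch runs in plain Node.js). Inside Postman, remove this stub.
const environment = {};
const postman = {
  setEnvironmentVariable: (key, value) => { environment[key] = String(value); },
  getEnvironmentVariable: (key) => environment[key],
};

// Hypothetical starting page, borrowed from the Quickstart example.
const startUrl = "https://www.postman.com/docs/";

// "links": every link still to be checked, stored as a JSON string.
postman.setEnvironmentVariable("links", JSON.stringify([]));
// "index": position of the next link to check (starting at 0 is an assumption).
postman.setEnvironmentVariable("index", 0);
// "url": the first page the Check URL request will fetch.
postman.setEnvironmentVariable("url", startUrl);

console.log(environment);
```

The Check URL request can then read these values back, parse the links array with JSON.parse(), and update all three variables as it goes.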
This process illustrates 2 important capabilities of the Postman app: HTML scraping and branching and looping.
HTML scraping
Finding all the links on a page requires scraping HTML. The Postman Sandbox supports Cheerio as a library for scraping HTML elements. Read more about using the Postman Sandbox and other libraries and utilities supported in the pre-request and test scripts sections.
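To make the idea concrete, here is a minimal sketch of collecting the href value of every anchor tag in a page. It deliberately does not use Cheerio: a simple regular expression is swapped in so the snippet runs in plain Node.js without installing anything. In a Postman test script you would more likely load the response body with cheerio.load() and read attr('href') from each element matched by $('a'). The extractLinks() helper and the sample HTML are hypothetical.

```javascript
// Dependency-free stand-in for Cheerio: pull href values out of anchor tags.
// A regex is fragile on real-world HTML; it is used here only so the sketch
// runs as-is. Prefer a real HTML parser such as Cheerio in practice.
function extractLinks(html) {
  const links = [];
  // Match <a ... href="..."> or <a ... href='...'> and capture the value.
  const anchorPattern = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  let match;
  while ((match = anchorPattern.exec(html)) !== null) {
    links.push(match[2]);
  }
  return links;
}

// Hypothetical page fragment used only for illustration.
const sampleHtml = `
  <p><a href="/docs/">Docs</a> and <a class="ext" href='https://example.com/'>elsewhere</a></p>
  <a name="top">no href here</a>
`;

console.log(extractLinks(sampleHtml));
```

Anchors without an href attribute are skipped, which helps keep empty or null entries out of the list of links to check.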
Branching and looping
The setNextRequest() method accepts a request name or id within the same collection as a parameter. Use this method to establish a workflow sequence and designate which request in the same collection to run next, instead of defaulting to the linear execution. Read more about building workflows.
In this example, we will call the same request, again and again, until all the links have been checked.
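The loop can be sketched roughly as follows. The postman object and the small while loop that plays the role of the collection runner are stand-ins so the snippet runs in plain Node.js; the request name "Check URL" and the sample links are assumptions. The important part is the branch inside checkUrlTestScript(): call the same request again while links remain, and pass null to setNextRequest() to stop the run.

```javascript
// Stand-in for the Postman sandbox so the sketch runs in plain Node.js.
// Inside Postman, the sandbox provides `postman` and the collection runner
// performs the looping.
const environment = { links: JSON.stringify(["/a", "/b", "/c"]), index: "0" };
let nextRequest = "Check URL";
const postman = {
  getEnvironmentVariable: (key) => environment[key],
  setEnvironmentVariable: (key, value) => { environment[key] = String(value); },
  setNextRequest: (name) => { nextRequest = name; },
};

// Body of a hypothetical "Check URL" test script.
function checkUrlTestScript() {
  const links = JSON.parse(postman.getEnvironmentVariable("links"));
  const index = parseInt(postman.getEnvironmentVariable("index"), 10);

  if (index < links.length) {
    // More links to check: point "url" at the next one and run this request again.
    postman.setEnvironmentVariable("url", links[index]);
    postman.setEnvironmentVariable("index", index + 1);
    postman.setNextRequest("Check URL");
  } else {
    // Nothing left to check: passing null stops the collection run.
    postman.setNextRequest(null);
  }
}

// Minimal simulation of the collection runner.
const visited = [];
while (nextRequest === "Check URL") {
  checkUrlTestScript();
  if (nextRequest === "Check URL") visited.push(environment.url);
}

console.log(visited);
```

Without the else branch, the runner would fall back to linear execution once the script stopped naming a next request, so the explicit stop keeps the intent clear.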
Quickstart
Click the Run in Postman button to import the sample collection and environment template into your Postman app, and check out the collection documentation for more details. You should now see the collection in the sidebar to the left and the environment selected in the dropdown in the top right.
- Update environment: Click the Quick Look icon in the top right to view and edit the environment variables. This is where you can update the values to check links on your own website. In many cases, your root_url will be the same as your start_url. However, in this example, we will use https://www.postman.com/ as our root_url, and start checking links on https://www.postman.com/docs/.
- Open Postman Console: This step is optional. If you want to see a stream of requests and view any logged statements, go to the application menu, and select View > Show Postman Console to open the console in a separate window. Do this before you send any requests or run the collection.
- Run collection: Click the right angle bracket (>) to expand the collection details view. Click the Run button to open the collection runner in a separate window.
Verify that your collection and environment are selected in the respective dropdowns, and click Start Run to begin running your collection.
You should now see your tests running and passing, crawling all the links until there are no more links to check.
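One way to think about the two environment values is that start_url is where crawling begins, while root_url decides which pages are crawled further and which links are only checked. The sketch below illustrates that idea in plain JavaScript using the standard URL class; it is an assumption about how such filtering could work, not a description of the sample collection's actual script. The resolveLink() and isInternal() helpers and the sample hrefs are hypothetical.

```javascript
// Values from the Quickstart example.
const rootUrl = "https://www.postman.com/";
const startUrl = "https://www.postman.com/docs/";

// Resolve an href against the page it was found on.
// Returns null for hrefs that are missing, malformed, or not HTTP(S).
function resolveLink(href, pageUrl) {
  if (!href) return null;
  let resolved;
  try {
    resolved = new URL(href, pageUrl);
  } catch (err) {
    return null; // malformed URL
  }
  if (resolved.protocol !== "http:" && resolved.protocol !== "https:") return null;
  resolved.hash = ""; // a fragment points at the same page
  return resolved.href;
}

// Only pages under rootUrl are crawled for more links.
function isInternal(url) {
  return url.startsWith(rootUrl);
}

// Hypothetical hrefs scraped from the start page.
const hrefs = ["/pricing/", "writing-tests/", "#top", "mailto:help@example.com", undefined, "https://example.com/"];
for (const href of hrefs) {
  const url = resolveLink(href, startUrl);
  const action = url ? (isInternal(url) ? "crawl" : "check only") : "skip";
  console.log(String(href), "->", url, `(${action})`);
}
```

Skipping null results and treating external links as check-only keeps the crawl from wandering off your own site.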
This example of traversing links on a page is similar to how you can use Postman with Hypermedia APIs. Rather than knowing the specifications up front, a Hypermedia API response can provide guidance for the next links to check, using environment variables and conditional logic to loop through the data in a nonlinear fashion.
Getting error when trying to run collection:
cheerio is not defined
Postman for Chrome
Version 5.0.2
linux / x86-64
Chrome 59.0.3071.115
Give it a try on the Postman native app.
Hi Joyce,
I imported this sample collection, and when I run it, it displays this error: “there was an error in evaluating the test script: cheerio is not defined postman”. Could you please help me resolve this problem?
Many thanks,
This might not work if you’re using the Postman Chrome app. Give it a try on the Postman native app.
Many thanks Joyce!
Hi Joyce, Thanks for sharing the Link Checker, it can be really useful. I’m able to run the script and can see results in the console but I have some links that include Null and some that go outside of my application. These links fail. I would like to be able to ignore some of the links that are captured. Is that possible?
Thanks
Tim
Great post Joyce! Thank you so much. My only catch now is getting past pesky OAuth! Very grateful for your post, made it easy to get a link checker started.
Andrew
The “Run in Postman” link seems to be broken. Despite that, a very helpful article. 🙂
Thank you! I “checked the link”, and it works for me. Both the Run in Postman button and the link to collection documentation.
Hi Joyce,
This is just perfect!
Looking forward to some improvements to your collection. Is it possible to add another array dimension so we can see which page a particular broken link comes from?
Hello, I modified the collection a little. Feel free to check it out if you are interested:
https://github.com/OlegKorn/test_tasks/tree/main/Postman%20snippets/postman%20-%20check%20all%20nested%20links%20of%20a%20site