This is a GitHub Action to check for broken links in your static files or web pages. It uses muffet for the URL checking task.
See the basic GitHub Action example to run periodic checks (weekly) against mkdocs.org:
on:
  schedule:
    - cron: '0 0 * * 0'
name: Check markdown links
jobs:
  my-broken-link-checker:
    name: Check broken links
    runs-on: ubuntu-latest
    steps:
      - name: Check for broken links
        uses: ruzickap/action-my-broken-link-checker@v2
        with:
          url: https://www.mkdocs.org
          cmd_params: "--one-page-only --max-connections=3 --color=always" # Check just one page
This action can be combined with Static Site Generators (Hugo, MkDocs, Gatsby, GitBook, mdBook, etc.). The following examples expect the web pages to be stored in the ./build directory. A caddy web server is started during the tests, using the hostname from the URL parameter and serving the web pages (see details in entrypoint.sh).
- name: Check for broken links
  uses: ruzickap/action-my-broken-link-checker@v2
  with:
    url: https://www.example.com/test123
    pages_path: ./build/
    cmd_params: '--buffer-size=8192 --max-connections=10 --color=always --skip-tls-verification --header="User-Agent:curl/7.54.0" --timeout=20' # muffet parameters
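The hostname used for the local web server comes from the URL parameter. The following is a minimal sketch of how a hostname could be extracted from such a URL with plain shell parameter expansion. The function name `url_host` is hypothetical and this is not necessarily how entrypoint.sh does it; it also ignores userinfo and IPv6 literals.

```shell
# Hypothetical helper (not taken from entrypoint.sh): print the host part of a URL.
url_host() {
  host="${1#*://}"     # drop the scheme, e.g. "https://"
  host="${host%%/*}"   # drop the path, e.g. "/test123"
  host="${host%%:*}"   # drop an optional port, e.g. ":443"
  printf '%s\n' "$host"
}

url_host "https://www.example.com/test123"    # prints www.example.com
url_host "https://my-testing-domain.com:443"  # prints my-testing-domain.com
```

With a URL such as https://www.example.com/test123, the pages in ./build/ would then be served under the host name www.example.com.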
Do you want to skip the Docker build step? OK, script mode is also available:
- name: Check for broken links
  env:
    INPUT_URL: https://www.example.com/test123
    INPUT_PAGES_PATH: ./build/
    INPUT_CMD_PARAMS: '--buffer-size=8192 --max-connections=10 --color=always --header="User-Agent:curl/7.54.0" --skip-tls-verification' # --skip-tls-verification is a mandatory parameter when using https together with "PAGES_PATH"
  run: wget -qO- https://raw.githubusercontent.com/ruzickap/action-my-broken-link-checker/v2/entrypoint.sh | bash
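The comment in the step above says that `--skip-tls-verification` is required whenever an https URL is combined with a local pages path. A small pre-flight check along these lines could catch a missing flag before the checker runs. The function `needs_skip_tls` is a hypothetical helper for illustration and is not part of entrypoint.sh.

```shell
# Hypothetical pre-flight check (not part of entrypoint.sh).
# $1 = URL, $2 = pages path (may be empty), $3 = muffet parameters.
# Returns 0 only when the flag is required but missing.
needs_skip_tls() {
  case "$1" in
    https://*) ;;        # only https URLs are affected
    *) return 1 ;;
  esac
  [ -n "$2" ] || return 1   # no local pages path, so the rule does not apply
  case "$3" in
    *--skip-tls-verification*) return 1 ;;   # flag already present
  esac
  return 0
}

if needs_skip_tls "https://www.example.com/test123" "./build/" "--color=always"; then
  echo "add --skip-tls-verification to INPUT_CMD_PARAMS"
fi
```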
Environment variables used by the ./entrypoint.sh script:
Variable | Default | Description |
---|---|---|
INPUT_CMD_PARAMS | --buffer-size=8192 --max-connections=10 --color=always --verbose | Command-line parameters for the URL checker muffet |
INPUT_DEBUG | false | Enable debug mode for the ./entrypoint.sh script (set -x) |
INPUT_PAGES_PATH | | Relative path to the directory with local web pages |
INPUT_URL | (Mandatory / Required) | URL that will be checked |
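As an illustration of the table above, defaults like these can be applied in a shell script with the `${VAR:-default}` and `${VAR:?message}` expansions. This is only a sketch: the default values are copied from the table, while the function name `apply_defaults` and the use of these particular expansions are assumptions, not a copy of entrypoint.sh.

```shell
# Sketch only: defaults come from the table, the mechanism is an assumption.
apply_defaults() {
  # Abort with a message when the mandatory variable is unset or empty.
  : "${INPUT_URL:?INPUT_URL is required}"
  # Keep a value supplied by the caller, otherwise fall back to the default.
  INPUT_CMD_PARAMS="${INPUT_CMD_PARAMS:---buffer-size=8192 --max-connections=10 --color=always --verbose}"
  INPUT_DEBUG="${INPUT_DEBUG:-false}"
  INPUT_PAGES_PATH="${INPUT_PAGES_PATH:-}"
}

INPUT_URL="https://www.example.com/test123"
apply_defaults
echo "URL: ${INPUT_URL}, debug: ${INPUT_DEBUG}"
```

Only INPUT_URL has to be set by the caller; every other variable keeps a caller-provided value or falls back to its default.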
Pipeline for periodic link checking:
name: periodic-broken-link-checks
on:
  workflow_dispatch:
  push:
    paths:
      - .github/workflows/periodic-broken-link-checks.yml
  schedule:
    - cron: '3 3 * * 3'
jobs:
  broken-link-checker:
    runs-on: ubuntu-latest
    steps:
      - name: Setup Pages
        id: pages
        uses: actions/configure-pages@v3
      - name: Check for broken links
        uses: ruzickap/action-my-broken-link-checker@v2
        with:
          url: ${{ steps.pages.outputs.base_url }}
          cmd_params: '--buffer-size=8192 --max-connections=10 --color=always --header="User-Agent:curl/7.54.0" --timeout=20'
GitHub Action example:
name: Checks
on:
  push:
    branches:
      - main
jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Create web page
        run: |
          mkdir -v public
          cat > public/index.html << EOF
          <!DOCTYPE html>
          <html>
          <head>
          <title>My page, which will be stored on the my-testing-domain.com domain</title>
          </head>
          <body>
          Links:
          <ul>
          <li><a href="https://my-testing-domain.com">https://my-testing-domain.com</a></li>
          <li><a href="https://my-testing-domain.com:443">https://my-testing-domain.com:443</a></li>
          </ul>
          </body>
          </html>
          EOF
      - name: Check links using script
        env:
          INPUT_URL: https://my-testing-domain.com
          INPUT_PAGES_PATH: ./public/
          INPUT_CMD_PARAMS: '--skip-tls-verification --verbose --color=always'
          INPUT_DEBUG: true
        run: wget -qO- https://raw.githubusercontent.com/ruzickap/action-my-broken-link-checker/v2/entrypoint.sh | bash
      - name: Check links using container
        uses: ruzickap/action-my-broken-link-checker@v2
        with:
          url: https://my-testing-domain.com
          pages_path: ./public/
          cmd_params: '--skip-tls-verification --verbose --color=always'
          debug: true
Let's try to automate the creation of web pages as much as possible.
The ideal situation requires a repository naming convention where the name of the GitHub repository matches the URL where it will be hosted.
The mandatory part is the repository name awsug.cz, which is the same as the domain:
- Repository name: awsugcz/awsug.cz -> Web pages: https://awsug.cz
The web pages will be stored as GitHub Pages on their own domain.
The GitHub Action file may look like:
name: hugo-build
on:
  pull_request:
    types: [opened, synchronize]
  push:
jobs:
  hugo-build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Checkout submodules
        shell: bash
        run: |
          auth_header="$(git config --local --get http.https://github.com/.extraheader)"
          git submodule sync --recursive
          git -c "http.extraheader=$auth_header" -c protocol.version=2 submodule update --init --force --recursive --depth=1
      - name: Setup Hugo
        uses: peaceiris/actions-hugo@v2
        with:
          hugo-version: '0.62.0'
      - name: Build
        run: |
          hugo --gc
          cp LICENSE README.md public/
          echo "${{ github.event.repository.name }}" > public/CNAME
      - name: Check for broken links
        env:
          INPUT_URL: https://${{ github.event.repository.name }}
          INPUT_PAGES_PATH: public
          INPUT_CMD_PARAMS: '--verbose --buffer-size=8192 --max-connections=10 --color=always --skip-tls-verification --exclude="(mylabs.dev|linkedin.com)"'
        run: |
          wget -qO- https://raw.githubusercontent.com/ruzickap/action-my-broken-link-checker/v2/entrypoint.sh | bash
      - name: Check links using container
        uses: ruzickap/action-my-broken-link-checker@v2
        with:
          url: https://my-testing-domain.com
          pages_path: ./public/
          cmd_params: '--verbose --buffer-size=8192 --max-connections=10 --color=always --skip-tls-verification --header="User-Agent:curl/7.54.0" --exclude="(mylabs.dev|linkedin.com)"'
          debug: true
      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        # Deploy only on pushes to main. The condition is written as one bare
        # expression: text following a ${{ }} block would turn it into a
        # non-empty string, which is always truthy.
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        # actions-gh-pages v3 reads its settings from inputs, not from env.
        with:
          deploy_key: ${{ secrets.ACTIONS_DEPLOY_KEY }}
          publish_branch: gh-pages
          publish_dir: public
          force_orphan: true
The example is using Hugo.
GitHub Pages with github.io domain
The mandatory part is the repository name k8s-harbor, which is the directory part at the end of ruzickap.github.io:
- Repository name: ruzickap/k8s-harbor -> Web pages: https://ruzickap.github.io/k8s-harbor
In this example, the web pages will use GitHub's domain github.io.
name: vuepress-build-check-deploy
on:
  pull_request:
    types: [opened, synchronize]
    paths:
      - .github/workflows/vuepress-build-check-deploy.yml
      - docs/**
      - package.json
      - package-lock.json
  push:
    paths:
      - .github/workflows/vuepress-build-check-deploy.yml
      - docs/**
      - package.json
      - package-lock.json
jobs:
  vuepress-build-check-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install Node.js 12
        uses: actions/setup-node@v1
        with:
          node-version: 12.x
      - name: Install VuePress and build the document
        run: |
          npm install
          npm run build
          cp LICENSE docs/.vuepress/dist
          sed -e "s@(part-@(https://github.com/${GITHUB_REPOSITORY}/tree/main/docs/part-@" -e 's@.\/.vuepress\/public\/@./@' docs/README.md > docs/.vuepress/dist/README.md
          ln -s docs/.vuepress/dist ${{ github.event.repository.name }}
      - name: Check for broken links
        uses: ruzickap/action-my-broken-link-checker@v2
        with:
          url: https://${{ github.repository_owner }}.github.io/${{ github.event.repository.name }}
          pages_path: .
          cmd_params: '--exclude=mylabs.dev --max-connections-per-host=5 --rate-limit=5 --timeout=20 --header="User-Agent:curl/7.54.0" --skip-tls-verification'
      - name: Deploy
        uses: peaceiris/actions-gh-pages@v3
        # Deploy only on pushes to main. The condition is written as one bare
        # expression: text following a ${{ }} block would turn it into a
        # non-empty string, which is always truthy.
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        # actions-gh-pages v3 reads its settings from inputs, not from env.
        with:
          deploy_key: ${{ secrets.ACTIONS_DEPLOY_KEY }}
          publish_branch: gh-pages
          publish_dir: ./docs/.vuepress/dist
          force_orphan: true
In this case I'm using VuePress to create my page.
Both examples can be used as a generic template, and you do not need to change them for your projects.
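Both workflows stay generic because the checked URL is derived from the repository name rather than hard-coded. The two naming conventions can be sketched in shell as follows, assuming an "owner/name" string such as the one GitHub Actions provides in `GITHUB_REPOSITORY`. The function names are hypothetical and only serve the illustration.

```shell
# Hypothetical helpers; the argument has the form "owner/name".

# Custom domain: the repository name is the domain itself.
custom_domain_url() {
  printf 'https://%s\n' "${1#*/}"
}

# github.io: the repository name is the path under <owner>.github.io.
github_io_url() {
  printf 'https://%s.github.io/%s\n' "${1%%/*}" "${1#*/}"
}

custom_domain_url "awsugcz/awsug.cz"   # prints https://awsug.cz
github_io_url "ruzickap/k8s-harbor"    # prints https://ruzickap.github.io/k8s-harbor
```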
It's possible to use the checking script locally. It will install Caddy and Muffet binaries if they are not already installed on your system.
export INPUT_URL="https://debian.cz/info/"
export INPUT_CMD_PARAMS="--buffer-size=8192 --ignore-fragments --one-page-only --max-connections=10 --color=always --verbose"
./entrypoint.sh
Output:
*** INFO: [2024-01-26 05:12:20] Start checking: "https://www.mkdocs.org"
https://www.mkdocs.org/
200 https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/highlight.min.js
200 https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/languages/django.min.js
200 https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/languages/yaml.min.js
200 https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.8.0/styles/github.min.css
200 https://github.com/mkdocs/catalog#-theming
200 https://github.com/mkdocs/mkdocs/blob/master/docs/index.md
200 https://github.com/mkdocs/mkdocs/wiki/MkDocs-Themes
200 https://twitter.com/starletdreaming
200 https://www.googletagmanager.com/gtag/js?id=G-274394082
200 https://www.mkdocs.org/
200 https://www.mkdocs.org/#mkdocs
200 https://www.mkdocs.org/about/contributing/
200 https://www.mkdocs.org/about/license/
200 https://www.mkdocs.org/about/release-notes/
200 https://www.mkdocs.org/about/release-notes/#maintenance-team
200 https://www.mkdocs.org/assets/_mkdocstrings.css
200 https://www.mkdocs.org/css/base.css
200 https://www.mkdocs.org/css/bootstrap.min.css
200 https://www.mkdocs.org/css/extra.css
200 https://www.mkdocs.org/css/font-awesome.min.css
200 https://www.mkdocs.org/dev-guide/
200 https://www.mkdocs.org/dev-guide/api/
200 https://www.mkdocs.org/dev-guide/plugins/
200 https://www.mkdocs.org/dev-guide/themes/
200 https://www.mkdocs.org/dev-guide/translations/
200 https://www.mkdocs.org/getting-started/
200 https://www.mkdocs.org/img/favicon.ico
200 https://www.mkdocs.org/js/base.js
200 https://www.mkdocs.org/js/bootstrap.min.js
200 https://www.mkdocs.org/js/jquery-3.6.0.min.js
200 https://www.mkdocs.org/search/main.js
200 https://www.mkdocs.org/user-guide/
200 https://www.mkdocs.org/user-guide/choosing-your-theme
200 https://www.mkdocs.org/user-guide/choosing-your-theme/
200 https://www.mkdocs.org/user-guide/choosing-your-theme/#mkdocs
200 https://www.mkdocs.org/user-guide/choosing-your-theme/#readthedocs
200 https://www.mkdocs.org/user-guide/cli/
200 https://www.mkdocs.org/user-guide/configuration/
200 https://www.mkdocs.org/user-guide/configuration/#markdown_extensions
200 https://www.mkdocs.org/user-guide/configuration/#plugins
200 https://www.mkdocs.org/user-guide/customizing-your-theme/
200 https://www.mkdocs.org/user-guide/deploying-your-docs/
200 https://www.mkdocs.org/user-guide/installation/
200 https://www.mkdocs.org/user-guide/localizing-your-theme/
200 https://www.mkdocs.org/user-guide/writing-your-docs/
*** INFO: [2024-01-26 05:12:21] Checks completed...
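The local run relies on the caddy and muffet binaries, which the script installs when they are missing. To see in advance which of them are already available, a check along these lines may help. The function `missing_tools` is a hypothetical helper and not part of entrypoint.sh.

```shell
# Hypothetical helper: print every named command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Lists whichever of the two binaries are not installed yet (possibly nothing).
missing_tools caddy muffet
```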
Another example is checking a web page stored locally on your disk. In this case, I'm using the web page created in the ./tests/ directory from this Git repository:
export INPUT_URL="https://my-testing-domain.com"
export INPUT_PAGES_PATH="${PWD}/tests/"
export INPUT_CMD_PARAMS="--skip-tls-verification --verbose --color=always"
./entrypoint.sh
Output:
*** INFO: Using path "/home/pruzicka/git/action-my-broken-link-checker/tests/" as domain "my-testing-domain.com" with URI "https://my-testing-domain.com"
*** INFO: [2019-12-30 14:54:22] Start checking: "https://my-testing-domain.com"
https://my-testing-domain.com/
200 https://my-testing-domain.com
200 https://my-testing-domain.com/run_tests.sh
200 https://my-testing-domain.com:443
200 https://my-testing-domain.com:443/run_tests.sh
https://my-testing-domain.com:443/
200 https://my-testing-domain.com
200 https://my-testing-domain.com/run_tests.sh
200 https://my-testing-domain.com:443
200 https://my-testing-domain.com:443/run_tests.sh
*** INFO: [2019-12-30 14:54:22] Checks completed...
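With `--verbose`, every checked link is printed with its HTTP status, as in the output above. When such output is saved to a file, the lines with a status other than 200 can be picked out with a short filter. This is a sketch that assumes the "status, space, URL" layout shown above; the function name `non_200_lines` and the 404 sample line are made up for the illustration, and lines that report an error message instead of a numeric status are not covered.

```shell
# Hypothetical filter: read checker output on stdin and print only the lines
# whose first field is a numeric status other than 200.
non_200_lines() {
  awk '$1 ~ /^[0-9]+$/ && $1 != 200 { print }'
}

# The 404 line below is invented sample data, not real checker output.
printf '%s\n' \
  'https://my-testing-domain.com/' \
  '200 https://my-testing-domain.com' \
  '404 https://my-testing-domain.com/missing' | non_200_lines
```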
Some other examples of building and checking web pages using Static Site Generators and GitHub Actions can be found here: https://github.com/peaceiris/actions-gh-pages/.
The following links contain real examples of My Broken Link Checker:
- Static page generated by Hugo with checked links: hugo-build.yml.
- Static page generated by VuePress with checked links: vuepress-build-check-deploy.yml.
- Periodic link checks of the xvx.cz website: periodic-broken-link-checks.yml.