Skip to content

Use HTML Tag Processor to audit blocking scripts & styles in Site Health’s enqueued-assets test #2059

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 16 commits into
base: trunk
Choose a base branch
from

Conversation

b1ink0
Copy link
Contributor

@b1ink0 b1ink0 commented Jun 18, 2025

Summary

Fixes #2030

Relevant technical choices

This PR overhauls the Site Health "Enqueued Scripts" and "Enqueued Styles" tests to accurately detect and report only truly render-blocking scripts and styles, whether loaded from external files or defined inline. It achieves this by performing an unauthenticated loopback request and analyzing the resulting front-end HTML with the WP_HTML_Tag_Processor.

@b1ink0 b1ink0 added this to the performance-lab n.e.x.t milestone Jun 18, 2025
@b1ink0 b1ink0 requested a review from westonruter June 18, 2025 19:27
@b1ink0 b1ink0 added [Type] Bug An existing feature is broken [Plugin] Performance Lab Issue relates to work in the Performance Lab Plugin only labels Jun 18, 2025
Comment on lines 114 to 117
$path = perflab_aea_get_path_from_resource_url( $href );
if ( '' === $path ) {
continue;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, our audit only scans assets located in the WordPress content directory. This raises an question,
how should the audit treat scripts and styles served from a CDN? Should these third-party resources be included in the render-blocking report, excluded entirely, or perhaps flagged separately so we can distinguish external blocking assets from local ones?

I think we’ll also need to consider how to measure their sizes efficiently. One approach could be, sending an HTTP HEAD request to the CDN-hosted URL to check for a Content‑Length header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a performance perspective, CDN-served assets could be even worse for performance since they require a new TCP connection, now that browsers don't reuse cached resources across origins. So we definitely should be including them in the render-blocking report.

Sending an HTTP HEAD request for all resources regardless of whether they are on the same origin or not makes sense to me. If the request returns in a 404 then this would be important to report as well.

Once we have the report, then a future enhancement would be digging in to find the theme/plugin responsible for the resource being added in the first place. The AMP plugin implements a lot of this, and it was getting extracted a separate package via https://github.com/GoogleChromeLabs/wp-origination but that effort got stalled and was abandoned. See also #1095.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider a case where a HEAD request does not return a content length due to server configuration? If so, should we then make a GET request?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, but that might be overkill.

Comment on lines 110 to 112
if ( ! is_string( $href ) || false !== strpos( $href, 'wp-includes' ) ) {
continue;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We currently skip any URL containing wp-includes (i.e. core assets). Since the goal is to surface all render-blocking resources, should we remove that exclusion and include core scripts and styles in the audit as well?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this exclusion should be removed, yes.

Comment on lines 80 to 86
// Process blocking inline scripts.
if ( ! is_string( $src ) ) {
$script_size = mb_strlen( $processor->get_modifiable_text(), '8bit' );
if ( false !== $script_size ) {
$assets['scripts'][] = array(
'src' => 'inline',
'size' => $script_size,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently each inline SCRIPT is reported as its own entry, do we want to continue treating every inline script separately, or would it make sense to aggregate inline scripts into a single summary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think inline scripts need to be counted since the render blocking is not significant compared to blocking external scripts.

Copy link

codecov bot commented Jun 20, 2025

Codecov Report

Attention: Patch coverage is 90.32258% with 9 lines in your changes missing coverage. Please review.

Project coverage is 67.75%. Comparing base (ff5cc68) to head (e58e451).
Report is 29 commits behind head on trunk.

Files with missing lines Patch % Lines
...cludes/site-health/audit-enqueued-assets/hooks.php 91.07% 5 Missing ⚠️
...ludes/site-health/audit-enqueued-assets/helper.php 89.18% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            trunk    #2059      +/-   ##
==========================================
- Coverage   68.03%   67.75%   -0.29%     
==========================================
  Files          92       93       +1     
  Lines        7631     7688      +57     
==========================================
+ Hits         5192     5209      +17     
- Misses       2439     2479      +40     
Flag Coverage Δ
multisite 67.75% <90.32%> (-0.29%) ⬇️
single 36.96% <90.32%> (-0.06%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

if (
! is_admin() ||
! current_user_can( 'view_site_health_checks' ) ||
( false !== get_transient( 'aea_enqueued_front_page_scripts' ) && false !== get_transient( 'aea_enqueued_front_page_styles' ) )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to combine these two transients into one aea_blocking_assets, or something like that

Comment on lines +297 to +299
if ( false !== strpos( $resource_url, '?' ) ) {
$resource_url = substr( $resource_url, 0, strpos( $resource_url, '?' ) );
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ( false !== strpos( $resource_url, '?' ) ) {
$resource_url = substr( $resource_url, 0, strpos( $resource_url, '?' ) );
}
$resource_url = preg_replace( '/\?.*/', '', $resource_url );

Nevertheless, the perflab_aea_get_path_from_resource_url() function isn't being used now anymore, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes its not being used anymore I just put comment about it here #2059 (comment).

Comment on lines +296 to +298
// Remove query string if present.
if ( false !== strpos( $resource_url, '?' ) ) {
$resource_url = substr( $resource_url, 0, strpos( $resource_url, '?' ) );
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to remove the function perflab_aea_get_path_from_resource_url entirely, as a HEAD request is now made for local assets as well?

/**
* Convert full URL paths to absolute paths.
* Covers Standard WP configuration, wp-content outside WP directories and subdirectories.
* Ex: https://example.com/content/themes/, https://example.com/wp/wp-includes/
*
* @since 1.0.0
*
* @param string $resource_url URl resource link.
* @return string Returns absolute path to the resource.
*/
function perflab_aea_get_path_from_resource_url( string $resource_url ): string {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the function can be removed.

Comment on lines +33 to +36
'headers' => array(
'Accept' => 'text/html',
'Cache-Control' => 'no-cache',
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the site has HTTP Basic auth, then these can be copied in the request headers. This is done in core in the plugin/theme file editors.


if ( ! $style->src || false !== strpos( $style->src, 'wp-includes' ) ) {
// Skip external script with a "type" attribute that is not JavaScript.
if ( is_string( $type ) && '' !== $type && 'text/javascript' !== strtolower( $type ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if ( is_string( $type ) && '' !== $type && 'text/javascript' !== strtolower( $type ) ) {
if ( 'text/javascript' !== strtolower( (string) $type ) ) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, it is necessary to keep '' !== $type since an empty string value for type is considered valid in JavaScript.

Then it can be done like this:

Suggested change
if ( is_string( $type ) && '' !== $type && 'text/javascript' !== strtolower( $type ) ) {
if ( '' !== $type && 'text/javascript' !== strtolower( (string) $type ) ) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, this change breaks things because the get_attribute method returns null when the attribute is not present. Therefore, a null !== $type check or the previous is_string( $type ) check needs to be added explicitly.

which check would be better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, it is necessary to keep '' !== $type since an empty string value for type is considered valid in JavaScript.

Right, this is executed:

<script type>document.write('hello');</script>

There's also this code in core which is relevant here:

https://github.com/WordPress/wordpress-develop/blob/27233a0a3b00cc5d683080722711bca15970fe8c/src/wp-includes/script-loader.php#L2948-L2958

With the HTML API, the tag would be JavaScript if:

  1. The type attribute is absent ($type === null)
  2. The type attribute is boolean and has no value ($type === true)
  3. The type attribute is an empty string ($type === ''), but not if the value has whitespace in which case it is not executed.
  4. The type attribute contains javascript, ecmascript, jscript, or livescript.

}
} elseif ( 'LINK' === $tag ) {
$rel = $processor->get_attribute( 'rel' );
if ( ! is_string( $rel ) || 'stylesheet' !== strtolower( $rel ) ) {
Copy link
Member

@westonruter westonruter Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another special case to consider here, and that is a classic hack for non-blocking stylesheets.

This is commonly implemented by adding media="print" and onload="this.media = 'screen'":

https://www.filamentgroup.com/lab/load-css-simpler/
https://scottjehl.com/posts/async-css-already/

So this case can be accounted for as well. Namely, this can just skip considering any stylesheets which have a media attribute that begins with anything other than all or screen.

Co-authored-by: Weston Ruter <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Performance Lab Issue relates to work in the Performance Lab Plugin only [Type] Bug An existing feature is broken
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Site Health tests for enqueued scripts and styles are inaccurate
2 participants