Skip to content

Add gzip compression for URL metrics using Compression Streams API #1959

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

b1ink0
Copy link
Contributor

@b1ink0 b1ink0 commented Mar 27, 2025

Summary

Fixes #1893

Relevant technical choices

This PR implements the Compression Streams API to efficiently compress URL Metrics JSON before sending it. This significantly reduces the payload size, allowing more data to be sent without exceeding the 64 KiB limit.

Previously some of these changes where reverted due to the bug #1928.

This PR is able to avoid the issue of URL metrics not being sent on pagehide event by compressing the URL metrics before pagehide event and then recompressing it whenever URL metrics are updated by the extensions.

@b1ink0 b1ink0 added [Type] Enhancement A suggestion for improvement of an existing feature [Plugin] Optimization Detective Issues for the Optimization Detective plugin labels Mar 27, 2025
Copy link

codecov bot commented Mar 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 72.51%. Comparing base (583a4e9) to head (6aa261c).
Report is 27 commits behind head on trunk.

Additional details and impacted files
@@            Coverage Diff             @@
##            trunk    #1959      +/-   ##
==========================================
+ Coverage   72.34%   72.51%   +0.17%     
==========================================
  Files          85       85              
  Lines        6979     7023      +44     
==========================================
+ Hits         5049     5093      +44     
  Misses       1930     1930              
Flag Coverage Δ
multisite 72.51% <100.00%> (+0.17%) ⬆️
single 40.33% <0.00%> (-0.26%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@b1ink0
Copy link
Contributor Author

b1ink0 commented Mar 27, 2025

If we do this, we should add a pagehide event handler before we load any of the extensions, so that our pagehide callback is called first, which then causes extendElementData and extendRootData to no-op (or throw an error) because at this point we can't re-compress the URL Metric since it is too late.

#1893 (comment)

I would like to understand more about adding a pagehide event handler before loading any extensions. From my interpretation, the goal is to override the extendElementData and extendRootData functions to throw an error if an extension tries to modify URL metrics after the pagehide event.

However, we're currently still supporting the finalize method, even though it's deprecated. Introducing this change might cause issues for extensions that still rely on finalize.

At the moment, I’ve added a flag called enableCompression, which is set to false if a finalize function is detected. This effectively disables compression. I could use this same flag to prevent the function overrides that would otherwise throw errors.

cc: @westonruter

/**
* Timeout ID for debouncing URL metric compression.
*
* @type {ReturnType<typeof setTimeout> | null}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this any different than the following?

Suggested change
* @type {ReturnType<typeof setTimeout> | null}
* @type {number|null}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes its different TypeScript throws as error if we only put number

Type 'Timeout' is not assignable to type 'number'

recompressionTimeout = setTimeout( async () => {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good to know.

@@ -660,6 +724,10 @@ export default async function detect( {
initializingExtensionModuleUrls.push( extensionModuleUrl );
}
}

if ( extension.finalize instanceof Function ) {
compressionEnabled = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better to have a separate variable like extensionHasFinalize which is initially false but then is set to true here. Then after the loop finishes, if extensionHasFinalize is true, it can at that point set compressionEnabled to false and at the same time emit a warning to say that URL Metric compression is disabled due to the use of the deprecated finalize function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 7e8ec60.

recompressionTimeout = setTimeout( async () => {
if ( typeof requestIdleCallback === 'function' ) {
await new Promise( ( resolve ) => {
requestIdleCallback( resolve );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requestIdleCallback() returns a handle value which can be passed to cancelIdleCallback(). I think this should be done as well in addition to clearing the debouncing timeout, as otherwise it could happen that two idle callbacks for re-compressing the URL Metric could happen right after each other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in aeda409.

@westonruter
Copy link
Member

I would like to understand more about adding a pagehide event handler before loading any extensions. From my interpretation, the goal is to override the extendElementData and extendRootData functions to throw an error if an extension tries to modify URL metrics after the pagehide event.

@b1ink0 I think that comment I made is obsolete. We should prevent compression from happening if there is a finalize function exported from an extension, like you're doing in the PR.

} );
idleCallbackHandle = null;
}
compressedPayload = await compress( JSON.stringify( urlMetric ) );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case this throws an error, perhaps there should be a try/catch block around this which when an error is caught the error is logged out to the console and then compressionEnabled is set to false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For logging the error, should console.log be used since debouncedCompressUrlMetric function is defined outside the detect function and therefore doesn't have access to the custom logger? Also, if we want to use the custom logger, then we would need to pass it through multiple functions, as debouncedCompressUrlMetric is called inside both extendElementData and extendRootData. Another option could be to define the logger constant as a let outside the detect function which could let us use the custom logger.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function could also just create a new logger which is only used here for logging out the error.

Another option would be to re-throw the error after setting compressionEnabled to false. But better to log it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function could also just create a new logger which is only used here for logging out the error.

So should the debugMode param should be set to true regardless of the isDebug passed to detect function? One other thing creating a logger for each call to debouncedCompressUrlMetric does not seem correct to me, should there be a new variable outside function like compressionLogger?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger would only be created in the catch block and then it would never be created again since upon an error compressionEnabled would be set to false and the function would short-circuit. So that seems fine.

And the debugMode param shouldn't matter since the error function logs out unconditionally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh, It makes sense thank you!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6c8c257

compressionEnabled = false;
}
recompressionTimeout = null;
}, 1000 );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point: let's put this 1000 in a constant defined up near the top of this module so we can more easily understand what the magic number is. For example, compressionDebounceWaitDuration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in dc4d5e9

@b1ink0 b1ink0 marked this pull request as ready for review March 31, 2025 06:39
@b1ink0 b1ink0 requested a review from felixarntz as a code owner March 31, 2025 06:39
Copy link

github-actions bot commented Mar 31, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: b1ink0 <[email protected]>
Co-authored-by: westonruter <[email protected]>
Co-authored-by: phanduynam <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@b1ink0 b1ink0 requested a review from westonruter April 2, 2025 10:02
@phanduynam

This comment was marked as spam.

Copy link
Member

@westonruter westonruter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for bringing this back to life!

@westonruter westonruter merged commit ac1a6ad into WordPress:trunk Apr 3, 2025
19 checks passed
@@ -718,12 +908,16 @@ export default async function detect( {
urlMetric.elements.push( elementData );
elementsByXPath.set( elementData.xpath, elementData );
}
breadcrumbedElementsMap.clear();
Copy link
Member

@westonruter westonruter Apr 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. This duplicate call (which I added) I've removed in #1968

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Plugin] Optimization Detective Issues for the Optimization Detective plugin [Type] Enhancement A suggestion for improvement of an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Explore compressing URL Metrics to work within the 64 KiB limit
3 participants