Garbage collection time window for URL Metrics posts could be increased and made customizable #1948

@westonruter

Given that a URL Metric can have an indefinite freshness TTL (#1940), in which case freshness depends exclusively on the ETag, I think we should consider increasing the garbage collection time window beyond 1 month for `od_url_metrics` posts that haven't been updated:

	/**
	 * Deletes posts that have not been modified in the past month.
	 *
	 * @since 0.1.0
	 */
	public static function delete_stale_posts(): void {
		$one_month_ago = gmdate( 'Y-m-d H:i:s', strtotime( '-1 month' ) );
		$query         = new WP_Query(
			array(
				'post_type'      => self::SLUG,
				'posts_per_page' => 100,
				'date_query'     => array(
					'column' => 'post_modified_gmt',
					'before' => $one_month_ago,
				),
			)
		);
		foreach ( $query->posts as $post ) {
			if ( $post instanceof WP_Post && self::SLUG === $post->post_type ) { // Sanity check.
				wp_delete_post( $post->ID, true );
			}
		}
	}

Consider sites that get very little traffic. A page that isn't visited at least once a month shouldn't have its URL Metrics purged for that reason alone. Maybe something like 3 months or 6 months would be more reasonable for the long tail of sites without much traffic.

The garbage collection TTL should probably have a filter to allow it to be customized (or disabled).
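As a rough sketch of what such a filter could look like (not the plugin's actual implementation): the filter name `od_url_metric_garbage_collection_ttl`, the two helper functions, and the 3-month default below are all assumptions made for illustration. The idea is that a TTL of zero or less disables garbage collection, and otherwise the TTL is subtracted from the current time to get the `post_modified_gmt` cutoff.

```php
<?php
// Hypothetical sketch: the filter name and both helpers are assumptions,
// not existing Optimization Detective APIs.

/**
 * Gets the garbage collection TTL in seconds. Zero or less disables collection.
 */
function od_sketch_get_gc_ttl(): int {
	$ttl = 3 * 30 * 24 * 60 * 60; // Assumed default of roughly 3 months.
	if ( function_exists( 'apply_filters' ) ) {
		// Assumed filter name; only applied when WordPress is loaded.
		$ttl = (int) apply_filters( 'od_url_metric_garbage_collection_ttl', $ttl );
	}
	return $ttl;
}

/**
 * Computes the post_modified_gmt cutoff, or null when collection is disabled.
 */
function od_sketch_get_gc_cutoff( int $ttl, int $now ): ?string {
	if ( $ttl <= 0 ) {
		return null; // Caller should skip the deletion query entirely.
	}
	return gmdate( 'Y-m-d H:i:s', $now - $ttl );
}
```

With something like this, `delete_stale_posts()` could return early when the cutoff is `null` and otherwise pass the cutoff as the `before` value of the date query. A site could then opt out with `add_filter( 'od_url_metric_garbage_collection_ttl', '__return_zero' );`, again assuming that filter name.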

Note that if someone deletes a post there is no logic to attempt to delete any corresponding od_url_metrics posts, since there is no connection between the two. I don't think this is necessarily needed.

Docs will also need to be updated here:

> When an `od_url_metrics` post has not been updated in a month then it is garbage-collected, since it is likely the original URL has gone away.

Status

Done 😃
