Archive failed jobs to object storage to reduce RDS table size. #263

@sharkinsspatial

Description

Since the project's inception, we have retained all failed jobs in our log database for auditing and potential reprocessing. Rather than keeping them in our active RDS logging instance, we should periodically write these failed jobs to archive storage and remove them from the live instance. Initially, we should use a long-running process which:

  1. Queries the failed jobs for a given date, exports those rows as ndjson, and stores them in an S3 bucket with the key structure year/month/date.json.
  2. Deletes all of the corresponding rows from the table.
  3. Initially, runs in a loop over all dates in a specified range (e.g. Jan-Jul 2022).
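The steps above could be sketched roughly as follows. This assumes a psycopg2-style connection, a boto3 S3 client, and a `failed_jobs` table with a `failed_at` timestamp column; none of these names are specified in the issue, so treat them as placeholders.

```python
import json
from datetime import date, timedelta


def archive_key(day: date) -> str:
    """Build the S3 key using the issue's year/month/date.json structure."""
    return f"{day.year}/{day.month:02d}/{day.day:02d}.json"


def to_ndjson(rows) -> str:
    """Serialize rows (dicts) as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row, default=str) for row in rows)


def date_range(start: date, end: date):
    """Yield every date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def archive_day(conn, s3, bucket: str, day: date) -> None:
    """Export one day's failed jobs to S3, then delete them from the table.

    Table and column names (failed_jobs, failed_at) are assumptions.
    """
    with conn.cursor() as cur:
        cur.execute(
            "SELECT row_to_json(f) FROM failed_jobs f WHERE failed_at::date = %s",
            (day,),
        )
        rows = [r[0] for r in cur.fetchall()]
        if rows:
            # Upload first so rows are only deleted once the archive exists.
            s3.put_object(Bucket=bucket, Key=archive_key(day), Body=to_ndjson(rows))
            cur.execute("DELETE FROM failed_jobs WHERE failed_at::date = %s", (day,))
    conn.commit()
```

The legacy backfill would then just loop: `for day in date_range(date(2022, 1, 1), date(2022, 7, 31)): archive_day(conn, s3, "my-archive-bucket", day)`. Uploading before deleting keeps the operation safe to re-run if it dies partway through a day.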

Once the legacy backfill is complete, a daily cron job should run the same process for any dates more than 2 months old.
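The daily run could be wired up with a crontab entry along these lines. The script path and schedule are assumptions, and the relative-date arithmetic relies on GNU `date`'s `-d` flag (note that `%` must be escaped as `\%` inside a crontab):

```shell
# Daily at 02:00 UTC: archive the day that just fell outside the
# two-month retention window (script path is hypothetical).
0 2 * * * /usr/local/bin/archive_failed_jobs --date "$(date -u -d '2 months ago' +\%Y-\%m-\%d)"
```

On AWS, the same schedule could instead be an EventBridge rule triggering a small job, which avoids keeping a host around just for cron.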

This will provide us with an auditable archive of failed granules that we can query and reprocess if necessary, while keeping our active production logging RDS instance smaller and easier to manage.

Labels: enhancement (New feature or request)