Skip to content

Add --sync-db to generate_changelog, plus many fixes #1642

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Jul 22, 2025

Conversation

ksylvan
Copy link
Collaborator

@ksylvan ksylvan commented Jul 21, 2025

Add --sync-db to generate_changelog, plus many fixes

Summary

This PR introduces database synchronization capabilities to the changelog generation system, along with enhanced merge commit detection, improved timestamp handling, and better error reporting. The changes focus on maintaining database integrity and providing comprehensive validation tools for the CI/CD pipeline.

Files Changed

GitHub Workflow Configuration

  • .github/workflows/update-version-and-create-tag.yml: Added exclusion for incoming changelog files (cmd/generate_changelog/incoming/*.txt) to prevent unnecessary workflow triggers when processing PR files.

Database and Binary Files

Core Application Changes

  • cmd/generate_changelog/main.go: Added --sync-db command-line flag for database synchronization operations.
  • cmd/generate_changelog/internal/config/config.go: Added SyncDB boolean field to configuration structure.

Cache System Enhancements

  • cmd/generate_changelog/internal/cache/cache.go:
    • Enhanced time parsing with RFC3339Nano fallback support for better timestamp precision
    • Added VersionExists() and CommitExists() methods for database validation
    • Improved error handling with detailed feedback messages

Changelog Generation System

  • cmd/generate_changelog/internal/changelog/generator.go:

    • Implemented comprehensive SyncDatabase() method for full database synchronization
    • Added syncGitHistory() and verifyCommitPRMappings() helper methods
    • Modified fetchPRs() to accept explicit force sync parameter
    • Enhanced version date calculation using actual commit timestamps
  • cmd/generate_changelog/internal/changelog/processing.go:

    • Added isMergeCommit() function with dual detection methods (parent count + message patterns)
    • Implemented calculateVersionDate() for accurate version timestamping
    • Enhanced PR processing with proper git operations and improved error handling
    • Better integration with cache system for commit and PR data

GitHub API Integration

  • cmd/generate_changelog/internal/github/client.go:

    • Added email field population from GitHub API responses
    • Enhanced commit timestamp capture from GitHub API
    • Implemented parent commit SHA collection for merge detection
  • cmd/generate_changelog/internal/github/types.go:

    • Added Email, Date, and Parents fields to PRCommit struct
    • Updated GraphQL query structure to include AuthoredDate field

Git Operations

  • cmd/generate_changelog/internal/git/walker.go: Added comprehensive documentation for version pattern regex matching.

Test Coverage

  • cmd/generate_changelog/internal/changelog/merge_detection_test.go: New comprehensive test suite for merge commit detection functionality.
  • cmd/generate_changelog/internal/github/email_test.go: New test suite for email field handling in PR commits.

Code Changes

Key Implementation: Database Synchronization

func (g *Generator) SyncDatabase() error {
    // Force PR sync from GitHub
    if err := g.fetchPRs(true); err != nil {
        return fmt.Errorf("failed to sync PRs: %w", err)
    }
    
    // Verify git history and version completeness
    if err := g.syncGitHistory(); err != nil {
        return fmt.Errorf("failed to sync git history: %w", err)
    }
    
    // Verify commit-PR mappings
    if err := g.verifyCommitPRMappings(); err != nil {
        return fmt.Errorf("failed to verify commit-PR mappings: %w", err)
    }
    
    return nil
}

Enhanced Merge Commit Detection

func isMergeCommit(commit github.PRCommit) bool {
    // Primary method: Check parent count
    if len(commit.Parents) > 1 {
        return true
    }
    
    // Fallback method: Check commit message patterns
    mergePatterns := []string{
        `^Merge pull request #\d+`,
        `^Merge branch '.*' into .*`,
        // ... additional patterns
    }
    
    for _, pattern := range mergePatterns {
        if matched, _ := regexp.MatchString(pattern, commit.Message); matched {
            return true
        }
    }
    
    return false
}

Improved Time Parsing

// Try RFC3339Nano first, then fall back to RFC3339
v.Date, err = time.Parse(time.RFC3339Nano, dateStr.String)
if err != nil {
    v.Date, err = time.Parse(time.RFC3339, dateStr.String)
    if err != nil {
        fmt.Fprintf(os.Stderr, "Error parsing date '%s': %v\n", dateStr.String, err)
    }
}

Reason for Changes

These changes address several critical needs in the changelog generation system:

  1. Database Integrity: The --sync-db flag provides a comprehensive way to validate and synchronize the local database with git history and GitHub API data.

  2. Merge Commit Detection: Improved detection prevents merge commits from being included as regular commits in changelog entries, ensuring cleaner output.

  3. Timestamp Accuracy: Enhanced time parsing and actual commit timestamp capture from GitHub API provides more accurate version dating.

  4. Error Resilience: Better error handling and fallback mechanisms improve system reliability.

  5. CI/CD Integration: Workflow exclusions prevent unnecessary builds when processing changelog files.

Impact of Changes

Positive Impacts

  • Data Consistency: Database synchronization ensures all git history and PR data is properly cached and consistent
  • Accuracy: Better timestamp handling and merge commit detection improve changelog quality
  • Reliability: Enhanced error handling prevents system failures and provides actionable feedback
  • Performance: Existence checks prevent duplicate database entries during sync operations

Potential Risks

  • Database Operations: New database methods could potentially cause issues if the database schema is inconsistent
  • API Rate Limits: Force syncing PRs may hit GitHub API rate limits in repositories with many PRs
  • Merge Detection: The dual detection method for merge commits might occasionally misclassify commits, though this is unlikely given the comprehensive patterns

Test Plan

The changes include comprehensive test coverage:

  1. Merge Detection Tests: merge_detection_test.go validates all merge commit detection scenarios
  2. Email Handling Tests: email_test.go ensures proper email field population
  3. Integration Testing: The --sync-db flag should be tested on a repository with various git history patterns
  4. Manual Validation: Compare database contents before and after sync operations to ensure data integrity

Additional Notes

  • The database synchronization is designed to be non-destructive - it preserves existing data while adding missing entries
  • Error messages include actionable troubleshooting steps for manual intervention when automated recovery fails
  • The system maintains backward compatibility with existing database schemas
  • Performance optimizations include existence checks to avoid redundant database operations during sync

This PR significantly enhances the robustness and reliability of the changelog generation system while maintaining full backward compatibility.

ksylvan and others added 3 commits July 21, 2025 13:27
…orkflow

## CHANGES

- Add database sync command with comprehensive validation
- Implement version and commit existence checking methods
- Enhance time parsing with RFC3339Nano fallback support
- Cache fetched PRs during changelog entry creation
- Remove individual incoming files using git operations
- Add sync-db flag for database integrity validation
- Improve commit-PR mapping verification process
- Exclude incoming directory from workflow trigger paths
@ksylvan ksylvan requested a review from Copilot July 21, 2025 21:41
Copilot

This comment was marked as outdated.

…log generator

## CHANGES

- Add debug logging for date parsing failures
- Pass forcePRSync parameter explicitly to fetchPRs method
- Implement comprehensive merge commit detection using parents
- Capture actual commit timestamps from GitHub API
- Calculate version dates from most recent commit
- Add parent commit SHAs for merge detection
- Use real commit dates instead of current time
- Add timestamp validation with fallback handling
@ksylvan ksylvan requested a review from Copilot July 21, 2025 22:43
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 16:14
…generation

## CHANGES

- Add Email field to PRCommit struct for author information
- Extract version date calculation into reusable helper function
- Redirect error messages from stdout to stderr properly
- Populate commit email from GitHub API responses correctly
- Add comprehensive test coverage for email field handling
- Remove duplicate version date calculation code blocks
- Import os package for proper stderr output handling
@ksylvan ksylvan requested a review from Copilot July 21, 2025 23:17
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 16:37
…it history sync

## CHANGES

- Enhance warning message with additional context details
- Add guidance for users when version check fails
- Improve error handling feedback in sync operation
- Provide actionable steps for troubleshooting sync issues
@ksylvan ksylvan requested a review from Copilot July 21, 2025 23:39
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 16:45
…n components

## CHANGES

- Add date string to RFC3339 parsing error messages
- Enhance isMergeCommit function documentation with detailed explanation
- Document calculateVersionDate function with comprehensive behavior description
- Improve error context for date parsing failures
- Add implementation details for merge commit detection methods
- Clarify fallback behavior in version date calculation
@ksylvan ksylvan requested a review from Copilot July 21, 2025 23:51
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 16:59
…e operations

## CHANGES

- Clarify RFC3339Nano date parsing error message
- Improve PR batch cache save error description
- Add context for commit timestamp fallback warning
- Specify git index in file removal error message
@ksylvan ksylvan requested a review from Copilot July 22, 2025 00:04
Copilot

This comment was marked as resolved.

### CHANGES

- Rename `changelogDate` to `versionDate` for clarity
- Enhance error message for git index removal failure
- Add comments for `versionPattern` regex in `walker.go`
@ksylvan ksylvan requested a review from Copilot July 22, 2025 00:15
Copilot

This comment was marked as resolved.

## CHANGES

- Expand version pattern regex documentation with examples
- Add matching and non-matching commit message examples
- Clarify version pattern behavior with optional prefix
- Update PR commit field comments for clarity
- Document email field availability from GitHub API
- Simplify timestamp and parents field descriptions
@ksylvan ksylvan requested a review from Copilot July 22, 2025 00:23
Copilot

This comment was marked as resolved.

@ksylvan ksylvan requested a review from Copilot July 22, 2025 02:04
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 19:44
## CHANGES

- Replace sync.Once with mutex and boolean flag
- Add thread-safe initialization check for merge patterns
- Remove overly broad merge pattern regex rule
- Improve error messaging for file removal failures
- Clarify filesystem vs git index error contexts
- Add detailed manual intervention instructions for failures
@ksylvan ksylvan requested a review from Copilot July 22, 2025 02:54
Copilot

This comment was marked as resolved.

…ration

## CHANGES

- Import runtime package for OS detection
- Add Windows-specific file deletion commands in error messages
- Provide both Command Prompt and PowerShell alternatives
- Maintain existing Unix/Linux rm command for non-Windows systems
- Improve user experience across different operating systems
@ksylvan ksylvan requested a review from Copilot July 22, 2025 02:59
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 20:08
## CHANGES

- Capture first RFC3339Nano parsing error for better diagnostics
- Display both RFC3339Nano and RFC3339 errors in output
- Extract merge patterns to variable for cleaner code
- Improve error message clarity in date parsing failures
@ksylvan ksylvan requested a review from Copilot July 22, 2025 03:12
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 20:25
…t in changelog generation

## CHANGES

- Remove unused runtime import from processing.go
- Simplify date parsing error messages in cache
- Replace global merge pattern variables with struct
- Use sync.Once for thread-safe pattern initialization
- Remove OS-specific file deletion instructions from errors
- Clean up merge commit detection function documentation
- Eliminate redundant error variable in date parsing
@ksylvan ksylvan requested a review from Copilot July 22, 2025 03:30
Copilot

This comment was marked as resolved.

…n changelog generator

## CHANGES

- Replace emoji prefixes with bracketed text labels
- Standardize synchronization step logging format across methods
- Simplify version existence check error message text
- Update commit author email extraction comment clarity
- Maintain consistent stderr output formatting throughout sync process
@ksylvan ksylvan requested a review from Copilot July 22, 2025 03:37
Copilot

This comment was marked as resolved.

…truct wrapper

## CHANGES

- Remove mergePatternManager struct wrapper for patterns
- Replace struct fields with package-level variables
- Simplify getMergePatterns function implementation
- Clean up merge commit detection documentation
- Reduce code complexity in pattern initialization
- Maintain thread-safe lazy initialization with sync.Once
@ksylvan ksylvan requested a review from Copilot July 22, 2025 04:05
Copilot

This comment was marked as resolved.

ksylvan and others added 2 commits July 21, 2025 21:17
## CHANGES

- Add actual error details to date parsing failure message
- Include error variable in stderr output formatting
- Enhance debugging information for invalid date formats
@ksylvan ksylvan requested a review from Copilot July 22, 2025 04:28
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces database synchronization capabilities to the changelog generation system with the addition of a --sync-db flag, enhanced merge commit detection, improved timestamp handling, and comprehensive validation tools for maintaining database integrity in CI/CD pipelines.

  • Added --sync-db command-line flag that performs comprehensive database synchronization and validation
  • Enhanced merge commit detection using both parent commit analysis and message pattern matching
  • Improved timestamp handling with RFC3339Nano precision and better error reporting throughout the system

Reviewed Changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
cmd/generate_changelog/main.go Added --sync-db flag and routing logic for database synchronization
cmd/generate_changelog/internal/config/config.go Added SyncDB boolean field to configuration structure
cmd/generate_changelog/internal/github/types.go Extended PRCommit struct with email, date, and parents fields for enhanced commit data
cmd/generate_changelog/internal/github/client.go Enhanced GitHub API integration to capture email, timestamps, and parent commit data
cmd/generate_changelog/internal/cache/cache.go Added existence checking methods and improved time parsing with RFC3339Nano fallback
cmd/generate_changelog/internal/changelog/generator.go Implemented comprehensive SyncDatabase() method with git history validation
cmd/generate_changelog/internal/changelog/processing.go Added merge commit detection and version date calculation with improved error handling
cmd/generate_changelog/internal/git/walker.go Enhanced version pattern regex with comprehensive documentation
Test files Added comprehensive test coverage for merge detection and email handling
Workflow configuration Added exclusion for incoming changelog files to prevent unnecessary triggers
Comments suppressed due to low confidence (4)

cmd/generate_changelog/internal/changelog/processing.go:267

  • [nitpick] The error message structure is inconsistent - it uses 'Error:' prefix while other messages use 'Warning:'. Consider using consistent error message formatting.
			if err := os.Remove(file); err != nil {

cmd/generate_changelog/internal/changelog/processing.go:271

  • [nitpick] The error message refers to 'OS-specific command' but doesn't provide specific examples. Consider providing concrete examples like 'rm' for Unix/Linux or 'del' for Windows.
				fmt.Fprintf(os.Stderr, "  1. Remove the file %s manually (using the OS-specific command)\n", file)

cmd/generate_changelog/internal/cache/cache.go:210

  • [nitpick] The error message mentions 'RFC3339 or RFC3339Nano' but doesn't provide examples of these formats. Consider adding format examples for better user understanding.
					fmt.Fprintf(os.Stderr, "Error parsing date '%s' for version '%s': %v. Expected format: RFC3339 or RFC3339Nano.\n", dateStr.String, v.Name, err)

cmd/generate_changelog/internal/changelog/generator.go:745

  • [nitpick] The warning message is verbose and could be more actionable. Consider suggesting specific troubleshooting steps or recovery actions.
			fmt.Fprintf(os.Stderr, "Warning: Failed to check existence of version %s: %v. This may affect the completeness of the sync operation.\n", version.Name, err)

@ksylvan ksylvan marked this pull request as ready for review July 22, 2025 04:50
@ksylvan ksylvan merged commit 7254571 into danielmiessler:main Jul 22, 2025
1 check passed
@ksylvan ksylvan deleted the 0721-fix-changelog-processing branch July 22, 2025 04:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant