Skip to content

Extensions to the peek logic to get it to work correctly with version vector/unicast #12281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sbodagala
Copy link
Contributor

@sbodagala sbodagala commented Jul 29, 2025

More version vector/unicast related changes in order to get the peek logic to work correctly during recovery:

  • Ensure that non-buddy servers can return an empty version range only if they are known to have been locked
  • Make the MergePeekCursor follow the same logic that the SetPeekCursor does when version vector/unicast is enabled

More details:

  • It is fine for a buddy server (that is available and known to have been locked) to return an empty version range because the proxy sends all versions (that mutate the tag(s) that the server is buddy of) to the buddy server, and this check (

    if (((tLogs & availableTLogs) & ~versionAvailableTLogs[version]).any()) {
    ) ensures that the buddy has received all such (relevant) versions till the recovery version.

  • It is fine for a non-buddy server (that is available and known to have been locked) to return an empty version range because the above check ensures that all non-buddies (that the proxies selected for all versions till the recovery version) that are available during recovery have received those versions and the SetPeekCursor logic waits for (N - RF + 1) servers (where N = number of tLog servers and RF is the replication factor) to send an empty version range before inferring that it has received all versions for a tag (check:

    messageVersion = sortedVersions[sortedVersions.size() - (logSets[logIdx]->logServers.size() + 1 -
    ). Non-buddy servers that aren't known to have been locked aren't guaranteed to have cleared this check (
    if (((tLogs & availableTLogs) & ~versionAvailableTLogs[version]).any()) {
    ) for all versions till the recovery version and hence they cannot send an empty version range (since that can invalidate the above SetPeekCursor logic check).

Testing:

Joshua id (with version vector disabled): 20250730-200107-sre-2a0c3442f5eae704 (in progress).

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • The PR has a description, explaining both the problem and the solution.
  • The description mentions which forms of testing were done and the testing seems reasonable.
  • Every function/class/actor that was touched is reasonably well documented.

For Release-Branches

If this PR is made against a release-branch, please also check the following:

  • This change/bugfix is a cherry-pick from the next younger branch (younger release-branch or main if this is the youngest branch)
  • There is a good reason why this PR needs to go into a release branch and this reason is documented (either in the description above or in a linked GitHub issue)

peek logic to work correctly during recovery:
- Ensure that non-buddy servers can return an empty version range
only if they are known to have been locked
- Make the MergePeekCursor follow the same logic that the
SetPeekCursor does when unicast is enabled
@sbodagala sbodagala requested review from dlambrig and jzhou77 July 29, 2025 22:56
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: 1e1803a
  • Duration 0:17:51
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: 1e1803a
  • Duration 0:22:08
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: 1e1803a
  • Duration 0:23:03
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: 1e1803a
  • Duration 0:23:46
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: 1e1803a
  • Duration 0:25:30
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: 1e1803a
  • Duration 0:39:16
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: 1e1803a
  • Duration 1:04:18
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

specific data structure) when version vector is disabled.
@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-ide on Linux RHEL 9

  • Commit ID: c4aa0fa
  • Duration 0:18:00
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang on Linux RHEL 9

  • Commit ID: c4aa0fa
  • Duration 0:23:06
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-cluster-tests on Linux RHEL 9

  • Commit ID: c4aa0fa
  • Duration 0:23:59
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)
  • Cluster Test Logs zip file of the test logs (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr on Linux RHEL 9

  • Commit ID: c4aa0fa
  • Duration 0:24:07
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-clang-arm on Linux CentOS 7

  • Commit ID: c4aa0fa
  • Duration 0:24:54
  • Result: ❌ FAILED
  • Error: Error while executing command: ninja -v -C build_output -j ${NPROC} all packages strip_targets. Reason: exit status 1
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos-m1 on macOS Ventura 13.x

  • Commit ID: c4aa0fa
  • Duration 0:38:01
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

@foundationdb-ci
Copy link
Contributor

Result of foundationdb-pr-macos on macOS Ventura 13.x

  • Commit ID: c4aa0fa
  • Duration 1:02:14
  • Result: ✅ SUCCEEDED
  • Error: N/A
  • Build Log terminal output (available for 30 days)
  • Build Workspace zip file of the working directory (available for 30 days)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants