agentHost: Remember and reuse CLI processes for ssh #307770
roblourens wants to merge 1 commit into main
Conversation
Co-authored-by: Copilot <copilot@github.com>
Pull request overview
This PR adds SSH remote agent-host process reuse by persisting remote CLI agent-host state (PID/port/token) in a state file and attempting to reconnect to an existing process before starting a new one, reducing orphaned code agent-host processes on reconnect.
Changes:
- Add reusable SSH helper utilities to read/write/validate a remote `.agent-host-state` file and to clean up stale processes.
- Update the SSH connect flow to (1) detect an existing remote agent-host, (2) reuse it when possible, and (3) fall back to cleanup + fresh start when reuse is not connectable.
- Add node tests covering helper behavior and the end-to-end connect/reuse/cleanup flow via in-process mocks.
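To make the reuse flow above concrete, here is a minimal sketch of parsing and validating the remote state-file contents. The interface and field names (`pid`/`port`/`token`) follow the PR description, but the actual validation rules live in `sshRemoteAgentHostHelpers.ts` and may differ.

```typescript
// Hypothetical shape of the persisted agent-host state; illustrative only.
interface AgentHostState {
	pid: number;
	port: number;
	token: string;
}

// Parse the remote `.agent-host-state` contents, returning undefined for
// anything malformed so the caller falls back to cleanup + fresh start.
function parseAgentHostState(raw: string): AgentHostState | undefined {
	try {
		const parsed = JSON.parse(raw) as Partial<AgentHostState>;
		if (
			typeof parsed.pid === 'number' && Number.isInteger(parsed.pid) && parsed.pid > 0 &&
			typeof parsed.port === 'number' && parsed.port > 0 && parsed.port <= 65535 &&
			typeof parsed.token === 'string' && parsed.token.length > 0
		) {
			return { pid: parsed.pid, port: parsed.port, token: parsed.token };
		}
	} catch {
		// Unparseable state is treated as absent rather than fatal.
	}
	return undefined;
}
```

Treating any invalid state as "no state" keeps the connect flow simple: the only two outcomes are "attempt reuse" or "start fresh".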
| File | Description |
|---|---|
| src/vs/platform/agentHost/node/sshRemoteAgentHostService.ts | Integrates remote agent-host reuse into the SSH connect flow; refactors to allow test overrides and uses DisposableMap for connection lifecycle. |
| src/vs/platform/agentHost/node/sshRemoteAgentHostHelpers.ts | New helper module for shell-safe paths, platform detection, token redaction, and state-file based process reuse/cleanup. |
| src/vs/platform/agentHost/test/node/sshRemoteAgentHostHelpers.test.ts | Unit tests for helper utilities (token validation, state parsing/cleanup, state writing). |
| src/vs/platform/agentHost/test/node/sshRemoteAgentHostService.test.ts | Connect-flow tests validating caching, reuse via state, stale cleanup, relay-failure fallback, and sanitization behavior. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 3
```ts
// The reused agent host is not connectable — kill it and start fresh
this._logService.info(`${LOG_PREFIX} Failed to connect to reused agent host on port ${remotePort}, starting fresh`);
```
The log line for a failed relay to a reused agent host drops the underlying error details (relayErr). Including the error (message/code) in the log (at least trace/warn) would make diagnosing remote connectivity issues much easier.
Suggested change:

```diff
-// The reused agent host is not connectable — kill it and start fresh
-this._logService.info(`${LOG_PREFIX} Failed to connect to reused agent host on port ${remotePort}, starting fresh`);
+const relayErrorMessage = relayErr instanceof Error ? relayErr.message : String(relayErr);
+const relayErrorCode = relayErr instanceof Error && 'code' in relayErr && typeof relayErr.code === 'string' ? relayErr.code : undefined;
+// The reused agent host is not connectable — kill it and start fresh
+this._logService.warn(`${LOG_PREFIX} Failed to connect to reused agent host on port ${remotePort}${relayErrorCode ? ` (code: ${relayErrorCode})` : ''}: ${relayErrorMessage}. Starting fresh`);
```
```ts
await exec(`rm -f ${stateFile} && (umask 077 && echo ${shellEscape(json)} > ${stateFile})`, { ignoreExitCode: true });
logService.info(`${LOG_PREFIX} Wrote agent host state to ${stateFile}: PID ${pid}, port ${port}`);
```
writeAgentHostState runs the remote write command with ignoreExitCode: true but then always logs that it "Wrote agent host state". If the write fails (e.g. the CLI directory is missing, or due to permissions), this log becomes misleading and reuse will silently stop working. Consider checking the returned exit code/stderr and only logging success when the command succeeds; otherwise log a warning.
Suggested change:

```diff
-await exec(`rm -f ${stateFile} && (umask 077 && echo ${shellEscape(json)} > ${stateFile})`, { ignoreExitCode: true });
-logService.info(`${LOG_PREFIX} Wrote agent host state to ${stateFile}: PID ${pid}, port ${port}`);
+const result = await exec(`rm -f ${stateFile} && (umask 077 && echo ${shellEscape(json)} > ${stateFile})`, { ignoreExitCode: true });
+if (result.code === 0) {
+	logService.info(`${LOG_PREFIX} Wrote agent host state to ${stateFile}: PID ${pid}, port ${port}`);
+	return;
+}
+const stderr = 'stderr' in result && typeof result.stderr === 'string' ? result.stderr.trim() : '';
+const errorSuffix = stderr ? `: ${stderr}` : '';
+logService.warn(`${LOG_PREFIX} Failed to write agent host state to ${stateFile} (exit code ${result.code})${errorSuffix}`);
```
```ts
let sshClient: SSHClient | undefined;

try {
	const ssh2Module = _require('ssh2') as { Client: new () => unknown };

	const reportProgress = (message: string) => {
		this._onDidReportConnectProgress.fire({ connectionKey, message });
	};
```
connect() has a cached/early-return path (above this block) that returns an ISSHConnectResult without sshConfigHost, while the non-cached path includes it. This makes the result shape inconsistent and can break callers that rely on sshConfigHost for reconnect. Consider ensuring the cached return also includes sshConfigHost (e.g. from the stored connection/config).
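One way to keep both paths consistent is to cache the full result object and return it unchanged on a hit. This is a minimal sketch; the interface and field names are illustrative, not the actual ISSHConnectResult shape in the PR.

```typescript
// Illustrative stand-in for the real ISSHConnectResult.
interface SSHConnectResult {
	port: number;
	sshConfigHost: string;
}

const connectionCache = new Map<string, SSHConnectResult>();

function connect(connectionKey: string, sshConfigHost: string, port: number): SSHConnectResult {
	const cached = connectionCache.get(connectionKey);
	if (cached) {
		// Returning the full stored result keeps the cached path shaped
		// exactly like a fresh connection, sshConfigHost included.
		return cached;
	}
	const result: SSHConnectResult = { port, sshConfigHost };
	connectionCache.set(connectionKey, result);
	return result;
}
```

Caching the whole result (rather than reassembling a partial one on the early-return path) means any field added to the result type later is automatically present on cache hits too.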