Crew Task sometimes returns previous tool result as completed output

Summary:

Crew / Task agent runs sometimes finish with status completed, but the returned result is not the agent final summary. Instead it can contain a raw previous tool result such as:

[Previous tool result; call_id=call_...]: {...}

Impact:

- Parent agent may trust a completed Task even though the deliverable is missing or incomplete.

- Tool-result leakage pollutes conversation history, mission outputs, task result storage, and memory archives.

- Coding tasks may leave partial edits without a clear failure signal.

Recent observed cases on 2026-05-18:

1. Developer agent completed but returned a message saying implementation was interrupted midway and listed remaining work.

2. Developer agent completed but result was only a raw previous tool result from an Edit call:

[Previous tool result; call_id=call_yZqXLVu0CqENETlYMRVOHUac]: {"file_path":"src/mock-runtime.mjs","replacements":1,"changed":true,...}

Historical evidence:

Running `alma memory grep "Previous tool result"` found similar leakage in SketchUp-related threads from 2026-04-29, 2026-04-30, 2026-05-11, and 2026-05-18. Mission output files also contained raw previous tool results, e.g.:

~/.config/alma/missions/*/sprints/*/attempt-*/generator-output.md

Suspected root cause:

The issue appears to be in crew / harness / agent adapter final-output extraction. If the sub-agent performs a tool call and then does not emit or expose a normal final assistant summary, the wrapper may fall back to the latest transcript item, which can be a tool result. The Task is then marked completed because the process ended cleanly, even though the handoff output is invalid.

Expected behavior:

- Raw tool results should never be used as final Task result.

- If final output is missing/malformed or matches `[Previous tool result`, `call_id=`, raw JSON tool output, etc., mark as invalid/failed/needs retry, not completed.

- If an agent explicitly says implementation is incomplete, do not treat the Task as successful acceptance completion.

Suggested fix:

1. Validate final Task result before marking completed.

2. Prefer the last assistant-authored textual message over the last transcript item.

3. Add an `agent_output_invalid` or `incomplete` status.

4. Retry or ask the sub-agent to summarize actual changes when final output is invalid.

5. For coding agents, require a final summary with changed files and validation results, or an explicit failure/incomplete status.

Temporary workaround:

Parent agent will manually inspect files/tests after every crew completion and treat `[Previous tool result]` as invalid.

Please authenticate to join the conversation.

Upvoters
Status

In Review

Board
πŸ›

Bug Reports

Date

2 days ago

Author

linqi zhang

Subscribe to post

Get notified by email when there are changes.