Pinned
Unified Explanation for Tools That Cannot Be Invoked
The most common reason tools cannot be invoked is that the model you're using isn't listed on https://models.dev. You can search there to see whether your model exists. If it does, its recorded capabilities may be wrong and mark it as not supporting tool invocation. Alternatively, the model itself may inherently lack tool invocation capabilities, such as models in the DeepSeek series. The current workaround is to locate the model under its Provider on the Settings page and manually enable its Tool Calling capability.

yetone 5 months ago
Feature Request
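Activity Recorder window does not display
Problem: After clicking Activity Recorder in Settings, the window shows only three action buttons and is otherwise transparent. Machine: Mac mini M4, macOS 15.6.1 (24G90). Alma version: 0.0.813 (I don't remember how long this has been happening; the last several versions all seem to have the problem. It worked normally when the feature was first introduced.)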

lucacici about 10 hours ago
Feature Request
LLM Routing
Hi, I'd love to suggest a feature that could significantly reduce API costs while maintaining response quality: **intelligent query routing**.

**The idea:** Before sending a request to the main LLM, a lightweight local model (e.g., a small classifier or a tiny model like Phi-3-mini / Qwen2.5-0.5B running via Ollama) evaluates the complexity of the user's query and routes it to the appropriate model:

- **Simple** → fast, cheap model (e.g., Claude Haiku, GPT-4o-mini)
- **Medium** → balanced model (e.g., Claude Sonnet, GPT-4o)
- **Complex** → powerful model (e.g., Claude Opus, o1)

**Why this matters:** In practice, 60–70% of everyday queries are simple or medium complexity. Routing them to cheaper models could cut API costs by 40–60% with little to no quality loss. There's even an open-source framework for this, [RouteLLM by Berkeley](https://github.com/lm-sys/RouteLLM), that validates this approach.

**Suggested implementation:**

1. A local routing layer that classifies each query before it's sent out
2. Three configurable tiers (Simple / Medium / Complex), each mapped to a user-selected model
3. An optional override: users can manually force a specific model for a request
4. A routing log or indicator showing which model was used and why

This would be especially valuable for power users who send a high volume of mixed queries daily. It turns the app into a cost-aware assistant, not just a model wrapper.

Would love to hear your thoughts on feasibility. Happy to elaborate or test a prototype if helpful! Thanks for building such a great tool.
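
To make the proposal concrete, here is a minimal sketch of the tiered routing. It is only an illustration under stated assumptions: `classify` is a crude length heuristic standing in for the local model, and `TIERS`, `routeQuery`, and the model identifiers are hypothetical names, not Alma APIs.

```javascript
// Hypothetical tier-to-model mapping; in the proposal each tier is user-configurable.
const TIERS = {
  simple: "claude-haiku",
  medium: "claude-sonnet",
  complex: "claude-opus",
};

// Stand-in for the lightweight local classifier. Assumption: a real version
// would ask a small local model; here a word count is used purely for illustration.
function classify(query) {
  const words = query.trim().split(/\s+/).filter(Boolean).length;
  if (words <= 12) return "simple";
  if (words <= 60) return "medium";
  return "complex";
}

// Pick a model for the query. A manual override always wins, and the
// returned reason can feed a routing log or indicator.
function routeQuery(query, { override } = {}) {
  if (override) return { model: override, tier: null, reason: "manual override" };
  const tier = classify(query);
  return { model: TIERS[tier], tier, reason: `classified as ${tier}` };
}
```

Returning the reason alongside the model keeps the routing decision inspectable, which is what the suggested routing log would need.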

Koben Alex 1 day ago
Feature Request
Image generation fails with gpt-image-2: Unknown parameter 'response_format'
Description

Image generation fails when using the openai/gpt-image-2 model with the following error:

Image generation failed: Unknown parameter: 'response_format'

The error occurs because the image generation code passes a response_format parameter to the OpenAI API that is not supported by the gpt-image-2 model.

Steps to Reproduce

1. Set image generation model to openai/gpt-image-2
2. Send a message that triggers image generation (e.g. "generate an image of ...")
3. Image generation fails with the error above

Expected Behavior

Image generation should work with gpt-image-2 without passing unsupported parameters.

Root Cause

The gpt-image-2 model (newer OpenAI image API) does not accept the response_format parameter. This parameter was used by older models like dall-e-3 (values: url or b64_json), but the newer model API schema differs. The image generation implementation likely hardcodes or defaults response_format in the API request body, causing the request to be rejected.

Suggested Fix

- Omit response_format when the selected model is gpt-image-2 (or any model that doesn't support it)
- Alternatively, detect the model family and only include supported parameters

Environment

- Alma version: v0.0.810
- Model: openai/gpt-image-2
- Platform: macOS (ARM64)
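
The suggested fix could look roughly like the sketch below. It assumes an allowlist approach; `RESPONSE_FORMAT_MODELS`, `supportsResponseFormat`, and `buildImageRequest` are hypothetical names, and the allowlist contains only dall-e-3 because that is the only model the report names as accepting the parameter.

```javascript
// Assumption: only models listed here accept `response_format`.
// The report names dall-e-3; extend the set as other models are confirmed.
const RESPONSE_FORMAT_MODELS = new Set(["dall-e-3"]);

// "openai/gpt-image-2" -> "gpt-image-2"
function bareModelName(modelId) {
  return modelId.split("/").pop();
}

function supportsResponseFormat(modelId) {
  return RESPONSE_FORMAT_MODELS.has(bareModelName(modelId));
}

// Build the request body, adding `response_format` only when the model supports it.
function buildImageRequest(modelId, prompt, responseFormat = "b64_json") {
  const body = { model: bareModelName(modelId), prompt };
  if (supportsResponseFormat(modelId)) {
    body.response_format = responseFormat;
  }
  return body;
}
```

An allowlist fails safe here: an unknown model simply gets no `response_format`, instead of a rejected request.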

hh0592821 2 days ago
Feature Request
Custom provider `providerOptions` should be transparently passed to API request body
## Problem

When using a custom-type provider (e.g. ZenMux) with `zenmux/auto` model routing, the `model_routing_config` parameter is required in the `/chat/completions` request body, but Alma's `providerOptions` mechanism does not transparently pass custom parameters to the API request. After setting `providerOptions` on a model via the API (`PUT /api/providers/:id`), the parameters never reach the actual HTTP request sent to the provider.

## Root Cause (from reverse-engineering `app.asar`)

In the main process, for `custom` type providers, Alma correctly maps `providerOptions` into the AI SDK's metadata:

```js
// Custom → OpenAI mapping
const o = "custom" === n ? "openai" : n;
Et[o] = { ...Et[o] || {}, ...e }; // e = model's providerOptions
```

However, the `@ai-sdk/openai-compatible` SDK's `getArgs` filters providerOptions through its schema:

```js
Object.fromEntries(
  Object.entries(providerOptions?.[this.providerOptionsName] ?? {})
    .filter(([key]) => !Object.keys(schema).includes(key))
)
```

This suggests the parameters **should** pass through (unknown keys are kept), but in practice they don't reach the request body.

## Use Case

ZenMux's intelligent model routing requires `model_routing_config` in the request body:

```json
{
  "model": "zenmux/auto",
  "model_routing_config": {
    "available_models": ["sapiens-ai/agnes-2.0-flash", "google/gemini-3.5-flash", "deepseek/deepseek-v4-pro", "openai/gpt-5.4"],
    "preference": "balanced"
  }
}
```

Without this parameter, the API returns: `Parameter model_routing_config is required when model is zenmux/auto`.

## Suggested Solutions

- **Option A (Fix)**: Ensure `providerOptions` set on a model are transparently spread into the chat completions request body for custom-type providers
- **Option B (New feature)**: Add an `extraBody` or `bodyTemplate` field to `StoredProviderModel` that gets merged into the API request body, independent of the providerOptions mechanism
- **Option C (Workaround)**: Allow setting fixed body parameters at the provider level that are always included in requests

## Environment

- Alma version: v0.0.810
- Provider type: custom (ZenMux)
- Model: zenmux/auto
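
A rough sketch of what Option B might look like. `buildChatBody` is a hypothetical helper and `extraBody` is the field name proposed above; neither exists in Alma today as far as this report knows.

```js
// Hypothetical merge step: fold a model's `extraBody` into the outgoing
// chat completions body. `extraBody` is spread first so the fields the
// app sets itself (model, messages, ...) cannot be overridden by it.
function buildChatBody(base, extraBody = {}) {
  return { ...extraBody, ...base };
}

// Example with the ZenMux routing config from the use case above.
const exampleBody = buildChatBody(
  { model: "zenmux/auto", messages: [{ role: "user", content: "hi" }] },
  { model_routing_config: { preference: "balanced" } }
);
```

Spreading `extraBody` first is a deliberate choice in this sketch: user-supplied parameters extend the request without being able to replace core fields.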

hh0592821 3 days ago
Feature Request
Sleep and wake don't work when running a long background task
I said: "Please continue and do NOT check progress constantly." But it does not sleep and wake up correctly. It keeps running and has burned a lot of tokens.

Here is the output:

Wake up (716 ms)
Command: alma wake
Output: Alma is awake now!

Set awake emotion (1.3 s)
Command: alma emotion set-base awake 4 5 "woke up to restart model12 generation"
Output: Base emotion set: awake (energy: 4, valence: 5)

Hmm... the battery ran out and the run was cut off. Let me first check how much the last run finished:

Check what was generated before power loss (3.7 s)
Command: echo "=== detections ===" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_work -type f -name "detections.json" 2>/dev/null | wc -l && echo "=== features ===" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_raw -type f -name "*_features.json" 2>/dev/null | wc -l && echo "=== work dirs ===" && ls /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_work/ 2>/dev/null | head -10 && echo "=== raw dirs ===" && ls /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_raw/ 2>/dev/null | head -10
Output:
=== detections ===
5
=== features ===
307
=== work dirs ===
260521-got-100
bujiyeqiu001
bujiyeqiu0022
fcc-self-001
feiren-train-002
(8 more lines not shown)

5 videos are already done (307 feature samples); resuming the remaining 22:

Restart generation with resume (2.3 s)
Command: cd /Users/mongo/ml-workspace/hoopx-projects/aicourt && nohup bash -c 'PYTHONPATH=src uv run python -u scripts/generate_model8_fusion_dataset.py /Users/mongo/ml-workspace/hoopx-projects/revo-video-v47 --output-dir data/model12_fusion_raw --work-dir data/model12_fusion_work --yolo-device mps' > /tmp/model12_generate.log 2>&1 & echo "PID=$!"
Output: PID=28423

Restarted, continuing from where it left off. Sleeping for 5 minutes first:

Sleep before checking (874 ms)
Command: alma sleep
Output: Alma is now sleeping. She will be grumpy if disturbed.

Sleep 5 minutes then wake (61 s)
Command: sleep 300 && alma wake
Result: { "bash_id": "a65949ab-21ab-4a05-85b2-b2330f6c5942", "command": "sleep 300 && alma wake", "description": "[auto-promoted to background after 60s]", "background": true, "startedAt": "2026-05-26T11:49:34.057Z" }

BashOutput a65949ab-21ab-4a05-85b2-b2330f6c5942 (1.0 s)
TaskOutput a65949ab: Completed
BashOutput a65949ab-21ab-4a05-85b2-b2330f6c5942, called again and again, with durations: 803 ms, 588 ms, 526 ms, 493 ms, 398 ms, 1.1 s, 368 ms, 464 ms, 523 ms, 507 ms, 520 ms, 446 ms, 520 ms, 415 ms, 526 ms, 464 ms, 581 ms, 754 ms, 21 ms, 340 ms, 316 ms

Never mind, 5 minutes passed long ago; checking progress directly:

Check generation progress after 5 min (2.8 s)
Command: echo "=== detections ===" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_work -type f -name "detections.json" 2>/dev/null | wc -l && echo "=== features ===" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_raw -type f -name "*_features.json" 2>/dev/null | wc -l && echo "=== last 5 log lines ===" && tail -5 /tmp/model12_generate.log 2>/dev/null && echo "=== process ===" && pgrep -f "generate_model8" | head -2
Output:
=== detections ===
5
=== features ===
307
=== last 5 log lines ===
INFO Opened video fullgame-chapter01.mp4 (960x640, 85006 frames, 60.0 fps) crop=160,40+800x600
INFO YOLO warmup completed in 0.08s (device=mps, iterations=3)
INFO Rim pre-calibrated for fullgame-chapter01.mp4: source=hoop center=(0.529,0.544) diameter=0.203 samples=15 scanned=15
INFO Opened video fullgame-chapter01.mp4 (960x640, 85006 frames, 60.0 fps) crop=160,40+800x600
INFO YOLO warmup completed in 0.08s (device=mps, iterations=3)
(4 more lines not shown)

It is running the 6th video, fullgame-chapter01 (85006 frames at 60 fps, very long). Waiting 5 minutes before checking again:

Sleep 5 min then check generation (63 s)
Command: alma sleep && sleep 300 && alma wake && echo "=== detections ===" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_work -type f -name "detections.json" 2>/dev/null | wc -l && echo "=== features ===" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_raw -type f -name "*_features.json" 2>/dev/null | wc -l && echo "=== last log ===" && tail -3 /tmp/model12_generate.log 2>/dev/null
Result: { "bash_id": "0c3a17b8-1002-412b-8ca1-e09865da0375", "command": "alma sleep && sleep 300 && alma wake && echo \"=== detections ===\" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_work -type f -name \"detections.json\" 2>/dev/null | wc -l && echo \"=== features ===\" && find /Users/mongo/ml-workspace/hoopx-projects/aicourt/data/model12_fusion_raw -type f -name \"*_features.json\" 2>/dev/null | wc -l && echo \"=== last log ===\" && tail -3 /tmp/model12_generate.log 2>/dev/null", "description": "[auto-promoted to background after 60s]", "background": true, "startedAt": "2026-05-26T11:51:43.898Z" }

TaskOutput 0c3a17b8: Completed
BashOutput 0c3a17b8-1002-412b-8ca1-e09865da0375 (324 ms)
zzz... waiting 5 minutes...
BashOutput 0c3a17b8-1002-412b-8ca1-e09865da0375 (529 ms)
...
BashOutput 0c3a17b8-1002-412b-8ca1-e09865da0375 (571 ms)
...
BashOutput 0c3a17b8-1002-412b-8ca1-e09865da0375, four calls: 533 ms, 521 ms, 490 ms, 531 ms
...
BashOutput 0c3a17b8-1002-412b-8ca1-e09865da0375, seven calls: 673 ms, 353 ms, 576 ms, 575 ms, 579 ms, 526 ms, 540 ms

Forget it, checking directly:

Wake up and check generation progress (2.4 s)
Command: alma wake 2>/dev/null; echo "=== detections: $(find data/model12_fusion_work -name 'detections.json' | wc -l) ===" && echo "=== features: $(find data/model12_fusion_raw -name '*_features.json' | wc -l) ===" && echo "=== last log ===" && tail -3 /tmp/model12_generate.log && echo "=== process ===" && pgrep -f generate_model8 | head -2
Output: Alma is awake now!

Airy Linus 10 days ago
Feature Request
.gitignore: app appends duplicate `.alma-snapshots` entry without checking existing patterns
Alma unconditionally appends `.alma-snapshots` to `.gitignore` every time it runs, without checking whether a matching pattern (e.g. `**/.alma-snapshots/`) already exists in the file. This results in duplicate entries and a perpetually dirty git status.

Steps to reproduce:

1. Add `**/.alma-snapshots/` (or any glob that already covers the path) to your `.gitignore`
2. Open/run Alma
3. Alma appends `.alma-snapshots` again at the end of the file
4. `git diff` now shows a new line added

Expected: Alma should check if an existing pattern already matches `.alma-snapshots` before appending. A simple substring or line-by-line check would be sufficient.

Workaround: Remove the original pattern and let Alma manage its own `.alma-snapshots` entry.

Alma version: 0.0.790, macOS
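
A minimal sketch of the line-by-line check suggested above. `alreadyIgnored` and `ensureIgnored` are hypothetical names. The check only normalizes a leading `**/` or `/` and a trailing `/`; it is not a full gitignore glob matcher, so broader patterns that happen to cover the directory would still be missed.

```javascript
// Return true if an existing line already names the entry, allowing for
// a leading "**/" or "/" and a trailing "/". Comments and negations are skipped.
function alreadyIgnored(gitignoreText, entry = ".alma-snapshots") {
  return gitignoreText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && !line.startsWith("#") && !line.startsWith("!"))
    .some((line) => {
      const normalized = line.replace(/^(\*\*\/|\/)+/, "").replace(/\/+$/, "");
      return normalized === entry;
    });
}

// Append the entry only when it is not already covered, so repeated runs
// leave the file unchanged.
function ensureIgnored(gitignoreText, entry = ".alma-snapshots") {
  if (alreadyIgnored(gitignoreText, entry)) return gitignoreText;
  const sep = gitignoreText === "" || gitignoreText.endsWith("\n") ? "" : "\n";
  return `${gitignoreText}${sep}${entry}\n`;
}
```

Because `ensureIgnored` returns its input unchanged when the entry is covered, calling it on every launch no longer dirties the working tree.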

Royce 11 days ago
Bug Reports
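Can the DeepSeek provider use a custom base_url? Or how can I change reasoning effort on a custom provider?
I currently call the deepseek-v4- models through a custom provider as well, because I need my own proxy in the middle for conversion, statistics, and so on. However, with a custom provider there seems to be no way to change reasoning_effort, whether on /v1/chat/completions or /v1/messages. Whatever reasoning effort I select in the UI, the backend never receives the corresponding field, and editing Provider Options (JSON) doesn't seem to help either: it is not passed through. Do custom providers and their models simply not support reasoning effort? Alternatively, if the DeepSeek provider supported a custom base URL, that should also work for DeepSeek models.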

afon 11 days ago
Feature Request
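Custom provider (opencode) requests to a kimi model fail with a cache_control 400 error
Using a custom provider (type: custom, apiFormat: openai-chat) that goes through opencode (https://opencode.ai/zen/go/v1), calls to the kimi-k2.6 model fail with:

AI_APICallError: Error from provider: 2 request validation errors: Extra inputs are not permitted, field: 'messages[0].cache_control'; Extra inputs are not permitted, field: 'messages[1].cache_control'

Steps to reproduce:

1. Add a custom provider with baseURL https://opencode.ai/zen/go/v1
2. Set apiFormat to openai-chat
3. Select the kimi-k2.6 model under that provider and send a message
4. Every request fails with a 400

Root cause analysis: When building the message request, Alma adds the Anthropic-specific cache_control field (the prompt caching marker) to the message bodies for all providers, but does not filter fields by apiFormat. For providers using the openai-chat format, cache_control should be stripped, but it is currently sent as-is. The OpenCode gateway strictly validates the OpenAI format and rejects the request outright.

Similar cases (for reference):

- OpenOmniBot #301: same problem; fixed by recursively removing cache_control in buildOpenAICompatibleRequestBody()
- earendil-works/pi #3779: also goes through opencode and hits the same cache_control 400; fixed by adding a compat flag for non-Anthropic models

Environment:

- Alma version: 127
- Provider type: custom, apiFormat: openai-chat
- Target model: kimi-k2.6 (via opencode)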
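
A minimal sketch of the recursive stripping approach mentioned in the similar cases. `stripCacheControl` and `prepareBody` are hypothetical names, and the sketch assumes `openai-chat` (the apiFormat value from this report) is the only format that needs the field removed.

```javascript
// Recursively remove every `cache_control` key from a request body.
// Returns a new structure; the input is not mutated.
function stripCacheControl(value) {
  if (Array.isArray(value)) return value.map(stripCacheControl);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, child] of Object.entries(value)) {
      if (key !== "cache_control") out[key] = stripCacheControl(child);
    }
    return out;
  }
  return value;
}

// Assumption: only the "openai-chat" apiFormat needs the field removed.
function prepareBody(body, apiFormat) {
  return apiFormat === "openai-chat" ? stripCacheControl(body) : body;
}
```

Recursing through arrays and objects also covers markers attached to individual content parts, not just to top-level messages.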

钱军 14 days ago
Bug Reports
Bug: Search frequently fails with “Object has been destroyed” in Alma desktop
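Symptom: the search UI frequently reports "Object has been destroyed".
Verification: the alma activity search CLI works normally, so the search backend itself is not broken.
Assessment: this looks more like an Electron/WebContents lifecycle problem.
Likely call chain: by the time the asynchronous search returns, the original webContents has already been destroyed.
Suggested fix: after webContents.fromId, add a `!wc || wc.isDestroyed()` guard.
Frontend mitigation: debounce, abort stale requests, ignore stale responses.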
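
A minimal sketch of the guard and of the "ignore stale response" mitigation, under stated assumptions: `WebContentsLike` is a structural stand-in for the two Electron `WebContents` members used here, the lookup function stands in for `webContents.fromId`, and `sendSearchResult`, `LatestOnly`, and the channel name are hypothetical.

```typescript
// Structural stand-in for the parts of Electron's WebContents used below.
interface WebContentsLike {
  isDestroyed(): boolean;
  send(channel: string, ...args: unknown[]): void;
}

// Stand-in for webContents.fromId, which may return undefined.
type WebContentsLookup = (id: number) => WebContentsLike | undefined;

// Main-process side: skip delivery when the target no longer exists.
function sendSearchResult(
  lookup: WebContentsLookup,
  id: number,
  channel: string,
  payload: unknown,
): boolean {
  const wc = lookup(id);
  if (!wc || wc.isDestroyed()) {
    return false; // window was closed or reloaded while the search was in flight
  }
  wc.send(channel, payload);
  return true;
}

// Renderer side: only the most recently started query may update the UI.
class LatestOnly {
  private current = 0;
  begin(): number {
    this.current += 1;
    return this.current;
  }
  isStale(token: number): boolean {
    return token !== this.current;
  }
}

// Demo with fake objects instead of real Electron windows.
const delivered: unknown[] = [];
const aliveContents: WebContentsLike = {
  isDestroyed: () => false,
  send: (_channel, ...args) => {
    delivered.push(...args);
  },
};
const destroyedContents: WebContentsLike = {
  isDestroyed: () => true,
  send: () => {
    throw new Error("Object has been destroyed");
  },
};
const registry = new Map<number, WebContentsLike>([
  [1, aliveContents],
  [2, destroyedContents],
]);
const lookupById: WebContentsLookup = (id) => registry.get(id);

console.log(sendSearchResult(lookupById, 1, "search:result", "hit"));
console.log(sendSearchResult(lookupById, 2, "search:result", "hit"));
```

Returning a boolean instead of throwing lets the caller decide whether a dropped result is worth logging.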

Desheng Li 15 days ago
Feature Request
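In Group Privacy mode, @ mentions are incorrectly routed to a sub-thread instead of being answered in the group
Environment: Alma 0.0.798 / Windows 10 / Telegram Bot

Steps to reproduce:
1. In a Telegram group chat, turn on Alma's Group Privacy switch.
2. @ mention the bot in the group (an @ mention, not a reply to one of the bot's messages).
3. Observe where the bot responds.

Expected behavior:
- @ mention → the bot should respond publicly in the group.
- Reply to one of the bot's messages → should be routed to the private sub-thread on the desktop app.

Actual behavior:
- Both @ mentions and replies to the bot's messages are routed to the forked sub-thread on the desktop app.
- The bot's response to the @ mention never appears in the group, as if the bot were playing dead.

Impact: The core use case of Group Privacy is broken. Users expect an @ mention to get a public response in the group, but the current implementation leaves group interaction completely silent.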
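
The expected behavior above can be written down as a small routing function. This is a sketch, not Alma's implementation: the type and function names are hypothetical, and three cases the report does not cover are filled in by assumption (messages that neither mention nor reply to the bot are ignored, everything is answered in the group when Group Privacy is off, and an @ mention wins when a message is both a mention and a reply).

```typescript
type ReplyDestination = "group" | "private-thread" | "ignore";

interface IncomingGroupMessage {
  mentionsBot: boolean; // the message contains an @ mention of the bot
  isReplyToBot: boolean; // the message is a reply to one of the bot's messages
}

function routeGroupMessage(
  msg: IncomingGroupMessage,
  groupPrivacyEnabled: boolean,
): ReplyDestination {
  if (!msg.mentionsBot && !msg.isReplyToBot) {
    return "ignore"; // assumption: the bot only reacts when addressed
  }
  if (!groupPrivacyEnabled) {
    return "group"; // assumption: without Group Privacy everything stays public
  }
  if (msg.mentionsBot) {
    return "group"; // expected: @ mention gets a public response
  }
  return "private-thread"; // expected: reply to the bot goes to the desktop sub-thread
}

console.log(routeGroupMessage({ mentionsBot: true, isReplyToBot: false }, true));
console.log(routeGroupMessage({ mentionsBot: false, isReplyToBot: true }, true));
```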

moebear 16 days ago
Bug Reports
Compact loop on Codex-backed threads: 3 chained bugs in compactThread / persistCompactionResult / KE, plus 1 Codex plugin UX paper cut
TL;DR

Diagnosed three chained bugs in Alma 0.0.798 that together cause "compact does nothing": token usage never drops, and the same N messages get summarized every round. The chain only fully manifests on threads using the OpenAI Codex Auth plugin (openai-codex-auth), but two of the three bugs are pure Alma main-process code and would bite any plugin that emits messages with empty inner ids. Also reporting one UX paper cut on the Codex plugin itself.

I have a local patch running that fixes all three end-to-end, so the diagnosis is grounded. I'm filing this as a diagnostic report rather than a PR, since I'm reading minified out/main/index.js.

**Bug 1: compactThread summarizes the same messages every round (Alma main)**

Where: compactThread handler (POST /api/threads/:id/compact). It calls getMessagesByThreadId(threadId) and feeds the result straight into the KE summarizer.

Problem: the result is not filtered by metadata.isCompacted. So on every compaction round, messages that were already summarized in a previous round get summarized again. The indicator keeps saying "127 earlier messages were summarized" even though the token count never drops.

Repro (SQL against chat_threads.db):

    -- For any thread compacted multiple times:
    SELECT substr(id, -16) AS indicator_id,
           datetime(substr(created_at, 1, 19)) AS created,
           json_extract(metadata, '$.compactedMessageCount') AS claimed
    FROM chat_messages
    WHERE thread_id = ' '
      AND json_extract(metadata, '$.isCompactionIndicator') = 1
    ORDER BY created_at;

In one real thread I have, four consecutive rounds all claim compactedMessageCount = 127 with overlapping compactedMessageIds. Classic loop.

Suggested source-level fix: after getMessagesByThreadId, filter out rows where metadata.isCompacted === true (and metadata.isCompactionIndicator === true) before passing to KE.

Responsibility: 100% Alma. Provider-independent.
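
The Bug 1 filter could be sketched as follows. The row shape is an assumption inferred from the metadata fields named above, and `selectCompactionCandidates` is a hypothetical name, not a function in the Alma bundle.

```typescript
// Assumed shape of a stored message row; only the fields used here.
interface StoredMessageRow {
  id: string;
  metadata?: {
    isCompacted?: boolean;
    isCompactionIndicator?: boolean;
  };
}

// Keep only rows that have not been summarized yet and are not indicators.
function selectCompactionCandidates(rows: StoredMessageRow[]): StoredMessageRow[] {
  return rows.filter(
    (row) =>
      row.metadata?.isCompacted !== true &&
      row.metadata?.isCompactionIndicator !== true,
  );
}

const sampleRows: StoredMessageRow[] = [
  { id: "m1", metadata: { isCompacted: true } }, // summarized in an earlier round
  { id: "m2", metadata: { isCompactionIndicator: true } }, // the indicator message itself
  { id: "m3", metadata: {} },
  { id: "m4" }, // rows without metadata are still candidates
];

console.log(selectCompactionCandidates(sampleRows).map((row) => row.id));
```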
**Bug 2: persistCompactionResult uses id-intersection, but the Codex plugin emits subagent messages with id: "" (Alma + Codex plugin, joint responsibility)**

Where: persistCompactionResult(threadId, result, messages), which currently does roughly:

    const o = new Set(result.compactedMessageIds);
    for (const m of messages) {
      const stripped = m.id.replace(`${threadId}--`, "");
      if (o.has(stripped)) {
        updateMessageMetadata(m.id, { isCompacted: true });
      }
    }

Problem: result.compactedMessageIds is built from e.slice(0, keepRecent).map(m => m.id) inside KE, where m is the inner message JSON, not the DB row. For subagent / tool-call rows coming from the Codex plugin, the inner message.id is an empty string. The Set ends up with "" entries that never match any actual DB id, so those messages never get marked isCompacted: true and they re-enter the compaction pool on every subsequent round.

Empty inner ids are essentially Codex-plugin-specific, not universal. Cross-tabulation against my local DB:

| Model | Total messages | Empty inner id | % |
| --- | --- | --- | --- |
| plugin:openai-codex-auth:openai-codex:gpt-5.5-xhigh | 1409 | 85 | 6.0% |
| plugin:openai-codex-auth:openai-codex:gpt-5.5-low | 4 | 0 | 0% |
| deepseek:deepseek-v4-pro | 104 | 0 | 0% |
| deepseek:deepseek-v4-flash | 59 | 1 | 1.7% |
| claude-subscription:claude-sonnet-4-6 | 75 | 0 | 0% |
| claude-subscription:claude-opus-4-7 | 63 | 0 | 0% |
| claude-subscription:claude-haiku-4-5-20251001 | 4 | 0 | 0% |

Claude paths: zero empty ids. DeepSeek paths: a single empty id across 163 messages. OpenAI Codex Auth heavy use: ~6%. Each of those 85 rows never gets marked isCompacted and re-enters the pool on every compact round.
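
To make the mismatch concrete, here is a small simulation. It is a sketch under stated assumptions: the row shape (`id`, `innerId`, `isCompacted`), the thread id "t1", and both function names are invented for illustration, and DB ids are assumed to have the form `${threadId}--${innerId}` for messages that do carry an inner id. It contrasts the id-intersection approach with marking by array position.

```typescript
interface SimulatedRow {
  id: string; // DB row id, assumed to look like `${threadId}--<suffix>`
  innerId: string; // id inside the stored message JSON; "" for the problematic rows
  isCompacted: boolean;
}

// Mirrors the id-intersection logic quoted above.
function markByIdIntersection(
  threadId: string,
  rows: SimulatedRow[],
  compactedIds: string[],
): number {
  const wanted = new Set(compactedIds);
  let marked = 0;
  for (const row of rows) {
    const stripped = row.id.replace(`${threadId}--`, "");
    if (wanted.has(stripped)) {
      row.isCompacted = true;
      marked += 1;
    }
  }
  return marked;
}

// Positional alternative: mark the first `compactedCount` rows.
function markByPosition(rows: SimulatedRow[], compactedCount: number): number {
  const head = rows.slice(0, compactedCount);
  for (const row of head) {
    row.isCompacted = true;
  }
  return head.length;
}

function makeRows(): SimulatedRow[] {
  return [
    { id: "t1--a", innerId: "a", isCompacted: false },
    { id: "t1--b", innerId: "", isCompacted: false }, // subagent row with empty inner id
    { id: "t1--c", innerId: "c", isCompacted: false },
  ];
}

const rowsForIds = makeRows();
const idsFromSummarizer = rowsForIds.map((row) => row.innerId); // ["a", "", "c"]
const markedById = markByIdIntersection("t1", rowsForIds, idsFromSummarizer);

const rowsForPosition = makeRows();
const markedByPosition = markByPosition(rowsForPosition, 3);

console.log({ markedById, markedByPosition });
```

In this setup the row with the empty inner id is skipped by the intersection and would be summarized again next round, while the positional variant marks all three rows.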
Repro:

    -- Confirm the % empty-id by model
    SELECT t.model AS model,
           COUNT(*) AS total_msgs,
           SUM(CASE WHEN json_extract(m.message, '$.id') = '' THEN 1 ELSE 0 END) AS empty_inner_id
    FROM chat_messages m
    JOIN chat_threads t ON m.thread_id = t.id
    GROUP BY t.model
    ORDER BY total_msgs DESC;

    -- Verify claimed vs actually-written ids for a compaction indicator
    SELECT json_extract(metadata, '$.compactedMessageCount') AS claimed,
           json_array_length(json_extract(metadata, '$.compactedMessageIds')) AS ids_len,
           json_extract(metadata, '$.compactedMessageIds[0]') AS sample_id
    FROM chat_messages
    WHERE thread_id = ' '
      AND json_extract(metadata, '$.isCompactionIndicator') = 1
    ORDER BY created_at DESC
    LIMIT 5;

On unpatched Alma I see claimed=127, ids_len=114, and a meaningful fraction of those 114 entries are the empty string.

Suggested fixes (two-sided):
- Alma main (defensive, fixes the bug regardless of plugin behavior): drop the id-intersection. The contract is already "KE summarizes the first compactedMessageCount entries of its input array", so persistCompactionResult should mark by array position: take the first result.compactedMessageCount rows of messages and set isCompacted: true on each. Deterministic and immune to empty/missing inner ids.
- Codex plugin (openai-codex-auth, root cause): subagent / tool-call message JSON should carry a non-empty id. Other plugins (Claude subscription, DeepSeek) already do this. Worth checking why Codex-emitted subagent rows are landing with id: "" in the stored message JSON.

Responsibility: the primary cause is the Codex plugin emitting empty ids; Alma's persistCompactionResult should be defensive enough not to rely on them.

**Bug 3: KE summarizer doesn't recognize prior summary/indicator messages (Alma main)**

Where: the KE function. It slices input by position (e.slice(0, a)) and doesn't inspect text or metadata.
Problem: in the auto-compact and token-limit-triggered paths, callers pass bare message JSON (the inner parts object), and DB metadata (isCompactionIndicator, isCompactionSummary) is gone by the time it reaches KE. So previously generated summary blobs and the canonical indicator message can be re-fed into a new summary, producing nested "summary of summaries" with token bloat each round.

Suggested source-level fix: at the entry of KE, defensively filter out messages whose text starts with the canonical indicator prefix:

    ---\n\n🗜️ **Context Compacted**

These are formats KE itself produces, so strict-prefix matching has effectively zero false-positive risk. Also pass through metadata.isCompactionSummary / isCompactionIndicator / isCompacted when the caller has them, and skip those too.

Responsibility: 100% Alma.

**Bug 4: Codex plugin needs a manual model refresh on every Alma startup (Codex plugin)**

Where: Codex plugin (openai-codex-auth / acp-extension-codex) on Alma startup.

Problem: after a clean Alma launch, the Codex provider's models exist in settings, but Codex-backed threads behave as if the provider isn't fully initialized: instructions / tool-call behavior differ from how the same thread behaves later in the session. The fix every single time is: open Settings → Providers → Codex → click the model refresh button. After that one click, everything works for the rest of the session.

Current workaround: I literally do that refresh after every Alma start. There is no error message; it just silently misbehaves until refreshed.

Suspected cause (best guess from reading the bundle, would love confirmation): some piece of state (tool capability metadata? originalInstructions? a hasTools flag?) is computed lazily at provider-list refresh time, and the startup boot path doesn't reach it. The manual refresh click is what triggers the metadata fetch + capability registration the agent path expects.
A defensive consumer-side fix would be something like:

    const shouldInjectCodexAgentInstructions = hasTools || !originalInstructions;

…so background paths still get correct Codex agent instructions even when per-provider lazy metadata hasn't populated yet. But the real fix is probably "always trigger the provider model refresh as part of plugin init on startup".

Repro: Cmd+Q Alma → reopen → start a Codex-backed thread → observe behavior; then Settings → refresh Codex models → return to the same thread → observe corrected behavior.

Responsibility: Codex plugin.

Bugs 1+2+3 form a chain. Each is annoying alone but explainable, and together they produce the "compact does nothing" symptom Codex users hit (token count never drops, same N messages summarized every time). Bug 2 should rarely, if ever, manifest on Claude / DeepSeek threads but is brutal on Codex Auth threads (which I suspect is a large fraction of heavy users). Bug 4 is unrelated to compaction but affects the same plugin and has been my daily paper cut for weeks.

Environment:
- Alma 0.0.798
- macOS (Apple Silicon)
- Plugin: openai-codex-auth + acp-extension-codex
- Reproduced across multiple long-running Codex-backed threads (one of mine accumulated 18 compaction indicators with 0 isCompacted rows actually marked, before patching)

Happy to share redacted message-id samples, full SQL probes, or notes on the local patch I'm running (it's a binary patch on the minified bundle, so not useful as a PR, but the diff is small if it helps locate the exact functions). Thanks yetone for building Alma!
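
For reference, the entry filter suggested under Bug 3 could be sketched like this. It rests on assumptions: the inner message is modeled with a single `text` field and optional metadata flags, which may not match the real shape KE receives, and `dropPriorCompactionOutput` is a hypothetical name.

```typescript
// Canonical indicator prefix quoted in the report (with real newlines).
const INDICATOR_PREFIX = "---\n\n🗜️ **Context Compacted**";

// Assumed, simplified shape of what reaches the summarizer.
interface SummarizerInput {
  text: string;
  metadata?: {
    isCompacted?: boolean;
    isCompactionIndicator?: boolean;
    isCompactionSummary?: boolean;
  };
}

// Drop anything an earlier compaction round produced or already consumed.
function dropPriorCompactionOutput(messages: SummarizerInput[]): SummarizerInput[] {
  return messages.filter((message) => {
    if (message.text.startsWith(INDICATOR_PREFIX)) return false; // metadata may be missing, so check text
    const meta = message.metadata;
    if (meta?.isCompacted || meta?.isCompactionIndicator || meta?.isCompactionSummary) {
      return false;
    }
    return true;
  });
}

const summarizerInput: SummarizerInput[] = [
  { text: INDICATOR_PREFIX + "\n\n127 earlier messages were summarized" },
  { text: "Earlier summary text", metadata: { isCompactionSummary: true } },
  { text: "Please refactor the parser" },
];

console.log(dropPriorCompactionOutput(summarizerInput).map((message) => message.text));
```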

Barney Hong 17 days ago
Bug Reports
Crew Task sometimes returns previous tool result as completed output
Summary: Crew / Task agent runs sometimes finish with status completed, but the returned result is not the agent's final summary. Instead it can contain a raw previous tool result such as: [Previous tool result; call_id=call_...]: {...}

Impact:
- The parent agent may trust a completed Task even though the deliverable is missing or incomplete.
- Tool-result leakage pollutes conversation history, mission outputs, task result storage, and memory archives.
- Coding tasks may leave partial edits without a clear failure signal.

Recent observed cases on 2026-05-18:
1. Developer agent completed but returned a message saying implementation was interrupted midway and listed the remaining work.
2. Developer agent completed but the result was only a raw previous tool result from an Edit call: [Previous tool result; call_id=call_yZqXLVu0CqENETlYMRVOHUac]: {"file_path":"src/mock-runtime.mjs","replacements":1,"changed":true,...}

Historical evidence: Running `alma memory grep "Previous tool result"` found similar leakage in SketchUp-related threads from 2026-04-29, 2026-04-30, 2026-05-11, and 2026-05-18. Mission output files also contained raw previous tool results, e.g.: ~/.config/alma/missions/*/sprints/*/attempt-*/generator-output.md

Suspected root cause: The issue appears to be in the crew / harness / agent adapter final-output extraction. If the sub-agent performs a tool call and then does not emit or expose a normal final assistant summary, the wrapper may fall back to the latest transcript item, which can be a tool result. The Task is then marked completed because the process ended cleanly, even though the handoff output is invalid.

Expected behavior:
- Raw tool results should never be used as the final Task result.
- If the final output is missing/malformed or matches `[Previous tool result`, `call_id=`, raw JSON tool output, etc., mark the Task as invalid/failed/needs retry, not completed.
- If an agent explicitly says implementation is incomplete, do not treat the Task as successfully completed and accepted.

Suggested fix:
1. Validate the final Task result before marking it completed.
2. Prefer the last assistant-authored textual message over the last transcript item.
3. Add an `agent_output_invalid` or `incomplete` status.
4. Retry or ask the sub-agent to summarize actual changes when the final output is invalid.
5. For coding agents, require a final summary with changed files and validation results, or an explicit failure/incomplete status.

Temporary workaround: The parent agent will manually inspect files/tests after every crew completion and treat `[Previous tool result]` as invalid.
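
Points 1 to 3 of the suggested fix could be sketched as below. This is an illustration under assumptions, not the harness's actual code: the transcript item shape, the role names, and `finalizeTask` are hypothetical; only the `agent_output_invalid` status and the two marker strings come from the report.

```typescript
type TranscriptRole = "user" | "assistant" | "tool";

interface TranscriptItem {
  role: TranscriptRole;
  text: string;
}

type TaskOutcome =
  | { status: "completed"; result: string }
  | { status: "agent_output_invalid"; result: null };

// Heuristic from the report: these markers indicate leaked tool output.
function looksLikeToolResult(text: string): boolean {
  const trimmed = text.trim();
  return trimmed.startsWith("[Previous tool result") || trimmed.includes("call_id=");
}

function finalizeTask(transcript: TranscriptItem[]): TaskOutcome {
  // Prefer the last assistant-authored text, not simply the last transcript item.
  const lastAssistant = [...transcript]
    .reverse()
    .find((item) => item.role === "assistant" && item.text.trim() !== "");
  if (!lastAssistant || looksLikeToolResult(lastAssistant.text)) {
    return { status: "agent_output_invalid", result: null };
  }
  return { status: "completed", result: lastAssistant.text };
}

const leakedRun: TranscriptItem[] = [
  { role: "user", text: "Implement the mock runtime" },
  { role: "tool", text: '[Previous tool result; call_id=call_abc]: {"changed":true}' },
];

const healthyRun: TranscriptItem[] = [
  { role: "user", text: "Implement the mock runtime" },
  { role: "tool", text: '{"changed":true}' },
  { role: "assistant", text: "Changed src/mock-runtime.mjs; tests pass." },
];

console.log(finalizeTask(leakedRun).status);
console.log(finalizeTask(healthyRun).status);
```

A caller could retry or ask the sub-agent for a summary whenever the outcome is `agent_output_invalid`. Detecting an agent that explicitly reports incomplete work would need a separate check that this sketch does not attempt.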

linqi zhang 18 days ago
Bug Reports