Compare commits

7 Commits

14 changed files with 1085 additions and 12 deletions

6
.gitignore vendored
View File

@@ -11,6 +11,8 @@ __pycache__/
ai/inbox/mattermost-latest.md
ai/inbox/mattermost-*.md
ai/inbox/mattermost-status.json
ai/inbox/mattermost-mirror/*
!ai/inbox/mattermost-mirror/.gitkeep
ai/inbox/photos/*
!ai/inbox/photos/.gitkeep
@@ -23,6 +25,7 @@ scripts/mattermost/.env
scripts/mattermost/.venv/
scripts/mattermost/generated/*
!scripts/mattermost/generated/.gitkeep
scripts/mattermost-proxy/.env
# Obsidian local runtime state
/.obsidian/
@@ -35,3 +38,6 @@ project-knowledge/.obsidian/plugins/
project-knowledge/.obsidian/snippets/
project-knowledge/.obsidian/cache/
.trash/
# Antigravity CLI local workspace configuration
.antigravitycli/

View File

@@ -61,6 +61,9 @@ Use this structure by default:
- For VS Code multi-root Copilot workflows, preserve repo-provided customizations such as `.github/prompts`, `.github/instructions`, `.github/agents`, `.github/skills`, and `AGENTS.md`. Shared `fidelity-ai-copilot` customizations should supplement these repo files, while repo-specific instructions should be treated as the practical authority when they conflict.
- For Fidelity Jira/Confluence access from GitHub Copilot CLI or VS Code, do not assume the approved access method. First have the target AI read the current Fidelity-provided human instructions from Confluence or local exported docs, then configure the smallest matching workflow. If those instructions require terminal `curl` with environment variables such as `COPILOT_JIRA_URL` and `COPILOT_JIRA_TOKEN`, enforce that path; otherwise follow the documented Fidelity-approved method. Never print, persist, or hardcode tokens.
- Treat `fidelity-ai-copilot` as a self-improving AI harness rather than a static prompt dump: the target AI should notice recurring useful workflows, newly discovered internal instructions, and tool changes, then propose small auditable updates to instructions, skills, prompts, agents, specs, or validation checklists. It should ask before making broad changes and keep product repos clean.
- For corporate-tool captures in `fidelity-ai-copilot`, prefer a single raw Charles Mirror source such as `archive/charles-mirror/` and treat it as read-only evidence organized by hostname. Generated Copilot outputs should be written to separate per-platform folders only when useful, with prompts requiring source inspection, narrow scope, local-only processing, and explicit evidence paths.
- When advising on `fidelity-ai-copilot` customization, use this routing: keep global safety and repo role in `AGENTS.md` / `.github/copilot-instructions.md`; use `.github/instructions/*.instructions.md` for path-scoped rules such as `archive/**`; use `.github/prompts/*.prompt.md` for repeatable slash-command tasks; use `.github/agents/*.agent.md` for persistent personas with tool restrictions and handoffs; use `.github/skills/*/SKILL.md` for reusable multi-step capabilities with scripts, examples, or resources. Prefer small, composable artifacts over one large instruction file.
- For read-only evidence prompts such as Discourse/Charles Mirror search, explicitly prevent the target AI from editing the prompt/configuration files while running the workflow. If Copilot changes `.github/prompts/*.prompt.md` during an evidence query, treat that as a workflow bug unless the user specifically asked to update the prompt.
- When the user says they will handle dependency alignment, registry configuration, or compile/test execution manually on the development machine, generated Copilot follow-ups should not ask Copilot to solve those dependency/tooling issues or run broad builds. Instead, ask Copilot for the smallest source-level fix for the specific compiler error the user provides, state that the user will rerun validation manually, and request a concise summary of changed files and expected validation impact.
---

View File

@@ -0,0 +1,104 @@
# Charles Session File Format (.chlsx)
Reference for AI agents that need to parse Charles Proxy session files.
---
## File Format
`.chlsx` is a **ZIP archive** containing numbered XML files, each representing one HTTP request/response pair.
### Structure inside the ZIP
```
session.chlsx
├── 00001.xml
├── 00002.xml
├── 00003.xml
├── ...
└── 00/
    ├── 00001.xml
    ├── 00002.xml
    └── ...
Files may be flat at the root or grouped in two-digit subdirectories (`00/`, `01/`, etc.) depending on session size.
### XML Structure Per File
Each XML file contains:
- **Request**: method, URL, protocol, headers, body
- **Response**: status, protocol, headers, body
- **Timing**: start time, duration
Key XML elements:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<session>
  <request>
    <method>GET</method>
    <url>https://discourse.example.com/t/123.json</url>
    <protocol>HTTP/1.1</protocol>
    <header name="Accept">application/json</header>
    <header name="Cookie">_t=abc123</header>
    <body></body>
  </request>
  <response>
    <status>200</status>
    <protocol>HTTP/1.1</protocol>
    <header name="Content-Type">application/json; charset=utf-8</header>
    <body>{"id": 123, "title": "...", "post_stream": {...}}</body>
  </response>
  <timing>
    <start>2026-01-15T10:30:00.000Z</start>
    <duration>450</duration>
  </timing>
</session>
```
### .chls vs .chlsx vs .chlsj
| Extension | Format | Notes |
|---|---|---|
| `.chls` | Binary | Legacy format, harder to parse |
| `.chlsx` | ZIP + XML | **Prefer this**. Most common modern format |
| `.chlsj` | JSON | Newer, less common; each session is one JSON file with an array of request/response objects |
**Recommendation**: Configure Charles to save as `.chlsx` (File → Save Session As... → choose `.chlsx`).
---
## Discourse API Endpoints to Look For
These are the endpoints worth extracting from a Charles session:
| Purpose | URL pattern | Parsing target |
|---|---|---|
| Topic feed | `/latest.json` | `topic_list.topics[]` |
| Category topics | `/c/{slug}.json` | `topic_list.topics[]` |
| Single topic | `/t/{id}.json` | The full topic with posts |
| Posts in topic | `/t/{id}/{page}.json` | Paginated posts |
| Search | `/search.json?q=...` | `topics[]`, `posts[]` |
| User activity | `/u/{username}/activity.json` | User posts/topics |
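The URL patterns in the table can be turned into a small classifier. The following is a minimal sketch, not a definitive matcher: the label names (`latest`, `topic_page`, and so on) and the function name `classify_discourse_url` are hypothetical, and the regular expressions assume numeric topic IDs and page numbers.

```python
from __future__ import annotations

import re
from urllib.parse import urlparse

# Hypothetical labels; each regex follows one URL pattern from the table above.
# Assumption: topic IDs and page numbers are numeric.
ENDPOINT_PATTERNS = [
    ("latest", re.compile(r"^/latest\.json$")),
    ("category", re.compile(r"^/c/.+\.json$")),
    ("topic_page", re.compile(r"^/t/(?P<id>\d+)/(?P<page>\d+)\.json$")),
    ("topic", re.compile(r"^/t/(?P<id>\d+)\.json$")),
    ("search", re.compile(r"^/search\.json$")),
    ("user_activity", re.compile(r"^/u/[^/]+/activity\.json$")),
]


def classify_discourse_url(url: str) -> str | None:
    """Return a label for a Discourse API URL, or None for anything else."""
    # Match on the path only, so query strings such as ?q=... are ignored.
    path = urlparse(url).path
    for label, pattern in ENDPOINT_PATTERNS:
        if pattern.match(path):
            return label
    return None
```

Matching on the parsed path rather than the full URL keeps the check independent of the forum hostname and of any query parameters.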
---
## Extraction Strategy for AI
1. **Open the `.chlsx` as a ZIP** (it is not encrypted)
2. **Iterate over all XML files** inside
3. For each XML, check if the request URL matches a Discourse API endpoint
4. Extract the JSON response body from `<response><body>`
5. Parse the JSON and convert to Markdown
6. Organize by topic ID + title for easy search
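Steps 1 to 4 can be sketched with the Python standard library. This is an illustration under the assumptions stated in this document (a ZIP archive of XML files, each with `request/url` and `response/body` elements); verify the layout against a real export before relying on it. The function name `iter_json_responses` and the `url_marker` parameter are hypothetical.

```python
from __future__ import annotations

import json
import xml.etree.ElementTree as ET
import zipfile


def iter_json_responses(chlsx_path: str, url_marker: str = ".json"):
    """Yield (url, parsed_json) for entries whose response body parses as JSON.

    Assumes the archive layout described in this document; entries that are
    not XML, not well-formed, or not JSON responses are skipped silently.
    """
    with zipfile.ZipFile(chlsx_path) as archive:
        # namelist() includes files in subdirectories such as 00/00001.xml.
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue
            try:
                root = ET.fromstring(archive.read(name))
            except ET.ParseError:
                continue
            url = root.findtext("request/url") or ""
            body = root.findtext("response/body") or ""
            if url_marker not in url or not body.strip():
                continue
            try:
                yield url, json.loads(body)
            except json.JSONDecodeError:
                # Binary or non-JSON bodies (images, JS bundles) land here.
                continue
```

The generator only reads from the archive, so the original session file is never modified.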
---
## Common Pitfalls
- Some responses are paginated (`/t/{id}.json?page=1`). Collect all pages for completeness.
- Binary responses (images, JS bundles) should be skipped.
- The same topic may appear multiple times in different Charles sessions; deduplicate by topic ID + last updated timestamp.
- Session cookies captured in Charles will be expired by the time the AI reads them; only the response data matters.
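The deduplication pitfall can be handled with a single pass over the collected topic records. The sketch below is one possible approach, assuming each record is a dict with an `id` and a `last_posted_at` value in a consistent ISO 8601 UTC format, so that plain string comparison orders timestamps correctly. The function name `dedupe_topics` is hypothetical.

```python
def dedupe_topics(topics):
    """Keep one record per topic id, preferring the newest `last_posted_at`.

    Assumption: timestamps are ISO 8601 strings in one consistent UTC format,
    so lexicographic comparison matches chronological order.
    """
    newest = {}
    for topic in topics:
        topic_id = topic.get("id")
        if topic_id is None:
            continue  # Records without an id cannot be deduplicated.
        stamp = topic.get("last_posted_at") or ""
        current = newest.get(topic_id)
        if current is None or stamp > (current.get("last_posted_at") or ""):
            newest[topic_id] = topic
    return list(newest.values())
```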

View File

@@ -0,0 +1,140 @@
---
type: copilot-prompt
status: ready
target: github-copilot
purpose: Parse Charles .chlsx sessions to create a searchable Discourse archive
---
# Copilot Prompt — Charles Discourse Archiver
Paste this into GitHub Copilot on the corporate device.
---
## Prompt
You are helping me build a local searchable archive of a Discourse forum from captured Charles Proxy session files.
### Background
I browse a Discourse forum in my browser while Charles Proxy records traffic. I save the session as a `.chlsx` file. Inside that file are all the HTTP request/response pairs for the pages I visited — including Discourse API calls that return structured JSON (topics, posts, categories, user profiles).
I need you to extract only the Discourse content and organize it into a Markdown archive that:
- Is searchable by an AI in future sessions
- Preserves topic titles, post authors, dates, and content
- Groups by category
- Deduplicates topics that appear across multiple sessions
### File format: `.chlsx`
`.chlsx` is a ZIP archive. Inside are numbered XML files (e.g. `00001.xml`, `00/00001.xml`). Each XML file represents one HTTP request/response pair with this structure:
```xml
<session>
  <request>
    <method>GET</method>
    <url>https://forum.example.com/t/123.json</url>
    <protocol>HTTP/1.1</protocol>
    <header name="Cookie">...</header>
    <body></body>
  </request>
  <response>
    <status>200</status>
    <protocol>HTTP/1.1</protocol>
    <header name="Content-Type">application/json; charset=utf-8</header>
    <body>{"id": 123, "title": "Some Topic", "post_stream": {...}}</body>
  </response>
  <timing>
    <start>2026-01-15T10:30:00.000Z</start>
    <duration>450</duration>
  </timing>
</session>
```
### Discourse API endpoints to extract
| What | URL pattern | JSON fields |
|---|---|---|
| Latest topics | `/latest.json` | `topic_list.topics[].{id, title, slug, category_id, created_at, last_posted_at}` |
| Category index | `/categories.json` | `category_list.categories[].{id, name, slug}` |
| Single topic (with posts) | `/t/{id}.json` | `id, title, slug, category_id, post_stream.posts[].{username, cooked, created_at, post_number}` |
| Topic with page | `/t/{id}/{page}.json` | Same as above, paginated |
| User activity | `/u/{username}/activity.json` | `user_actions[]` |
| Search results | `/search.json?q=...` | `topics[]`, `posts[]` |
### What to do
1. **Open the `.chlsx` file** as a ZIP archive.
2. **List all XML files** inside (both flat and in subdirectories).
3. **For each XML file**, parse it and check if the request URL matches one of the Discourse endpoints above.
4. **Skip**: CSS, JS, images, font files, analytics, CDN assets, and any non-Discourse endpoint.
5. **Parse the JSON response body** from `<response><body>`.
6. **Create this folder structure** as output:
```
discourse-archive/
├── categories.json # All categories found
├── index.md # Master index (table of all topics with ID, title, date, category, URL)
└── topics/
    ├── 123-your-topic-slug.md
    ├── 456-another-topic.md
    └── ...
```
### Markdown format per topic
Each topic file should be a clean Markdown document with YAML frontmatter:
```markdown
---
id: 123
title: "Your Topic Title"
slug: your-topic-slug
category: "Category Name"
created: 2026-01-15
updated: 2026-01-16
url: https://forum.example.com/t/your-topic-slug/123
---
# Your Topic Title
**Category**: Category Name
---
## Post 1 — @username1 (2026-01-15T10:30:00Z)
Post content here (HTML stripped, plain Markdown preferred).
---
## Post 2 — @username2 (2026-01-16T14:00:00Z)
More content.
---
```
### Deduplication rules
- If the same topic ID appears in multiple `.chlsx` files, keep the one with the most recent `last_posted_at`.
- If a session has page 2+ of a topic (`/t/123/2.json`), merge the posts with page 1.
- Never duplicate posts within a topic.
### What to do with the output
Place the resulting `discourse-archive/` folder in a location I can reference in future Copilot sessions. I will point Copilot to that folder when I need to search past Discourse conversations.
### Constraints
- Do not modify the original `.chlsx` file.
- Do not upload or send the extracted data anywhere — keep it local.
- If a topic has no readable content (deleted, access restricted), note it in the index but skip the full extraction.
- HTML in `cooked` fields should be converted to readable plain text / Markdown (Discourse stores posts as HTML in the JSON).
### First action
Ask me for:
1. The path to the `.chlsx` file (or files)
2. The Discourse base URL (so you can construct canonical topic URLs)
3. Where I want the output folder created

View File

@@ -4,21 +4,28 @@
- Structured ticket artifacts with jira, patch, prompts, sessions
- Tolerant of pre-experiment fix steps that resolve build errors between experiments
- Swift skill, specialized for debug prints
- Xcode Integration
- Xcode logs analysis
- Find and read the full Xcode artifacts
- Extract relevant logs
- Xcode Integration
- Efficient use of Xcode commands to build and test, with context for Tuist, CocoaPods, and sample projects
- Run unit tests efficiently, for example only new or related tests
- Investigation: Differences from skills with cli commands vs mcps
- Investigation: Differences of skills vs instructions from vscode copilot
- Investigation: Differences from agents vs skills using agent, what is more general? correct relationship and use
- Auto breakpoint management
- UITest integration
- AI Investigation
- Differences between skills that use CLI commands and skills that use MCPs
- Differences between skills and instructions in VS Code Copilot
- Differences between agents and skills that use an agent: which is more general, and what is the correct relationship and use?
- Fidelity
- Charles Proxy integration
- LaunchDarkly integration
- Teams integration
- Splunk analyzer
- [x] Discourse integration
- ServiceNow access
- Photo uploader
- Start as a service
- Auto categorize by context
- Multi photos session, copy multiples images in clipboard
- [x] Multi-photo session: copy multiple images to the clipboard
- [ ] Start as a service
- [ ] Auto categorize by context
- Swiftlint integration
- Auto validator

View File

@@ -0,0 +1,38 @@
---
type: daily
project: fidelity
date: 2026-05-18
status: active
focus: [context-refresh, pdiap-12284]
work-items: [PDIAP-12284]
blockers: []
tags:
- daily
- fidelity
updated: 2026-05-18
---
# 2026-05-18
## Work Done
- Sent the daily scrum update for `PDIAP-12284 - Remove UIKit wrapping from XFlow`: continued SampleApp validation in both host modes, aligned the host-mode path with current flag behavior instead of the deprecated `enable-swift-ui` toggle, and started broader Fid4 smoke testing with temporary validation logs.
## Findings
- While refreshing context for Adam's duplicate-request question, David clarified that the April 4 follow-up in the `Production - Crypto Delinking issue` thread was from David/Jeff's side after Yuva's March 30 comment.
- That April 4 follow-up said the iOS SDK-side network requests looked correct and intentional, matched Android, and did not reproduce duplicate `open-account` API calls from the client side in non-prod.
- Therefore, prior `PDSPS-29371` context should not be summarized as a confirmed reproduction from the iOS SDK side; it is related background, but the previous investigation did not reproduce the issue from the client-side SDK path.
- Follow-up Copilot analysis suggested a plausible but unconfirmed link between REST migration and the current duplicate-page/account report: REST did not introduce a new duplicate-trigger mechanism by itself, but the REST/FTNetwork path may have changed timeout/error behavior enough to expose an existing XFlow re-trigger path under slow BPDC responses.
- Treat that REST link as a hypothesis requiring current logs, dates, versions, and timeout/error evidence before reporting it as root cause.
- Jeff asked whether the REST switch could have impacted Adam's duplicate-page/account report, while noting he assumed they were unrelated. David initially answered that REST should only affect XFlow API transport, not page sequencing or submission count, and offered to trace REST-toggle state once Adam provided an exact date and flow/page.
- After Adam provided more context, David updated Jeff that REST still should not be treated as a direct sequencing cause, but it cannot be fully ruled out because REST/FTNetwork timeout/error behavior might expose an existing XFlow retry or page-rebuild path under load.
- Jeff asked for either a proposed response to Adam or a statement that more information is needed, suggesting Adam should open a Discourse ticket and attach the relevant evidence if more detail is required.
---
## Next Steps
- Frame any update to Jeff as a context refresh: related prior investigation exists, but the previous iOS SDK-side review did not reproduce duplicate client-side `open-account` calls, so current logs/examples are needed before calling the new report the same issue or a regression.
- If discussing REST impact, separate confirmed facts from hypothesis: confirmed prior non-prod iOS review did not reproduce duplicate client-side calls; current hypothesis is that REST timeout/error semantics may expose the existing XFlow model-state retry/rebuild path under production load.
- Prepare a concise proposed response to Adam that asks for a Discourse ticket with exact incident date/time, affected flow/page, app/XFlowSDK version, REST state if known, user journey logs, and examples needed to compare against `PDSPS-29371` / `PDIAP-11561`.

View File

@@ -0,0 +1,21 @@
---
type: daily
project: fidelity
date: 2026-05-19
status: active
focus: [pdiap-12284, duplicate-ao-report]
work-items: [PDIAP-12284]
blockers: []
tags:
- daily
- fidelity
updated: 2026-05-19
---
# 2026-05-19
## Work Done
- Sent the daily scrum update for today.
- Proposed and sent the request for a Discourse ticket to Adam to obtain details (exact date/time, affected flow/page, build version, logs, and examples) for the duplicate account-opening report.
- Confirmed that no new Discourse ticket with the `xflog` tag has been posted yet.

View File

@@ -32,6 +32,12 @@ Promote durable facts into `project-knowledge/01-current/`, `project-knowledge/0
- [2026-05-05](2026-05-05.md)
- [2026-05-07](2026-05-07.md)
- [2026-05-08](2026-05-08.md)
- [2026-05-11](2026-05-11.md)
- [2026-05-12](2026-05-12.md)
- [2026-05-13](2026-05-13.md)
- [2026-05-14](2026-05-14.md)
- [2026-05-18](2026-05-18.md)
- [2026-05-19](2026-05-19.md)
---

View File

@@ -0,0 +1,27 @@
# Mattermost proxy mirror configuration.
# Copy to .env if you want local overrides. Do not commit .env.
# Optional: restrict capture to the Mattermost host. Use the host only, no scheme.
# If empty, the addon captures /api/v4 traffic from the proxied Mattermost app.
# Example: mm.all-win-solutions.app
MATTERMOST_MIRROR_HOST_ALLOW=
# Output directory for raw evidence and normalized AI-readable context.
MATTERMOST_MIRROR_DIR=ai/inbox/mattermost-mirror
# mitmproxy listener used by launch-mattermost.sh.
MATTERMOST_MIRROR_LISTEN_HOST=127.0.0.1
MATTERMOST_MIRROR_LISTEN_PORT=8080
# Keep the small AI context window bounded.
MATTERMOST_MIRROR_LATEST_LIMIT=200
# Optional channel allowlist. Comma-separated channel IDs. Empty means all captured channels.
MATTERMOST_MIRROR_CHANNEL_IDS=
# Write compact raw REST/WebSocket evidence in addition to normalized messages.
# Keep disabled by default to avoid large files.
MATTERMOST_MIRROR_WRITE_RAW=0
# Mattermost desktop app bundle.
MATTERMOST_APP_PATH=/Applications/Mattermost.app

View File

@@ -0,0 +1,151 @@
# Mattermost Proxy Mirror
Local read-only Mattermost Desktop mirror for AI workspace context.
This is for **raw evidence only**. It writes under `ai/inbox/mattermost-mirror/`; durable project memory still belongs in `project-knowledge/` after normal promotion rules.
## Why this exists
Mattermost Team Edition 11.4.2 exposes normal `/api/v4` REST and WebSocket traffic. When Mattermost Desktop is launched with Chromium/Electron's `--proxy-server` flag, `mitmproxy` can capture only that app without changing the macOS system proxy.
## Setup
1. Install `mitmproxy`.
2. Trust the mitmproxy certificate if HTTPS interception is not already working:
- Start `scripts/mattermost-proxy/run-mirror.sh`
- Open `http://mitm.it`
- Install/trust the certificate in Keychain.
3. Optional: copy `.env.example` to `.env` and set `MATTERMOST_MIRROR_HOST_ALLOW` to the exact Mattermost host, for example `mm.all-win-solutions.app`.
## Run day to day
Terminal 1:
```bash
scripts/mattermost-proxy/run-mirror.sh
```
Terminal 2:
```bash
scripts/mattermost-proxy/launch-mattermost.sh
```
This launches Mattermost Desktop through macOS LaunchServices with:
```bash
--proxy-server=http://127.0.0.1:8080
```
No global macOS proxy is required.
The helper intentionally uses `open -n /Applications/Mattermost.app --args ...`
instead of invoking `/Applications/Mattermost.app/Contents/MacOS/Mattermost`
directly. Direct binary launch can crash sandboxed Electron apps with Mach
rendezvous errors because their expected app/container parent process is
missing.
## Output layout
```text
ai/inbox/mattermost-mirror/
  latest.jsonl # bounded AI-readable window
  latest.md # bounded Markdown view
  state.json # last seen by channel and user cache
  index.json # date/channel/thread file map
  refs/
    channels.json # channel_id -> channel_name
    users.json # user_id -> username
  channels/<channel-name>/YYYY/MM/YYYY-MM-DD.jsonl
  by-date/YYYY/MM/YYYY-MM-DD.jsonl
  threads/<root-or-post-id>.jsonl
  raw/YYYY/MM/YYYY-MM-DD-websocket.jsonl # only if MATTERMOST_MIRROR_WRITE_RAW=1
  raw/YYYY/MM/YYYY-MM-DD-rest-flows.jsonl # only if MATTERMOST_MIRROR_WRITE_RAW=1
```
Use `latest.md` or `latest.jsonl` for quick AI context. Use `channels/...`
for conversation-focused analysis, `by-date/...` for standups or daily review,
and `threads/...` when a single discussion thread is the relevant evidence.
This mirrors Slack's export pattern of one folder per conversation with one file
per date, while adding Mattermost-specific thread views.
Direct-message channels are labeled as `dm-<user-a>--<user-b>` when the mirror
has seen enough user metadata to resolve the Mattermost channel ID. Group DMs
use `group-...`. If a DM was first captured before the relevant user metadata
arrived, the folder can temporarily use raw IDs; later captures use the readable
label and `refs/channels.json` remains the source for resolving channel IDs.
The mirror writes any post payload it sees, including older messages returned
when the desktop app loads channel history or a thread. It dedupes by `post_id`,
so scrolling back through useful history is a safe way to backfill missing local
evidence without creating repeated entries.
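The dedupe-on-append behavior described above can be illustrated in a few lines. This is a simplified sketch, not the addon's actual code path: the helper name `append_unique` and the caller-managed `seen` set are assumptions made for illustration.

```python
import json
from pathlib import Path


def append_unique(path: Path, message: dict, seen: set) -> bool:
    """Append `message` as one JSONL line unless its post_id was already seen.

    Returns True when a line was written. `seen` is a caller-owned set of
    post IDs (hypothetical; the real mirror keeps its own state).
    """
    post_id = message.get("post_id")
    if not post_id or post_id in seen:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(message, ensure_ascii=False, sort_keys=True) + "\n")
    seen.add(post_id)
    return True
```

Because the check keys on `post_id`, replaying the same history (for example by scrolling back twice) leaves the file unchanged.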
## Normalized message schema
Each line in the normalized JSONL contains:
```json
{
"source": "websocket|rest",
"captured_at": "2026-05-19T...Z",
"created_at": "2026-05-19T...Z",
"created_at_ms": 1779190000000,
"channel_id": "...",
"channel_name": "fidelity-preguntas",
"post_id": "...",
"root_id": "...",
"thread_id": "...",
"user_id": "...",
"username": "jeff",
"message": "...",
"type": "channel_post|thread_reply",
"raw_event": "posted|posts|post"
}
```
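A consumer of these files might group messages by thread before summarizing them. The sketch below assumes the field names shown in the schema above and falls back to `post_id` when `thread_id` is empty; the function name `load_threads` is hypothetical.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_threads(jsonl_path):
    """Group normalized messages by thread, oldest message first.

    Assumption: `thread_id` identifies the thread and standalone posts
    can be keyed by their own `post_id`.
    """
    threads = defaultdict(list)
    for line in Path(jsonl_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # Tolerate blank lines between records.
        message = json.loads(line)
        key = message.get("thread_id") or message.get("post_id")
        threads[key].append(message)
    for posts in threads.values():
        posts.sort(key=lambda m: m.get("created_at_ms") or 0)
    return dict(threads)
```

For example, `load_threads("ai/inbox/mattermost-mirror/latest.jsonl")` would return a dict mapping each thread key to its ordered messages, provided that file exists.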
## Safety rules
- The addon allowlists Mattermost hosts and `/api/v4` traffic only.
- Headers such as `Authorization`, `Cookie`, `Set-Cookie`, and CSRF are redacted in optional raw output.
- Optional raw output is disabled by default to prevent large files.
- Attachments are not downloaded by this mirror.
- The mirror is evidence, not canonical memory.
## Useful environment variables
- `MATTERMOST_MIRROR_HOST_ALLOW`: exact host or parent domain to capture.
- `MATTERMOST_MIRROR_DIR`: output directory, default `ai/inbox/mattermost-mirror`.
- `MATTERMOST_MIRROR_LATEST_LIMIT`: number of messages in `latest.*`, default `200`.
- `MATTERMOST_MIRROR_CHANNEL_IDS`: optional comma-separated channel ID allowlist.
- `MATTERMOST_MIRROR_WRITE_RAW`: set to `1` to save compact raw REST/WebSocket evidence.
- `MATTERMOST_APP_PATH`: Mattermost Desktop `.app` bundle path.
## Troubleshooting
### TLS certificate warnings
Mitmproxy uses a persistent local CA under `~/.mitmproxy`. If the desktop app
asks about the certificate after every proxy restart, install and trust that CA
in macOS Keychain instead of approving it only in the app prompt:
1. Start `scripts/mattermost-proxy/run-mirror.sh`.
2. Open `http://mitm.it` from a browser on this Mac and download the macOS certificate.
3. Add it to Keychain Access and set it to **Always Trust**.
4. Restart Mattermost Desktop through `launch-mattermost.sh`.
Warnings for unrelated hosts such as `releases.mattermost.com` or OpenGraph
preview hosts are not required for message capture. The mirror only writes
normalized messages from Mattermost `/api/v4` REST/WebSocket payloads.
### Proxy logs show traffic but no `latest.md`
The mirror writes files only after it sees a post payload. Startup calls such as
`/api/v4/teams`, `/api/v4/users`, `/api/v4/files`, or WebSocket ping/ack events
do not create message files. Open a channel, open a thread, scroll slightly in
history, or wait for/send a new message. Then check:
```text
ai/inbox/mattermost-mirror/latest.md
ai/inbox/mattermost-mirror/channels/<channel-name>/YYYY/MM/YYYY-MM-DD.jsonl
ai/inbox/mattermost-mirror/by-date/YYYY/MM/YYYY-MM-DD.jsonl
```

View File

@@ -0,0 +1,26 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [ -f "$SCRIPT_DIR/.env" ]; then
  set -a
  # shellcheck source=/dev/null
  source "$SCRIPT_DIR/.env"
  set +a
fi
APP_PATH="${MATTERMOST_APP_PATH:-/Applications/Mattermost.app}"
PROXY_HOST="${MATTERMOST_MIRROR_LISTEN_HOST:-127.0.0.1}"
PROXY_PORT="${MATTERMOST_MIRROR_LISTEN_PORT:-8080}"
if [ ! -d "$APP_PATH" ]; then
  echo "Mattermost app bundle not found: $APP_PATH" >&2
  echo "Set MATTERMOST_APP_PATH in scripts/mattermost-proxy/.env if needed." >&2
  exit 1
fi
# Prefer macOS LaunchServices over invoking the Electron binary directly.
# Direct binary launch can crash sandboxed Electron apps with Mach rendezvous
# errors because their expected app/container parent process is missing.
exec open -n "$APP_PATH" --args --proxy-server="http://${PROXY_HOST}:${PROXY_PORT}"

View File

@@ -0,0 +1,514 @@
"""mitmproxy addon for a local Mattermost Desktop mirror.

This addon is intentionally narrow:
- allowlist a Mattermost host
- inspect only /api/v4 REST and WebSocket traffic
- redact secrets
- normalize posts into date-rotated JSONL files for AI context

The output under ai/inbox/ is raw evidence, not canonical project memory.
"""
from __future__ import annotations

import json
import os
import re
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any
from urllib.parse import urlparse

from mitmproxy import http

DEFAULT_OUT_DIR = "ai/inbox/mattermost-mirror"
POST_ID_RE = re.compile(r"^[a-z0-9]{26}$")
SAFE_NAME_RE = re.compile(r"[^a-zA-Z0-9._-]+")


def env_bool(name: str, default: bool = False) -> bool:
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def split_csv(raw: str) -> set[str]:
    return {item.strip() for item in raw.replace("\n", ",").split(",") if item.strip()}
class MattermostMirror:
    def __init__(self) -> None:
        self.out_dir = Path(os.getenv("MATTERMOST_MIRROR_DIR", DEFAULT_OUT_DIR)).resolve()
        self.host_allow = os.getenv("MATTERMOST_MIRROR_HOST_ALLOW", "").strip().lower()
        self.channel_allow = split_csv(os.getenv("MATTERMOST_MIRROR_CHANNEL_IDS", ""))
        self.latest_limit = int(os.getenv("MATTERMOST_MIRROR_LATEST_LIMIT", "200"))
        self.write_raw = env_bool("MATTERMOST_MIRROR_WRITE_RAW", default=False)
        self.channels_dir = self.out_dir / "channels"
        self.by_date_dir = self.out_dir / "by-date"
        self.threads_dir = self.out_dir / "threads"
        self.refs_dir = self.out_dir / "refs"
        self.raw_dir = self.out_dir / "raw"
        self.state_path = self.out_dir / "state.json"
        self.index_path = self.out_dir / "index.json"
        self.latest_jsonl_path = self.out_dir / "latest.jsonl"
        self.latest_md_path = self.out_dir / "latest.md"
        self.seen_post_ids: set[str] = set()
        self.seen_by_file: dict[Path, set[str]] = {}
        self.users: dict[str, str] = {}
        self.channels: dict[str, str] = {}
        self.channel_meta: dict[str, dict[str, Any]] = {}
        self.state: dict[str, Any] = {"channels": {}, "users": {}, "updated_at": None}
        self._ensure_dirs()
        self._load_state()
        self._load_recent_seen_ids()
    def _ensure_dirs(self) -> None:
        self.channels_dir.mkdir(parents=True, exist_ok=True)
        self.by_date_dir.mkdir(parents=True, exist_ok=True)
        self.threads_dir.mkdir(parents=True, exist_ok=True)
        self.refs_dir.mkdir(parents=True, exist_ok=True)
        self.raw_dir.mkdir(parents=True, exist_ok=True)

    def _load_state(self) -> None:
        if not self.state_path.exists():
            return
        try:
            self.state = json.loads(self.state_path.read_text(encoding="utf-8"))
            self.users = dict(self.state.get("users") or {})
            self.channel_meta = dict(self.state.get("channel_meta") or {})
            for channel_id, value in (self.state.get("channels") or {}).items():
                if isinstance(value, dict):
                    name = value.get("channel_name") or value.get("name")
                    if name:
                        self.channels[channel_id] = name
        except Exception:
            self.state = {"channels": {}, "users": {}, "updated_at": None}
    def _load_recent_seen_ids(self) -> None:
        # Bound startup work: latest.jsonl contains the hot dedupe window. Daily
        # files are loaded lazily when older/backfilled messages are encountered.
        today = datetime.now(timezone.utc)
        for path in [self.latest_jsonl_path, self._daily_by_date_path(today)]:
            if not path.exists():
                continue
            try:
                ids = self._load_seen_ids_for_file(path)
                self.seen_post_ids.update(ids)
            except Exception:
                continue

    def _load_seen_ids_for_file(self, path: Path) -> set[str]:
        if path in self.seen_by_file:
            return self.seen_by_file[path]
        ids: set[str] = set()
        if path.exists():
            try:
                with path.open("r", encoding="utf-8") as handle:
                    for line in handle:
                        if not line.strip():
                            continue
                        obj = json.loads(line)
                        post_id = obj.get("post_id")
                        if post_id:
                            ids.add(post_id)
            except Exception:
                ids = set()
        self.seen_by_file[path] = ids
        return ids
    def _atomic_write_text(self, path: Path, text: str) -> None:
        path.parent.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile("w", encoding="utf-8", dir=str(path.parent), delete=False) as tmp:
            tmp.write(text)
            tmp_path = Path(tmp.name)
        tmp_path.replace(path)

    def _append_jsonl(self, path: Path, obj: dict[str, Any]) -> None:
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(obj, ensure_ascii=False, sort_keys=True) + "\n")

    def _dt_from_ms(self, value: Any) -> datetime:
        try:
            ms = int(value)
            if ms > 0:
                return datetime.fromtimestamp(ms / 1000, timezone.utc)
        except Exception:
            pass
        return datetime.now(timezone.utc)

    def _safe_name(self, value: str | None, fallback: str = "unknown") -> str:
        raw = (value or fallback).strip() or fallback
        safe = SAFE_NAME_RE.sub("-", raw).strip("-._")
        return safe or fallback
    def _daily_channel_path(self, dt: datetime, channel_name: str | None, channel_id: str | None) -> Path:
        channel_slug = self._safe_name(channel_name or channel_id, fallback="unknown-channel")
        return self.channels_dir / channel_slug / f"{dt:%Y}" / f"{dt:%m}" / f"{dt:%Y-%m-%d}.jsonl"

    def _daily_by_date_path(self, dt: datetime) -> Path:
        return self.by_date_dir / f"{dt:%Y}" / f"{dt:%m}" / f"{dt:%Y-%m-%d}.jsonl"

    def _thread_path(self, thread_id: str | None) -> Path | None:
        if not thread_id:
            return None
        return self.threads_dir / f"{self._safe_name(thread_id)}.jsonl"

    def _daily_raw_path(self, dt: datetime, suffix: str) -> Path:
        return self.raw_dir / f"{dt:%Y}" / f"{dt:%m}" / f"{dt:%Y-%m-%d}-{suffix}.jsonl"

    def _safe_url(self, url: str) -> str:
        # Drops only the fragment; the query string is kept unchanged.
        parsed = urlparse(url)
        return parsed._replace(fragment="").geturl()

    def _is_allowed_host(self, host: str) -> bool:
        host = host.lower()
        if self.host_allow:
            return host == self.host_allow or host.endswith(f".{self.host_allow}")
        # The launched Mattermost Desktop app is already scoped to this proxy.
        # Some company hosts do not include "mattermost" in the hostname
        # (for example, mm.example.com), so default to allowing the proxied
        # app's /api/v4 traffic when no explicit host allowlist is configured.
        return True

    def _is_allowed_channel(self, channel_id: str | None) -> bool:
        if not self.channel_allow:
            return True
        return bool(channel_id and channel_id in self.channel_allow)
    def _capture_flow(self, flow: http.HTTPFlow) -> bool:
        return self._is_allowed_host(flow.request.pretty_host) and "/api/v4/" in flow.request.path

    def _redact_headers(self, headers: Any) -> dict[str, str]:
        redacted: dict[str, str] = {}
        for key, value in headers.items():
            lowered = key.lower()
            if lowered in {"authorization", "cookie", "set-cookie", "x-csrf-token"}:
                redacted[key] = "[REDACTED]"
            else:
                redacted[key] = str(value)
        return redacted

    def _remember_user(self, user: dict[str, Any]) -> None:
        user_id = user.get("id")
        if not user_id:
            return
        username = user.get("username") or user.get("nickname") or user.get("first_name") or user_id
        self.users[user_id] = username
        self._write_refs()

    def _remember_channel(self, channel: dict[str, Any]) -> None:
        channel_id = channel.get("id")
        if not channel_id:
            return
        self.channel_meta[channel_id] = channel
        name = self._channel_label(channel)
        self.channels[channel_id] = name
        self._write_refs()

    def _user_label(self, user_id: str | None) -> str | None:
        if not user_id:
            return None
        return self.users.get(user_id) or user_id
    def _channel_label(self, channel: dict[str, Any]) -> str:
        channel_id = channel.get("id") or "unknown-channel"
        channel_type = channel.get("type")
        display_name = (channel.get("display_name") or "").strip()
        name = (channel.get("name") or "").strip()
        if channel_type == "D":
            user_ids = [item for item in name.split("__") if item]
            labels = [self._user_label(user_id) or user_id for user_id in user_ids]
            if labels:
                return "dm-" + "--".join(labels)
        if channel_type == "G":
            if display_name:
                return "group-" + display_name
            user_ids = [item for item in name.split("__") if item]
            labels = [self._user_label(user_id) or user_id for user_id in user_ids]
            if labels:
                return "group-" + "--".join(labels)
        return display_name or name or channel_id
    def _refresh_channel_labels(self) -> None:
        changed = False
        for channel_id, meta in self.channel_meta.items():
            label = self._channel_label(meta)
            if label and self.channels.get(channel_id) != label:
                self.channels[channel_id] = label
                changed = True
        if changed:
            self._write_refs()
    def _write_refs(self) -> None:
        users_path = self.refs_dir / "users.json"
        channels_path = self.refs_dir / "channels.json"
        self._atomic_write_text(users_path, json.dumps(self.users, ensure_ascii=False, indent=2, sort_keys=True) + "\n")
        self._atomic_write_text(channels_path, json.dumps(self.channels, ensure_ascii=False, indent=2, sort_keys=True) + "\n")
    def _ingest_reference_payload(self, payload: Any) -> None:
        if isinstance(payload, list):
            for item in payload:
                self._ingest_reference_payload(item)
            return
        if not isinstance(payload, dict):
            return
        if payload.get("id") and ("username" in payload or "first_name" in payload):
            self._remember_user(payload)
        if payload.get("id") and ("display_name" in payload or "team_id" in payload) and "type" in payload:
            self._remember_channel(payload)
        users = payload.get("users")
        if isinstance(users, dict):
            for user in users.values():
                if isinstance(user, dict):
                    self._remember_user(user)
        elif isinstance(users, list):
            for user in users:
                if isinstance(user, dict):
                    self._remember_user(user)
        channels = payload.get("channels")
        if isinstance(channels, list):
            for channel in channels:
                if isinstance(channel, dict):
                    self._remember_channel(channel)
        self._refresh_channel_labels()
    def _normalize_post(self, post: dict[str, Any], source: str, raw_event: str | None = None) -> dict[str, Any] | None:
        post_id = post.get("id")
        channel_id = post.get("channel_id")
        if not post_id or not POST_ID_RE.match(str(post_id)):
            return None
        if not self._is_allowed_channel(channel_id):
            return None
        created_dt = self._dt_from_ms(post.get("create_at"))
        root_id = post.get("root_id") or None
        user_id = post.get("user_id") or None
        message = post.get("message") or ""
        message_type = "thread_reply" if root_id else "channel_post"
        return {
            "source": source,
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "created_at": created_dt.isoformat(),
            "created_at_ms": int(post.get("create_at") or created_dt.timestamp() * 1000),
            "updated_at_ms": int(post.get("update_at") or 0),
            "channel_id": channel_id,
            "channel_name": self.channels.get(channel_id) if channel_id else None,
            "post_id": post_id,
            "root_id": root_id,
            "thread_id": root_id or post_id,
            "user_id": user_id,
            "username": self.users.get(user_id) if user_id else None,
            "message": message,
            "type": message_type,
            "raw_event": raw_event,
            "props": post.get("props") or {},
        }
    def _write_message(self, msg: dict[str, Any]) -> None:
        post_id = msg["post_id"]
        created_dt = self._dt_from_ms(msg.get("created_at_ms"))
        channel_path = self._daily_channel_path(created_dt, msg.get("channel_name"), msg.get("channel_id"))
        by_date_path = self._daily_by_date_path(created_dt)
        thread_path = self._thread_path(msg.get("thread_id"))
        channel_seen = self._load_seen_ids_for_file(channel_path)
        by_date_seen = self._load_seen_ids_for_file(by_date_path)
        if post_id in self.seen_post_ids or post_id in channel_seen or post_id in by_date_seen:
            return
        self.seen_post_ids.add(post_id)
        channel_seen.add(post_id)
        by_date_seen.add(post_id)
        self._append_jsonl(channel_path, msg)
        self._append_jsonl(by_date_path, msg)
        if thread_path:
            thread_seen = self._load_seen_ids_for_file(thread_path)
            if post_id not in thread_seen:
                thread_seen.add(post_id)
                self._append_jsonl(thread_path, msg)
        self._update_state(msg)
        self._update_latest(msg)
        self._update_index(created_dt, msg)
    def _update_state(self, msg: dict[str, Any]) -> None:
        channel_id = msg.get("channel_id") or "unknown"
        channels = self.state.setdefault("channels", {})
        entry = channels.setdefault(channel_id, {})
        if msg.get("channel_name"):
            entry["channel_name"] = msg.get("channel_name")
        entry["last_seen_create_at"] = max(int(entry.get("last_seen_create_at") or 0), int(msg.get("created_at_ms") or 0))
        entry["last_seen_post_id"] = msg.get("post_id")
        self.state["users"] = self.users
        self.state["channel_meta"] = self.channel_meta
        self.state["updated_at"] = datetime.now(timezone.utc).isoformat()
        self._atomic_write_text(self.state_path, json.dumps(self.state, ensure_ascii=False, indent=2, sort_keys=True) + "\n")
        self._write_refs()
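    def _read_jsonl(self, path: Path) -> list[dict[str, Any]]:
        if not path.exists():
            return []
        records: list[dict[str, Any]] = []
        try:
            with path.open("r", encoding="utf-8") as handle:
                for line in handle:
                    if not line.strip():
                        continue
                    try:
                        record = json.loads(line)
                    except json.JSONDecodeError:
                        # Skip one corrupt line instead of discarding the whole file,
                        # so _update_latest does not lose earlier records.
                        continue
                    if isinstance(record, dict):
                        records.append(record)
        except (OSError, UnicodeDecodeError):
            return []
        return records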
    def _update_latest(self, msg: dict[str, Any]) -> None:
        records = self._read_jsonl(self.latest_jsonl_path)
        by_id: dict[str, dict[str, Any]] = {item.get("post_id"): item for item in records if item.get("post_id")}
        by_id[msg["post_id"]] = msg
        latest = sorted(by_id.values(), key=lambda item: int(item.get("created_at_ms") or 0))[-self.latest_limit :]
        jsonl = "".join(json.dumps(item, ensure_ascii=False, sort_keys=True) + "\n" for item in latest)
        self._atomic_write_text(self.latest_jsonl_path, jsonl)
        self._atomic_write_text(self.latest_md_path, self._render_latest_md(latest))
    def _render_latest_md(self, records: list[dict[str, Any]]) -> str:
        lines = ["# Latest Mattermost Mirror", "", "Generated from local proxy mirror evidence.", ""]
        current_channel = None
        for item in records:
            channel = item.get("channel_name") or item.get("channel_id") or "unknown-channel"
            if channel != current_channel:
                lines.extend([f"## {channel}", ""])
                current_channel = channel
            author = item.get("username") or item.get("user_id") or "unknown-user"
            created = item.get("created_at") or "unknown-time"
            prefix = "reply" if item.get("type") == "thread_reply" else "post"
            text = (item.get("message") or "").strip()
            lines.append(f"- {created} {author} ({prefix} `{item.get('post_id')}`): {text}")
        lines.append("")
        return "\n".join(lines)
    def _update_index(self, dt: datetime, msg: dict[str, Any]) -> None:
        index: dict[str, Any] = {"dates": [], "channels": {}, "updated_at": None}
        if self.index_path.exists():
            try:
                index = json.loads(self.index_path.read_text(encoding="utf-8"))
            except Exception:
                pass
        date_key = f"{dt:%Y-%m-%d}"
        channel_path = self._daily_channel_path(dt, msg.get("channel_name"), msg.get("channel_id"))
        by_date_path = self._daily_by_date_path(dt)
        thread_path = self._thread_path(msg.get("thread_id"))
        channel_rel_path = str(channel_path.relative_to(self.out_dir))
        by_date_rel_path = str(by_date_path.relative_to(self.out_dir))
        dates = set(index.get("dates") or [])
        dates.add(date_key)
        index["dates"] = sorted(dates)
        by_date = index.setdefault("by_date", {})
        by_date[date_key] = by_date_rel_path
        channel_key = msg.get("channel_name") or msg.get("channel_id") or "unknown-channel"
        channels = index.setdefault("channels", {})
        channel_entry = channels.setdefault(channel_key, {"channel_id": msg.get("channel_id"), "files": []})
        channel_entry["channel_id"] = msg.get("channel_id")
        files = set(channel_entry.get("files") or [])
        files.add(channel_rel_path)
        channel_entry["files"] = sorted(files)
        if thread_path:
            threads = index.setdefault("threads", {})
            threads[msg.get("thread_id")] = str(thread_path.relative_to(self.out_dir))
        index["updated_at"] = datetime.now(timezone.utc).isoformat()
        self._atomic_write_text(self.index_path, json.dumps(index, ensure_ascii=False, indent=2, sort_keys=True) + "\n")
    def _write_raw(self, suffix: str, obj: dict[str, Any]) -> None:
        if not self.write_raw:
            return
        self._append_jsonl(self._daily_raw_path(datetime.now(timezone.utc), suffix), obj)
    def response(self, flow: http.HTTPFlow) -> None:
        if not self._capture_flow(flow) or not flow.response:
            return
        content_type = flow.response.headers.get("content-type", "")
        if "json" not in content_type:
            return
        try:
            payload = flow.response.json()
        except Exception:
            return
        self._ingest_reference_payload(payload)
        path = flow.request.path
        raw_record = {
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "method": flow.request.method,
            "url": self._safe_url(flow.request.pretty_url),
            "path": path,
            "status_code": flow.response.status_code,
            "request_headers": self._redact_headers(flow.request.headers),
            "response": payload,
        }
        self._write_raw("rest-flows", raw_record)
        # Mattermost post-list shape: { order: [...], posts: {post_id: {...}} }
        if isinstance(payload, dict) and isinstance(payload.get("posts"), dict):
            for post in payload["posts"].values():
                if isinstance(post, dict):
                    normalized = self._normalize_post(post, source="rest", raw_event="posts")
                    if normalized:
                        self._write_message(normalized)
        elif isinstance(payload, dict) and payload.get("id") and payload.get("message") is not None:
            normalized = self._normalize_post(payload, source="rest", raw_event="post")
            if normalized:
                self._write_message(normalized)
    def websocket_message(self, flow: http.HTTPFlow) -> None:
        if not self._is_allowed_host(flow.request.pretty_host):
            return
        if "/api/v4/websocket" not in flow.request.path:
            return
        if not flow.websocket or not flow.websocket.messages:
            return
        message = flow.websocket.messages[-1]
        if message.from_client:
            return
        try:
            text = message.content.decode("utf-8") if isinstance(message.content, bytes) else str(message.content)
            payload = json.loads(text)
        except Exception:
            return
        # Frames that are valid JSON but not objects carry no event to mirror.
        if not isinstance(payload, dict):
            return
        self._write_raw("websocket", {
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "url": self._safe_url(flow.request.pretty_url),
            "event": payload.get("event"),
            "seq": payload.get("seq"),
            "data": payload.get("data"),
            "broadcast": payload.get("broadcast"),
        })
        event = payload.get("event")
        if event != "posted":
            return
        data = payload.get("data") or {}
        post_raw = data.get("post")
        if not post_raw:
            return
        # The "posted" event carries the post as a JSON-encoded string.
        try:
            post = json.loads(post_raw)
        except Exception:
            return
        if not isinstance(post, dict):
            return
        normalized = self._normalize_post(post, source="websocket", raw_event=event)
        if normalized:
            self._write_message(normalized)
addons = [MattermostMirror()]
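
Review note: the host allowlist and header redaction logic in the addon above can be checked without running mitmproxy. The following is a minimal standalone sketch, not part of the change itself. The free functions `is_allowed_host` and `redact_headers` and the constant `SENSITIVE_HEADERS` are hypothetical names that mirror `_is_allowed_host` and `_redact_headers`, and the sketch assumes the configured allowlist value is already lowercase.

```python
from __future__ import annotations

# Header names whose values the addon replaces before writing raw records.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-csrf-token"}


def is_allowed_host(host: str, host_allow: str | None) -> bool:
    """Accept an exact match or a subdomain of host_allow (hypothetical helper)."""
    host = host.lower()
    if host_allow:
        # The leading dot prevents "evilexample.com" from matching "example.com".
        return host == host_allow or host.endswith(f".{host_allow}")
    # No allowlist configured: accept any host, as the addon does.
    return True


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Replace credential-bearing header values and keep everything else."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_HEADERS else str(value)
        for key, value in headers.items()
    }
```

The suffix check matches on a dot boundary, so setting the allowlist to a parent domain also admits its subdomains. That may be broader than intended if unrelated services share the domain.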

@@ -0,0 +1,30 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
WORKSPACE_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"
if [ -f "$SCRIPT_DIR/.env" ]; then
  set -a
  # shellcheck source=/dev/null
  source "$SCRIPT_DIR/.env"
  set +a
fi
export MATTERMOST_MIRROR_DIR="${MATTERMOST_MIRROR_DIR:-$WORKSPACE_ROOT/ai/inbox/mattermost-mirror}"
export MATTERMOST_MIRROR_LISTEN_HOST="${MATTERMOST_MIRROR_LISTEN_HOST:-127.0.0.1}"
export MATTERMOST_MIRROR_LISTEN_PORT="${MATTERMOST_MIRROR_LISTEN_PORT:-8080}"
mkdir -p "$MATTERMOST_MIRROR_DIR"
echo "Mattermost proxy mirror output: $MATTERMOST_MIRROR_DIR"
echo "Listening on ${MATTERMOST_MIRROR_LISTEN_HOST}:${MATTERMOST_MIRROR_LISTEN_PORT}"
echo "Launch Mattermost Desktop with: scripts/mattermost-proxy/launch-mattermost.sh"
if [ -z "${MATTERMOST_MIRROR_HOST_ALLOW:-}" ]; then
  echo "MATTERMOST_MIRROR_HOST_ALLOW is not set; capturing /api/v4 traffic from the proxied app."
fi
exec mitmdump \
  --listen-host "$MATTERMOST_MIRROR_LISTEN_HOST" \
  --listen-port "$MATTERMOST_MIRROR_LISTEN_PORT" \
  -s "$SCRIPT_DIR/mattermost_mirror.py"
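
Review note: once the proxy is running, the addon appends normalized posts as JSONL under `MATTERMOST_MIRROR_DIR`. The sketch below shows one way a consumer could read such a file; it is an illustration rather than part of this change. The function name `load_posts` is hypothetical, the caller supplies the file path, and only the `post_id` and `created_at_ms` fields written by the addon are relied upon.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


def load_posts(path: Path) -> list[dict[str, Any]]:
    """Read one mirror JSONL file, skipping blank or corrupt lines (hypothetical helper)."""
    if not path.exists():
        return []
    by_id: dict[str, dict[str, Any]] = {}
    with path.open("r", encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate a partially written line
            if isinstance(record, dict) and record.get("post_id"):
                by_id[record["post_id"]] = record  # last record for an id wins
    # Oldest first, matching the ordering used for the "latest" files.
    return sorted(by_id.values(), key=lambda item: int(item.get("created_at_ms") or 0))
```

Deduplicating by `post_id` on read keeps consumers correct even if the same post was captured from both the REST and websocket paths in separate runs.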