Compare commits
7 Commits
8950cfcdf0
...
e081360a84
| Author | SHA1 | Date | |
|---|---|---|---|
| | e081360a84 | | |
| | d318701899 | | |
| | 3816487bec | | |
| | b886c61afd | | |
| | 9dd731f758 | | |
| | 73166b585f | | |
| | f726814811 | | |
6
.gitignore
vendored
@@ -11,6 +11,8 @@ __pycache__/
ai/inbox/mattermost-latest.md
ai/inbox/mattermost-*.md
ai/inbox/mattermost-status.json
ai/inbox/mattermost-mirror/*
!ai/inbox/mattermost-mirror/.gitkeep
ai/inbox/photos/*
!ai/inbox/photos/.gitkeep

@@ -23,6 +25,7 @@ scripts/mattermost/.env
scripts/mattermost/.venv/
scripts/mattermost/generated/*
!scripts/mattermost/generated/.gitkeep
scripts/mattermost-proxy/.env

# Obsidian local runtime state
/.obsidian/
@@ -35,3 +38,6 @@ project-knowledge/.obsidian/plugins/
project-knowledge/.obsidian/snippets/
project-knowledge/.obsidian/cache/
.trash/

# Antigravity CLI local workspace configuration
.antigravitycli/
@@ -61,6 +61,9 @@ Use this structure by default:
- For VS Code multi-root Copilot workflows, preserve repo-provided customizations such as `.github/prompts`, `.github/instructions`, `.github/agents`, `.github/skills`, and `AGENTS.md`. Shared `fidelity-ai-copilot` customizations should supplement these repo files, while repo-specific instructions should be treated as the practical authority when they conflict.
- For Fidelity Jira/Confluence access from GitHub Copilot CLI or VS Code, do not assume the approved access method. First have the target AI read the current Fidelity-provided human instructions from Confluence or local exported docs, then configure the smallest matching workflow. If those instructions require terminal `curl` with environment variables such as `COPILOT_JIRA_URL` and `COPILOT_JIRA_TOKEN`, enforce that path; otherwise follow the documented Fidelity-approved method. Never print, persist, or hardcode tokens.
- Treat `fidelity-ai-copilot` as a self-improving AI harness rather than a static prompt dump: the target AI should notice recurring useful workflows, newly discovered internal instructions, and tool changes, then propose small auditable updates to instructions, skills, prompts, agents, specs, or validation checklists. It should ask before making broad changes and keep product repos clean.
- For corporate-tool captures in `fidelity-ai-copilot`, prefer a single raw Charles Mirror source such as `archive/charles-mirror/` and treat it as read-only evidence organized by hostname. Generated Copilot outputs should be written to separate per-platform folders only when useful, with prompts requiring source inspection, narrow scope, local-only processing, and explicit evidence paths.
- When advising on `fidelity-ai-copilot` customization, use this routing: keep global safety and repo role in `AGENTS.md` / `.github/copilot-instructions.md`; use `.github/instructions/*.instructions.md` for path-scoped rules such as `archive/**`; use `.github/prompts/*.prompt.md` for repeatable slash-command tasks; use `.github/agents/*.agent.md` for persistent personas with tool restrictions and handoffs; use `.github/skills/*/SKILL.md` for reusable multi-step capabilities with scripts, examples, or resources. Prefer small, composable artifacts over one large instruction file.
- For read-only evidence prompts such as Discourse/Charles Mirror search, explicitly prevent the target AI from editing the prompt/configuration files while running the workflow. If Copilot changes `.github/prompts/*.prompt.md` during an evidence query, treat that as a workflow bug unless the user specifically asked to update the prompt.
- When the user says they will handle dependency alignment, registry configuration, or compile/test execution manually on the development machine, generated Copilot follow-ups should not ask Copilot to solve those dependency/tooling issues or run broad builds. Instead, ask Copilot for the smallest source-level fix for the specific compiler error the user provides, state that the user will rerun validation manually, and request a concise summary of changed files and expected validation impact.

---
104
ai/discourse-archive/charles-session-format.md
Normal file
@@ -0,0 +1,104 @@
# Charles Session File Format (.chlsx)

Reference for AI agents that need to parse Charles Proxy session files.

---

## File Format

`.chlsx` is a **ZIP archive** containing numbered XML files, each representing one HTTP request/response pair.

### Structure inside the ZIP

```
session.chlsx
├── 00001.xml
├── 00002.xml
├── 00003.xml
├── ...
└── 00/
    ├── 00001.xml
    ├── 00002.xml
    └── ...
```

Files may be flat at the root or grouped in two-digit subdirectories (`00/`, `01/`, etc.) depending on session size.

### XML Structure Per File

Each XML file contains:

- **Request**: method, URL, protocol, headers, body
- **Response**: status, protocol, headers, body
- **Timing**: start time, duration

Key XML elements:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<session>
  <request>
    <method>GET</method>
    <url>https://discourse.example.com/t/123.json</url>
    <protocol>HTTP/1.1</protocol>
    <header name="Accept">application/json</header>
    <header name="Cookie">_t=abc123</header>
    <body></body>
  </request>
  <response>
    <status>200</status>
    <protocol>HTTP/1.1</protocol>
    <header name="Content-Type">application/json; charset=utf-8</header>
    <body>{"id": 123, "title": "...", "post_stream": {...}}</body>
  </response>
  <timing>
    <start>2026-01-15T10:30:00.000Z</start>
    <duration>450</duration>
  </timing>
</session>
```
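
As a minimal sketch, one entry shaped like the example above can be read with Python's standard `xml.etree.ElementTree`. It assumes the illustrative layout shown here (a `<session>` root with `<request>`, `<response>`, and `<timing>` children); a real Charles export may use different element names, so check an actual file before relying on it. The helper name `parse_exchange` and the `SAMPLE` document are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_exchange(xml_bytes: bytes) -> dict:
    """Parse one request/response document shaped like the example above."""
    root = ET.fromstring(xml_bytes)
    request = root.find("request")
    response = root.find("response")

    def headers(node):
        # Header names are attributes; header values are the element text.
        return {h.get("name"): (h.text or "") for h in node.findall("header")}

    return {
        "method": request.findtext("method", default=""),
        "url": request.findtext("url", default=""),
        "request_headers": headers(request),
        "status": int(response.findtext("status", default="0")),
        "response_headers": headers(response),
        "body": response.findtext("body", default=""),
        "duration": int(root.findtext("timing/duration", default="0")),
    }


# Hypothetical sample that follows the assumed layout.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<session>
  <request>
    <method>GET</method>
    <url>https://discourse.example.com/t/123.json</url>
    <header name="Accept">application/json</header>
    <body></body>
  </request>
  <response>
    <status>200</status>
    <header name="Content-Type">application/json; charset=utf-8</header>
    <body>{"id": 123, "title": "Example"}</body>
  </response>
  <timing>
    <start>2026-01-15T10:30:00.000Z</start>
    <duration>450</duration>
  </timing>
</session>"""

exchange = parse_exchange(SAMPLE)
print(exchange["method"], exchange["status"], exchange["url"])
```

Passing bytes rather than text lets the parser honor the encoding declared in the XML prolog.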

### .chls vs .chlsx vs .chlsj

| Extension | Format | Notes |
|---|---|---|
| `.chls` | Binary | Legacy format, harder to parse |
| `.chlsx` | ZIP + XML | **Prefer this**. Most common modern format |
| `.chlsj` | JSON | Newer, less common; each session is one JSON file with an array of request/response objects |

**Recommendation**: Configure Charles to save as `.chlsx` (File → Save Session As... → choose `.chlsx`).

---

## Discourse API Endpoints to Look For

These are the endpoints worth extracting from a Charles session:

| Purpose | URL pattern | Parsing target |
|---|---|---|
| Topic feed | `/latest.json` | `topic_list.topics[]` |
| Category topics | `/c/{slug}.json` | `topic_list.topics[]` |
| Single topic | `/t/{id}.json` | The full topic with posts |
| Posts in topic | `/t/{id}/{page}.json` | Paginated posts |
| Search | `/search.json?q=...` | `topics[]`, `posts[]` |
| User activity | `/u/{username}/activity.json` | User posts/topics |
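
The table can be turned into a small URL classifier. The sketch below is one possible reading of those patterns as regular expressions; the endpoint labels and the `classify` helper are hypothetical names, and the patterns only cover the URL shapes listed in the table.

```python
import re
from urllib.parse import urlparse

# Patterns mirror the table above; {id}, {page}, {slug}, and {username}
# become named groups. The more specific topic-page pattern is listed first.
ENDPOINT_PATTERNS = [
    ("topic_feed", re.compile(r"^/latest\.json$")),
    ("category_topics", re.compile(r"^/c/(?P<slug>.+)\.json$")),
    ("topic_page", re.compile(r"^/t/(?P<id>\d+)/(?P<page>\d+)\.json$")),
    ("single_topic", re.compile(r"^/t/(?P<id>\d+)\.json$")),
    ("search", re.compile(r"^/search\.json$")),
    ("user_activity", re.compile(r"^/u/(?P<username>[^/]+)/activity\.json$")),
]


def classify(url: str):
    """Return (endpoint_name, captured_groups), or None for unrelated URLs."""
    path = urlparse(url).path  # the query string (?q=...) is ignored here
    for name, pattern in ENDPOINT_PATTERNS:
        match = pattern.match(path)
        if match:
            return name, match.groupdict()
    return None


print(classify("https://forum.example.com/t/123.json"))
print(classify("https://forum.example.com/assets/app.js"))
```

Matching on the parsed path rather than the full URL keeps the classifier independent of the forum hostname and of query parameters.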

---

## Extraction Strategy for AI

1. **Open the `.chlsx` as a ZIP** (it is not encrypted)
2. **Iterate over all XML files** inside
3. For each XML, check if the request URL matches a Discourse API endpoint
4. Extract the JSON response body from `<response><body>`
5. Parse the JSON and convert to Markdown
6. Organize by topic ID + title for easy search
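
Steps 1 to 4 can be sketched with the standard library alone. This is an illustration under the assumptions stated in this document (a ZIP of XML entries with `request/url` and `response/body` elements), not a verified parser for real Charles files; `iter_json_responses` is a hypothetical helper, and the in-memory archive exists only so the sketch runs without a real capture.

```python
import io
import json
import zipfile
import xml.etree.ElementTree as ET


def iter_json_responses(chlsx):
    """Yield (url, parsed_json) for every XML entry whose response body is JSON.

    `chlsx` may be a filesystem path or a binary file object.
    """
    with zipfile.ZipFile(chlsx) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue  # skip directory entries and anything that is not XML
            root = ET.fromstring(archive.read(name))
            url = root.findtext("request/url", default="")
            body = root.findtext("response/body", default="")
            try:
                payload = json.loads(body)
            except json.JSONDecodeError:
                continue  # empty, HTML, or script bodies are not Discourse JSON
            yield url, payload


# Build a tiny in-memory archive with one JSON entry and one non-JSON entry.
JSON_ENTRY = (
    "<session><request><url>https://forum.example.com/t/123.json</url></request>"
    '<response><body>{"id": 123, "title": "Example"}</body></response></session>'
)
SCRIPT_ENTRY = (
    "<session><request><url>https://forum.example.com/app.js</url></request>"
    "<response><body>var x;</body></response></session>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("00/00001.xml", JSON_ENTRY)
    archive.writestr("00/00002.xml", SCRIPT_ENTRY)
buffer.seek(0)

results = list(iter_json_responses(buffer))
print(results)
```

Filtering by URL pattern and converting the payload to Markdown (steps 3, 5, and 6) would sit on top of this generator.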

---

## Common Pitfalls

- Some responses are paginated (`/t/{id}.json?page=1`). Collect all pages for completeness.
- Binary responses (images, JS bundles) should be skipped.
- The same topic may appear multiple times in different Charles sessions; deduplicate by topic ID + last updated timestamp.
- Session cookies captured in Charles will be expired by the time the AI reads them; only the response data matters.
@@ -0,0 +1,140 @@
---
type: copilot-prompt
status: ready
target: github-copilot
purpose: Parse Charles .chlsx sessions to create a searchable Discourse archive
---

# Copilot Prompt — Charles Discourse Archiver

Paste this into GitHub Copilot on the corporate device.

---

## Prompt

You are helping me build a local searchable archive of a Discourse forum from captured Charles Proxy session files.

### Background

I browse a Discourse forum in my browser while Charles Proxy records traffic. I save the session as a `.chlsx` file. Inside that file are all the HTTP request/response pairs for the pages I visited — including Discourse API calls that return structured JSON (topics, posts, categories, user profiles).

I need you to extract only the Discourse content and organize it into a Markdown archive that:
- Is searchable by an AI in future sessions
- Preserves topic titles, post authors, dates, and content
- Groups by category
- Deduplicates topics that appear across multiple sessions

### File format: `.chlsx`

`.chlsx` is a ZIP archive. Inside are numbered XML files (e.g. `00001.xml`, `00/00001.xml`). Each XML file represents one HTTP request/response pair with this structure:

```xml
<session>
  <request>
    <method>GET</method>
    <url>https://forum.example.com/t/123.json</url>
    <protocol>HTTP/1.1</protocol>
    <header name="Cookie">...</header>
    <body></body>
  </request>
  <response>
    <status>200</status>
    <protocol>HTTP/1.1</protocol>
    <header name="Content-Type">application/json; charset=utf-8</header>
    <body>{"id": 123, "title": "Some Topic", "post_stream": {...}}</body>
  </response>
  <timing>
    <start>2026-01-15T10:30:00.000Z</start>
    <duration>450</duration>
  </timing>
</session>
```

### Discourse API endpoints to extract

| What | URL pattern | JSON fields |
|---|---|---|
| Latest topics | `/latest.json` | `topic_list.topics[].{id, title, slug, category_id, created_at, last_posted_at}` |
| Category index | `/categories.json` | `category_list.categories[].{id, name, slug}` |
| Single topic (with posts) | `/t/{id}.json` | `id, title, slug, category_id, post_stream.posts[].{username, cooked, created_at, post_number}` |
| Topic with page | `/t/{id}/{page}.json` | Same as above, paginated |
| User activity | `/u/{username}/activity.json` | `user_actions[]` |
| Search results | `/search.json?q=...` | `topics[]`, `posts[]` |

### What to do

1. **Open the `.chlsx` file** as a ZIP archive.
2. **List all XML files** inside (both flat and in subdirectories).
3. **For each XML file**, parse it and check if the request URL matches one of the Discourse endpoints above.
4. **Skip**: CSS, JS, images, font files, analytics, CDN assets, and any non-Discourse endpoint.
5. **Parse the JSON response body** from `<response><body>`.
6. **Create this folder structure** as output:

```
discourse-archive/
├── categories.json   # All categories found
├── index.md          # Master index (table of all topics with ID, title, date, category, URL)
├── topics/
│   ├── 123-your-topic-slug.md
│   ├── 456-another-topic.md
│   └── ...
```

### Markdown format per topic

Each topic file should be a clean Markdown document with YAML frontmatter:

```markdown
---
id: 123
title: "Your Topic Title"
slug: your-topic-slug
category: "Category Name"
created: 2026-01-15
updated: 2026-01-16
url: https://forum.example.com/t/your-topic-slug/123
---

# Your Topic Title

**Category**: Category Name

---

## Post 1 — @username1 (2026-01-15T10:30:00Z)

Post content here (HTML stripped, plain Markdown preferred).

---

## Post 2 — @username2 (2026-01-16T14:00:00Z)

More content.

---
```

### Deduplication rules

- If the same topic ID appears in multiple `.chlsx` files, keep the one with the most recent `last_posted_at`.
- If a session has page 2+ of a topic (`/t/123/2.json`), merge the posts with page 1.
- Never duplicate posts within a topic.
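
One way to express these three rules is a merge function keyed on topic ID and `post_number`. The sketch below assumes each capture has already been flattened into a simple record with `id`, `title`, `last_posted_at`, and a `posts` list; that record shape and the names `merge_topic` and `build_archive` are hypothetical, not part of Discourse's API.

```python
def merge_topic(existing, incoming):
    """Combine two captures of the same topic ID into one record."""
    if existing is None:
        newest = incoming
        all_posts = list(incoming["posts"])
    else:
        # Keep metadata from the capture with the most recent last_posted_at.
        # ISO 8601 UTC timestamps in one format compare correctly as strings.
        newest = max(existing, incoming, key=lambda t: t.get("last_posted_at") or "")
        all_posts = list(existing["posts"]) + list(incoming["posts"])
    # post_number identifies a post within a topic, so it is the dedup key.
    by_number = {post["post_number"]: post for post in all_posts}
    merged = dict(newest)
    merged["posts"] = [by_number[number] for number in sorted(by_number)]
    return merged


def build_archive(captures):
    """Fold captures from any number of sessions and pages into one dict by topic ID."""
    topics = {}
    for topic in captures:
        topics[topic["id"]] = merge_topic(topics.get(topic["id"]), topic)
    return topics


captures = [
    {"id": 123, "title": "Old title", "last_posted_at": "2026-01-15T10:30:00Z",
     "posts": [{"post_number": 1, "username": "username1"}]},
    {"id": 123, "title": "New title", "last_posted_at": "2026-01-16T14:00:00Z",
     "posts": [{"post_number": 1, "username": "username1"},
               {"post_number": 2, "username": "username2"}]},
]
archive = build_archive(captures)
print(archive[123]["title"], [p["post_number"] for p in archive[123]["posts"]])
```

Because pages of the same topic share a topic ID, merging page 2 into page 1 is the same operation as merging two sessions.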

### What to do with the output

Place the resulting `discourse-archive/` folder in a location I can reference in future Copilot sessions. I will point Copilot to that folder when I need to search past Discourse conversations.

### Constraints

- Do not modify the original `.chlsx` file.
- Do not upload or send the extracted data anywhere — keep it local.
- If a topic has no readable content (deleted, access restricted), note it in the index but skip the full extraction.
- HTML in `cooked` fields should be converted to readable plain text / Markdown (Discourse stores posts as HTML in the JSON).
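
For the last constraint, a rough conversion of `cooked` HTML to plain text is possible with the standard library's `html.parser`. This is a deliberately small sketch: it keeps text, starts a new line at block boundaries, and prefixes list items, but it drops links, code formatting, and quotes. The class and function names are hypothetical, and the local-only constraint is respected because nothing leaves the process.

```python
from html.parser import HTMLParser


class CookedToText(HTMLParser):
    """Collect text from `cooked` HTML, starting a new line at block boundaries."""

    BLOCK_TAGS = {"p", "div", "li", "blockquote", "pre", "h1", "h2", "h3", "h4"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.parts.append("- ")  # render list items as Markdown bullets
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def cooked_to_text(cooked: str) -> str:
    parser = CookedToText()
    parser.feed(cooked)
    parser.close()  # flush any buffered text
    lines = [line.strip() for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)


text = cooked_to_text(
    "<p>Hello <strong>world</strong> &amp; all</p><ul><li>one</li><li>two</li></ul>"
)
print(text)
```

A dedicated HTML-to-Markdown converter would preserve more structure; this version only aims at readable, searchable text.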

### First action

Ask me for:
1. The path to the `.chlsx` file (or files)
2. The Discourse base URL (so you can construct canonical topic URLs)
3. Where I want the output folder created
0
ai/inbox/mattermost-mirror/.gitkeep
Normal file
@@ -4,21 +4,28 @@
- Structured ticket artifacts with jira, patch, prompts, sessions
- Tolerant of pre-experiment fix steps that resolve any build error between experiments
- Swift skill, specialized in debug prints
- Xcode logs analysis
- Find and read the full Xcode artifacts
- Extract relevant logs
- Xcode Integration
- Xcode logs analysis
- Find and read the full Xcode artifacts
- Extract relevant logs
- Efficient use of Xcode commands to build and test, in the context of Tuist, CocoaPods, and sample projects
- Execute unit tests efficiently, e.g. only new or related tests
- Investigation: Differences between skills using CLI commands and MCPs
- Investigation: Differences between skills and instructions in VS Code Copilot
- Investigation: Differences between agents and skills that use an agent; which is more general? Correct relationship and use
- Charles Proxy integration
- LaunchDarkly integration
- Teams integration
- Auto breakpoint management
- UITest integration
- AI Investigation
- Differences between skills using CLI commands and MCPs
- Differences between skills and instructions in VS Code Copilot
- Differences between agents and skills that use an agent; which is more general? Correct relationship and use
- Fidelity
- Charles Proxy integration
- LaunchDarkly integration
- Teams integration
- Splunk analyzer
- [x] Discourse integration
- ServiceNow access
- Photo uploader
- Start as a service
- Auto categorize by context
- Multi-photo session: copy multiple images to the clipboard
- [x] Multi-photo session: copy multiple images to the clipboard
- [ ] Start as a service
- [ ] Auto categorize by context
- SwiftLint integration
- Auto validator
38
project-knowledge/06-daily/2026-05-18.md
Normal file
@@ -0,0 +1,38 @@
---
type: daily
project: fidelity
date: 2026-05-18
status: active
focus: [context-refresh, pdiap-12284]
work-items: [PDIAP-12284]
blockers: []
tags:
  - daily
  - fidelity
updated: 2026-05-18
---

# 2026-05-18

## Work Done

- Sent the daily scrum update for `PDIAP-12284 - Remove UIKit wrapping from XFlow`: continued SampleApp validation in both host modes, aligned the host-mode path with current flag behavior instead of the deprecated `enable-swift-ui` toggle, and started broader Fid4 smoke testing with temporary validation logs.

## Findings

- While refreshing context for Adam's duplicate-request question, David clarified that the April 4 follow-up in the `Production - Crypto Delinking issue` thread was from David/Jeff's side after Yuva's March 30 comment.
- That April 4 follow-up said the iOS SDK-side network requests looked correct and intentional, matched Android, and did not reproduce duplicate `open-account` API calls from the client side in non-prod.
- Therefore, prior `PDSPS-29371` context should not be summarized as a confirmed reproduction from the iOS SDK side; it is related background, but the previous investigation did not reproduce the issue from the client-side SDK path.
- Follow-up Copilot analysis suggested a plausible but unconfirmed link between REST migration and the current duplicate-page/account report: REST did not introduce a new duplicate-trigger mechanism by itself, but the REST/FTNetwork path may have changed timeout/error behavior enough to expose an existing XFlow re-trigger path under slow BPDC responses.
- Treat that REST link as a hypothesis requiring current logs, dates, versions, and timeout/error evidence before reporting it as root cause.
- Jeff asked whether the REST switch could have impacted Adam's duplicate-page/account report, while noting he assumed they were unrelated. David initially answered that REST should only affect XFlow API transport, not page sequencing or submission count, and offered to trace REST-toggle state once Adam provided an exact date and flow/page.
- After Adam provided more context, David updated Jeff that REST still should not be treated as a direct sequencing cause, but it cannot be fully ruled out because REST/FTNetwork timeout/error behavior might expose an existing XFlow retry or page-rebuild path under load.
- Jeff asked for either a proposed response to Adam or a statement that more information is needed, suggesting Adam should open a Discourse ticket and attach the relevant evidence if more detail is required.

---

## Next Steps

- Frame any update to Jeff as a context refresh: related prior investigation exists, but the previous iOS SDK-side review did not reproduce duplicate client-side `open-account` calls, so current logs/examples are needed before calling the new report the same issue or a regression.
- If discussing REST impact, separate confirmed facts from hypothesis: confirmed prior non-prod iOS review did not reproduce duplicate client-side calls; current hypothesis is that REST timeout/error semantics may expose the existing XFlow model-state retry/rebuild path under production load.
- Prepare a concise proposed response to Adam that asks for a Discourse ticket with exact incident date/time, affected flow/page, app/XFlowSDK version, REST state if known, user journey logs, and examples needed to compare against `PDSPS-29371` / `PDIAP-11561`.
21
project-knowledge/06-daily/2026-05-19.md
Normal file
@@ -0,0 +1,21 @@
---
type: daily
project: fidelity
date: 2026-05-19
status: active
focus: [pdiap-12284, duplicate-ao-report]
work-items: [PDIAP-12284]
blockers: []
tags:
  - daily
  - fidelity
updated: 2026-05-19
---

# 2026-05-19

## Work Done

- Sent the daily scrum update for today.
- Proposed and sent the request for a Discourse ticket to Adam to obtain details (exact date/time, affected flow/page, build version, logs, and examples) for the duplicate account-opening report.
- Confirmed that no new Discourse ticket with the `xflog` tag has been posted yet.
@@ -32,6 +32,12 @@ Promote durable facts into `project-knowledge/01-current/`, `project-knowledge/0
- [2026-05-05](2026-05-05.md)
- [2026-05-07](2026-05-07.md)
- [2026-05-08](2026-05-08.md)
- [2026-05-11](2026-05-11.md)
- [2026-05-12](2026-05-12.md)
- [2026-05-13](2026-05-13.md)
- [2026-05-14](2026-05-14.md)
- [2026-05-18](2026-05-18.md)
- [2026-05-19](2026-05-19.md)

---
27
scripts/mattermost-proxy/.env.example
Normal file
@@ -0,0 +1,27 @@
# Mattermost proxy mirror configuration.
# Copy to .env if you want local overrides. Do not commit .env.

# Optional: restrict capture to the Mattermost host. Use the host only, no scheme.
# If empty, the addon captures /api/v4 traffic from the proxied Mattermost app.
# Example: mm.all-win-solutions.app
MATTERMOST_MIRROR_HOST_ALLOW=

# Output directory for raw evidence and normalized AI-readable context.
MATTERMOST_MIRROR_DIR=ai/inbox/mattermost-mirror

# mitmproxy listener used by launch-mattermost.sh.
MATTERMOST_MIRROR_LISTEN_HOST=127.0.0.1
MATTERMOST_MIRROR_LISTEN_PORT=8080

# Keep the small AI context window bounded.
MATTERMOST_MIRROR_LATEST_LIMIT=200

# Optional channel allowlist. Comma-separated channel IDs. Empty means all captured channels.
MATTERMOST_MIRROR_CHANNEL_IDS=

# Write compact raw REST/WebSocket evidence in addition to normalized messages.
# Keep disabled by default to avoid large files.
MATTERMOST_MIRROR_WRITE_RAW=0

# Mattermost desktop app bundle.
MATTERMOST_APP_PATH=/Applications/Mattermost.app
151
scripts/mattermost-proxy/README.md
Normal file
@@ -0,0 +1,151 @@
# Mattermost Proxy Mirror

Local read-only Mattermost Desktop mirror for AI workspace context.

This is for **raw evidence only**. It writes under `ai/inbox/mattermost-mirror/`; durable project memory still belongs in `project-knowledge/` after normal promotion rules.

## Why this exists

Mattermost Team Edition 11.4.2 exposes normal `/api/v4` REST and WebSocket traffic. When Mattermost Desktop is launched with Chromium/Electron's `--proxy-server` flag, `mitmproxy` can capture only that app without changing the macOS system proxy.

## Setup

1. Install `mitmproxy`.
2. Trust the mitmproxy certificate if HTTPS interception is not already working:
   - Start `scripts/mattermost-proxy/run-mirror.sh`
   - Open `http://mitm.it`
   - Install/trust the certificate in Keychain.
3. Optional: copy `.env.example` to `.env` and set `MATTERMOST_MIRROR_HOST_ALLOW` to the exact Mattermost host, for example `mm.all-win-solutions.app`.

## Run day to day

Terminal 1:

```bash
scripts/mattermost-proxy/run-mirror.sh
```

Terminal 2:

```bash
scripts/mattermost-proxy/launch-mattermost.sh
```

This launches Mattermost Desktop through macOS LaunchServices with:

```bash
--proxy-server=http://127.0.0.1:8080
```

No global macOS proxy is required.

The helper intentionally uses `open -n /Applications/Mattermost.app --args ...`
instead of invoking `/Applications/Mattermost.app/Contents/MacOS/Mattermost`
directly. Direct binary launch can crash sandboxed Electron apps with Mach
rendezvous errors because their expected app/container parent process is
missing.

## Output layout

```text
ai/inbox/mattermost-mirror/
  latest.jsonl                              # bounded AI-readable window
  latest.md                                 # bounded Markdown view
  state.json                                # last seen by channel and user cache
  index.json                                # date/channel/thread file map
  refs/
    channels.json                           # channel_id -> channel_name
    users.json                              # user_id -> username
  channels/<channel-name>/YYYY/MM/YYYY-MM-DD.jsonl
  by-date/YYYY/MM/YYYY-MM-DD.jsonl
  threads/<root-or-post-id>.jsonl
  raw/YYYY/MM/YYYY-MM-DD-websocket.jsonl    # only if MATTERMOST_MIRROR_WRITE_RAW=1
  raw/YYYY/MM/YYYY-MM-DD-rest-flows.jsonl   # only if MATTERMOST_MIRROR_WRITE_RAW=1
```

Use `latest.md` or `latest.jsonl` for quick AI context. Use `channels/...`
for conversation-focused analysis, `by-date/...` for standups or daily review,
and `threads/...` when a single discussion thread is the relevant evidence.
This mirrors Slack's export pattern of one folder per conversation with one file
per date, while adding Mattermost-specific thread views.
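
To make the layout concrete, the sketch below computes the three normalized file paths for one message. It is an illustration of the documented layout, not the addon's actual code: `message_paths` is a hypothetical helper, dates are assumed to be derived from `created_at_ms` in UTC, and the name sanitizing reuses the `SAFE_NAME_RE` character class from `mattermost_mirror.py` without knowing exactly how the addon applies it.

```python
import re
from datetime import datetime, timezone
from pathlib import PurePosixPath

# Same character class the addon defines; how the addon applies it is assumed.
SAFE_NAME_RE = re.compile(r"[^a-zA-Z0-9._-]+")


def message_paths(root: str, message: dict) -> dict:
    """Return the channel, by-date, and thread file paths for one message."""
    created = datetime.fromtimestamp(message["created_at_ms"] / 1000, tz=timezone.utc)
    day_file = PurePosixPath(f"{created:%Y}", f"{created:%m}", f"{created:%Y-%m-%d}.jsonl")
    # Fall back to the channel ID when no readable name is known yet.
    name = SAFE_NAME_RE.sub("-", message.get("channel_name") or "").strip("-")
    channel = name or message["channel_id"]
    # A thread file is keyed by the root post, or by the post itself if it is a root.
    thread = message.get("root_id") or message["post_id"]
    base = PurePosixPath(root)
    return {
        "channel": base / "channels" / channel / day_file,
        "by_date": base / "by-date" / day_file,
        "thread": base / "threads" / f"{thread}.jsonl",
    }


paths = message_paths(
    "ai/inbox/mattermost-mirror",
    {
        "created_at_ms": 1779190000000,
        "channel_id": "c1",
        "channel_name": "fidelity-preguntas",
        "post_id": "p1",
        "root_id": "",
    },
)
for label, path in paths.items():
    print(label, path)
```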

Direct-message channels are labeled as `dm-<user-a>--<user-b>` when the mirror
has seen enough user metadata to resolve the Mattermost channel ID. Group DMs
use `group-...`. If a DM was first captured before the relevant user metadata
arrived, the folder can temporarily use raw IDs; later captures use the readable
label and `refs/channels.json` remains the source for resolving channel IDs.

The mirror writes any post payload it sees, including older messages returned
when the desktop app loads channel history or a thread. It dedupes by `post_id`,
so scrolling back through useful history is a safe way to backfill missing local
evidence without creating repeated entries.
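
The dedupe-by-`post_id` behavior can be sketched as an append that consults a set of already written IDs. This is a simplified illustration rather than the addon's implementation; `append_unique` and `load_seen` are hypothetical helpers, and the temporary directory only exists so the example runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_seen(path: Path) -> set:
    """Rebuild the set of written post IDs so restarts stay idempotent."""
    if not path.exists():
        return set()
    with path.open(encoding="utf-8") as handle:
        return {json.loads(line)["post_id"] for line in handle if line.strip()}


def append_unique(path: Path, message: dict, seen: set) -> bool:
    """Append `message` as one JSON line unless its post_id was already written."""
    post_id = message["post_id"]
    if post_id in seen:
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(message, ensure_ascii=False) + "\n")
    seen.add(post_id)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "by-date" / "2026" / "05" / "2026-05-19.jsonl"
    seen = load_seen(target)
    first = append_unique(target, {"post_id": "p1", "message": "hello"}, seen)
    second = append_unique(target, {"post_id": "p1", "message": "hello"}, seen)
    line_count = len(target.read_text(encoding="utf-8").splitlines())
    reloaded = load_seen(target)

print(first, second, line_count)
```

Keeping one seen set per output file matters because the same post is written to a channel file, a by-date file, and a thread file.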

## Normalized message schema

Each line in the normalized JSONL contains:

```json
{
  "source": "websocket|rest",
  "captured_at": "2026-05-19T...Z",
  "created_at": "2026-05-19T...Z",
  "created_at_ms": 1779190000000,
  "channel_id": "...",
  "channel_name": "fidelity-preguntas",
  "post_id": "...",
  "root_id": "...",
  "thread_id": "...",
  "user_id": "...",
  "username": "jeff",
  "message": "...",
  "type": "channel_post|thread_reply",
  "raw_event": "posted|posts|post"
}
```
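
A consumer of these files might group records into threads before summarizing them. The sketch below assumes `thread_id` holds the root post ID for both the root post and its replies, which the schema suggests but does not state; `group_by_thread` and the sample lines are hypothetical.

```python
import json
from collections import defaultdict


def group_by_thread(jsonl_text: str) -> dict:
    """Group normalized records by thread and order each thread by creation time."""
    threads = defaultdict(list)
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        key = record.get("thread_id") or record.get("root_id") or record["post_id"]
        threads[key].append(record)
    for records in threads.values():
        records.sort(key=lambda record: record["created_at_ms"])
    return dict(threads)


# Two hypothetical records, deliberately out of order.
SAMPLE = "\n".join([
    json.dumps({"post_id": "r1", "root_id": "root1", "thread_id": "root1",
                "created_at_ms": 3, "type": "thread_reply", "message": "reply"}),
    json.dumps({"post_id": "root1", "root_id": "", "thread_id": "root1",
                "created_at_ms": 2, "type": "channel_post", "message": "question"}),
])

threads = group_by_thread(SAMPLE)
print({key: [record["post_id"] for record in records] for key, records in threads.items()})
```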

## Safety rules

- The addon allowlists Mattermost hosts and `/api/v4` traffic only.
- Headers such as `Authorization`, `Cookie`, `Set-Cookie`, and CSRF are redacted in optional raw output.
- Optional raw output is disabled by default to prevent large files.
- Attachments are not downloaded by this mirror.
- The mirror is evidence, not canonical memory.
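
Header redaction of the kind described above can be sketched as a case-insensitive lookup against a small deny list. This is not the addon's code: `redact_headers` is a hypothetical helper, the placeholder text is arbitrary, and treating `X-CSRF-Token` as the CSRF header name is an assumption to verify against real traffic.

```python
# Lowercase names of headers whose values must never reach raw output.
# "x-csrf-token" is an assumed name for the CSRF header.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-csrf-token"}


def redact_headers(headers: dict) -> dict:
    """Return a copy of `headers` with sensitive values replaced by a placeholder."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


redacted = redact_headers({
    "Authorization": "Bearer abc123",
    "cookie": "MMAUTHTOKEN=abc123",
    "Accept": "application/json",
})
print(redacted)
```

Comparing lowercase names matters because HTTP header names are case-insensitive.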

## Useful environment variables

- `MATTERMOST_MIRROR_HOST_ALLOW`: exact host or parent domain to capture.
- `MATTERMOST_MIRROR_DIR`: output directory, default `ai/inbox/mattermost-mirror`.
- `MATTERMOST_MIRROR_LATEST_LIMIT`: number of messages in `latest.*`, default `200`.
- `MATTERMOST_MIRROR_CHANNEL_IDS`: optional comma-separated channel ID allowlist.
- `MATTERMOST_MIRROR_WRITE_RAW`: set to `1` to save compact raw REST/WebSocket evidence.
- `MATTERMOST_APP_PATH`: Mattermost Desktop `.app` bundle path.

## Troubleshooting

### TLS certificate warnings

Mitmproxy uses a persistent local CA under `~/.mitmproxy`. If the desktop app
asks about the certificate after every proxy restart, install and trust that CA
in macOS Keychain instead of approving it only in the app prompt:

1. Start `scripts/mattermost-proxy/run-mirror.sh`.
2. Open `http://mitm.it` from a browser on this Mac and download the macOS certificate.
3. Add it to Keychain Access and set it to **Always Trust**.
4. Restart Mattermost Desktop through `launch-mattermost.sh`.

Certificate warnings for unrelated hosts such as `releases.mattermost.com` or
OpenGraph preview hosts can be ignored; those hosts are not required for message
capture. The mirror only writes normalized messages from Mattermost `/api/v4`
REST/WebSocket payloads.

### Proxy logs show traffic but no `latest.md`

The mirror writes files only after it sees a post payload. Startup calls such as
`/api/v4/teams`, `/api/v4/users`, `/api/v4/files`, or WebSocket ping/ack events
do not create message files. Open a channel, open a thread, scroll slightly in
history, or wait for/send a new message. Then check:

```text
ai/inbox/mattermost-mirror/latest.md
ai/inbox/mattermost-mirror/channels/<channel-name>/YYYY/MM/YYYY-MM-DD.jsonl
ai/inbox/mattermost-mirror/by-date/YYYY/MM/YYYY-MM-DD.jsonl
```
26
scripts/mattermost-proxy/launch-mattermost.sh
Executable file
@@ -0,0 +1,26 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

if [ -f "$SCRIPT_DIR/.env" ]; then
  set -a
  # shellcheck source=/dev/null
  source "$SCRIPT_DIR/.env"
  set +a
fi

APP_PATH="${MATTERMOST_APP_PATH:-/Applications/Mattermost.app}"
PROXY_HOST="${MATTERMOST_MIRROR_LISTEN_HOST:-127.0.0.1}"
PROXY_PORT="${MATTERMOST_MIRROR_LISTEN_PORT:-8080}"

if [ ! -d "$APP_PATH" ]; then
  echo "Mattermost app bundle not found: $APP_PATH" >&2
  echo "Set MATTERMOST_APP_PATH in scripts/mattermost-proxy/.env if needed." >&2
  exit 1
fi

# Prefer macOS LaunchServices over invoking the Electron binary directly.
# Direct binary launch can crash sandboxed Electron apps with Mach rendezvous
# errors because their expected app/container parent process is missing.
exec open -n "$APP_PATH" --args --proxy-server="http://${PROXY_HOST}:${PROXY_PORT}"
514
scripts/mattermost-proxy/mattermost_mirror.py
Normal file
@@ -0,0 +1,514 @@
|
||||
"""mitmproxy addon for a local Mattermost Desktop mirror.
|
||||
|
||||
This addon is intentionally narrow:
|
||||
- allowlist a Mattermost host
|
||||
- inspect only /api/v4 REST and WebSocket traffic
|
||||
- redact secrets
|
||||
- normalize posts into date-rotated JSONL files for AI context
|
||||
|
||||
The output under ai/inbox/ is raw evidence, not canonical project memory.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import tempfile
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
from urllib.parse import urlparse
|
||||
|
||||
from mitmproxy import http
|
||||
|
||||
|
||||
DEFAULT_OUT_DIR = "ai/inbox/mattermost-mirror"
|
||||
POST_ID_RE = re.compile(r"^[a-z0-9]{26}$")
|
||||
SAFE_NAME_RE = re.compile(r"[^a-zA-Z0-9._-]+")
|
||||
|
||||
|
||||
def env_bool(name: str, default: bool = False) -> bool:
|
||||
raw = os.getenv(name)
|
||||
if raw is None:
|
||||
return default
|
||||
return raw.strip().lower() in {"1", "true", "yes", "on"}
|
||||
|
||||
|
||||
def split_csv(raw: str) -> set[str]:
|
||||
return {item.strip() for item in raw.replace("\n", ",").split(",") if item.strip()}
|
||||
|
||||
|
||||
class MattermostMirror:
    def __init__(self) -> None:
        self.out_dir = Path(os.getenv("MATTERMOST_MIRROR_DIR", DEFAULT_OUT_DIR)).resolve()
        self.host_allow = os.getenv("MATTERMOST_MIRROR_HOST_ALLOW", "").strip().lower()
        self.channel_allow = split_csv(os.getenv("MATTERMOST_MIRROR_CHANNEL_IDS", ""))
        self.latest_limit = int(os.getenv("MATTERMOST_MIRROR_LATEST_LIMIT", "200"))
        self.write_raw = env_bool("MATTERMOST_MIRROR_WRITE_RAW", default=False)

        self.channels_dir = self.out_dir / "channels"
        self.by_date_dir = self.out_dir / "by-date"
        self.threads_dir = self.out_dir / "threads"
        self.refs_dir = self.out_dir / "refs"
        self.raw_dir = self.out_dir / "raw"
        self.state_path = self.out_dir / "state.json"
        self.index_path = self.out_dir / "index.json"
        self.latest_jsonl_path = self.out_dir / "latest.jsonl"
        self.latest_md_path = self.out_dir / "latest.md"

        self.seen_post_ids: set[str] = set()
        self.seen_by_file: dict[Path, set[str]] = {}
        self.users: dict[str, str] = {}
        self.channels: dict[str, str] = {}
        self.channel_meta: dict[str, dict[str, Any]] = {}
        self.state: dict[str, Any] = {"channels": {}, "users": {}, "updated_at": None}

        self._ensure_dirs()
        self._load_state()
        self._load_recent_seen_ids()

    def _ensure_dirs(self) -> None:
        self.channels_dir.mkdir(parents=True, exist_ok=True)
        self.by_date_dir.mkdir(parents=True, exist_ok=True)
        self.threads_dir.mkdir(parents=True, exist_ok=True)
        self.refs_dir.mkdir(parents=True, exist_ok=True)
        self.raw_dir.mkdir(parents=True, exist_ok=True)

    def _load_state(self) -> None:
        if not self.state_path.exists():
            return
        try:
            self.state = json.loads(self.state_path.read_text(encoding="utf-8"))
            self.users = dict(self.state.get("users") or {})
            self.channel_meta = dict(self.state.get("channel_meta") or {})
            for channel_id, value in (self.state.get("channels") or {}).items():
                if isinstance(value, dict):
                    name = value.get("channel_name") or value.get("name")
                    if name:
                        self.channels[channel_id] = name
        except Exception:
            self.state = {"channels": {}, "users": {}, "updated_at": None}

    def _load_recent_seen_ids(self) -> None:
        # Bound startup work: latest.jsonl contains the hot dedupe window. Daily
        # files are loaded lazily when older/backfilled messages are encountered.
        today = datetime.now(timezone.utc)
        for path in [self.latest_jsonl_path, self._daily_by_date_path(today)]:
            if not path.exists():
                continue
            try:
                ids = self._load_seen_ids_for_file(path)
                self.seen_post_ids.update(ids)
            except Exception:
                continue

    def _load_seen_ids_for_file(self, path: Path) -> set[str]:
        if path in self.seen_by_file:
            return self.seen_by_file[path]
        ids: set[str] = set()
        if path.exists():
            try:
                with path.open("r", encoding="utf-8") as handle:
                    for line in handle:
                        if not line.strip():
                            continue
                        obj = json.loads(line)
                        post_id = obj.get("post_id")
                        if post_id:
                            ids.add(post_id)
            except Exception:
                ids = set()
        self.seen_by_file[path] = ids
        return ids

    def _atomic_write_text(self, path: Path, text: str) -> None:
        path.parent.mkdir(parents=True, exist_ok=True)
        with tempfile.NamedTemporaryFile("w", encoding="utf-8", dir=str(path.parent), delete=False) as tmp:
            tmp.write(text)
            tmp_path = Path(tmp.name)
        tmp_path.replace(path)

    def _append_jsonl(self, path: Path, obj: dict[str, Any]) -> None:
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(obj, ensure_ascii=False, sort_keys=True) + "\n")

    def _dt_from_ms(self, value: Any) -> datetime:
        try:
            ms = int(value)
            if ms > 0:
                return datetime.fromtimestamp(ms / 1000, timezone.utc)
        except Exception:
            pass
        return datetime.now(timezone.utc)

    def _safe_name(self, value: str | None, fallback: str = "unknown") -> str:
        raw = (value or fallback).strip() or fallback
        safe = SAFE_NAME_RE.sub("-", raw).strip("-._")
        return safe or fallback

    def _daily_channel_path(self, dt: datetime, channel_name: str | None, channel_id: str | None) -> Path:
        channel_slug = self._safe_name(channel_name or channel_id, fallback="unknown-channel")
        return self.channels_dir / channel_slug / f"{dt:%Y}" / f"{dt:%m}" / f"{dt:%Y-%m-%d}.jsonl"

    def _daily_by_date_path(self, dt: datetime) -> Path:
        return self.by_date_dir / f"{dt:%Y}" / f"{dt:%m}" / f"{dt:%Y-%m-%d}.jsonl"

    def _thread_path(self, thread_id: str | None) -> Path | None:
        if not thread_id:
            return None
        return self.threads_dir / f"{self._safe_name(thread_id)}.jsonl"

    def _daily_raw_path(self, dt: datetime, suffix: str) -> Path:
        return self.raw_dir / f"{dt:%Y}" / f"{dt:%m}" / f"{dt:%Y-%m-%d}-{suffix}.jsonl"

    def _safe_url(self, url: str) -> str:
        # Drop the query string and fragment; either may carry tokens or other secrets.
        parsed = urlparse(url)
        return parsed._replace(query="", fragment="").geturl()

    def _is_allowed_host(self, host: str) -> bool:
        host = host.lower()
        if self.host_allow:
            return host == self.host_allow or host.endswith(f".{self.host_allow}")
        # The launched Mattermost Desktop app is already scoped to this proxy.
        # Some company hosts do not include "mattermost" in the hostname
        # (for example, mm.example.com), so default to allowing the proxied
        # app's /api/v4 traffic when no explicit host allowlist is configured.
        return True

    def _is_allowed_channel(self, channel_id: str | None) -> bool:
        if not self.channel_allow:
            return True
        return bool(channel_id and channel_id in self.channel_allow)

    def _capture_flow(self, flow: http.HTTPFlow) -> bool:
        return self._is_allowed_host(flow.request.pretty_host) and "/api/v4/" in flow.request.path

    def _redact_headers(self, headers: Any) -> dict[str, str]:
        redacted: dict[str, str] = {}
        for key, value in headers.items():
            lowered = key.lower()
            if lowered in {"authorization", "cookie", "set-cookie", "x-csrf-token"}:
                redacted[key] = "[REDACTED]"
            else:
                redacted[key] = str(value)
        return redacted

    def _remember_user(self, user: dict[str, Any]) -> None:
        user_id = user.get("id")
        if not user_id:
            return
        username = user.get("username") or user.get("nickname") or user.get("first_name") or user_id
        self.users[user_id] = username
        self._write_refs()

    def _remember_channel(self, channel: dict[str, Any]) -> None:
        channel_id = channel.get("id")
        if not channel_id:
            return
        self.channel_meta[channel_id] = channel
        name = self._channel_label(channel)
        self.channels[channel_id] = name
        self._write_refs()

    def _user_label(self, user_id: str | None) -> str | None:
        if not user_id:
            return None
        return self.users.get(user_id) or user_id

    def _channel_label(self, channel: dict[str, Any]) -> str:
        channel_id = channel.get("id") or "unknown-channel"
        channel_type = channel.get("type")
        display_name = (channel.get("display_name") or "").strip()
        name = (channel.get("name") or "").strip()

        if channel_type == "D":
            user_ids = [item for item in name.split("__") if item]
            labels = [self._user_label(user_id) or user_id for user_id in user_ids]
            if labels:
                return "dm-" + "--".join(labels)

        if channel_type == "G":
            if display_name:
                return "group-" + display_name
            user_ids = [item for item in name.split("__") if item]
            labels = [self._user_label(user_id) or user_id for user_id in user_ids]
            if labels:
                return "group-" + "--".join(labels)

        return display_name or name or channel_id

    def _refresh_channel_labels(self) -> None:
        changed = False
        for channel_id, meta in self.channel_meta.items():
            label = self._channel_label(meta)
            if label and self.channels.get(channel_id) != label:
                self.channels[channel_id] = label
                changed = True
        if changed:
            self._write_refs()

    def _write_refs(self) -> None:
        users_path = self.refs_dir / "users.json"
        channels_path = self.refs_dir / "channels.json"
        self._atomic_write_text(users_path, json.dumps(self.users, ensure_ascii=False, indent=2, sort_keys=True) + "\n")
        self._atomic_write_text(channels_path, json.dumps(self.channels, ensure_ascii=False, indent=2, sort_keys=True) + "\n")

    def _ingest_reference_payload(self, payload: Any) -> None:
        if isinstance(payload, list):
            for item in payload:
                self._ingest_reference_payload(item)
            return
        if not isinstance(payload, dict):
            return

        if payload.get("id") and ("username" in payload or "first_name" in payload):
            self._remember_user(payload)
        if payload.get("id") and ("display_name" in payload or "team_id" in payload) and "type" in payload:
            self._remember_channel(payload)

        users = payload.get("users")
        if isinstance(users, dict):
            for user in users.values():
                if isinstance(user, dict):
                    self._remember_user(user)
        elif isinstance(users, list):
            for user in users:
                if isinstance(user, dict):
                    self._remember_user(user)

        channels = payload.get("channels")
        if isinstance(channels, list):
            for channel in channels:
                if isinstance(channel, dict):
                    self._remember_channel(channel)

        self._refresh_channel_labels()

    def _normalize_post(self, post: dict[str, Any], source: str, raw_event: str | None = None) -> dict[str, Any] | None:
        post_id = post.get("id")
        channel_id = post.get("channel_id")
        if not post_id or not POST_ID_RE.match(str(post_id)):
            return None
        if not self._is_allowed_channel(channel_id):
            return None

        created_dt = self._dt_from_ms(post.get("create_at"))
        root_id = post.get("root_id") or None
        user_id = post.get("user_id") or None
        message = post.get("message") or ""
        message_type = "thread_reply" if root_id else "channel_post"

        return {
            "source": source,
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "created_at": created_dt.isoformat(),
            "created_at_ms": int(post.get("create_at") or created_dt.timestamp() * 1000),
            "updated_at_ms": int(post.get("update_at") or 0),
            "channel_id": channel_id,
            "channel_name": self.channels.get(channel_id) if channel_id else None,
            "post_id": post_id,
            "root_id": root_id,
            "thread_id": root_id or post_id,
            "user_id": user_id,
            "username": self.users.get(user_id) if user_id else None,
            "message": message,
            "type": message_type,
            "raw_event": raw_event,
            "props": post.get("props") or {},
        }

    def _write_message(self, msg: dict[str, Any]) -> None:
        post_id = msg["post_id"]
        created_dt = self._dt_from_ms(msg.get("created_at_ms"))
        channel_path = self._daily_channel_path(created_dt, msg.get("channel_name"), msg.get("channel_id"))
        by_date_path = self._daily_by_date_path(created_dt)
        thread_path = self._thread_path(msg.get("thread_id"))
        channel_seen = self._load_seen_ids_for_file(channel_path)
        by_date_seen = self._load_seen_ids_for_file(by_date_path)
        if post_id in self.seen_post_ids or post_id in channel_seen or post_id in by_date_seen:
            return

        self.seen_post_ids.add(post_id)
        channel_seen.add(post_id)
        by_date_seen.add(post_id)
        self._append_jsonl(channel_path, msg)
        self._append_jsonl(by_date_path, msg)
        if thread_path:
            thread_seen = self._load_seen_ids_for_file(thread_path)
            if post_id not in thread_seen:
                thread_seen.add(post_id)
                self._append_jsonl(thread_path, msg)
        self._update_state(msg)
        self._update_latest(msg)
        self._update_index(created_dt, msg)

    def _update_state(self, msg: dict[str, Any]) -> None:
        channel_id = msg.get("channel_id") or "unknown"
        channels = self.state.setdefault("channels", {})
        entry = channels.setdefault(channel_id, {})
        if msg.get("channel_name"):
            entry["channel_name"] = msg.get("channel_name")
        # Advance the high-water mark and its post ID together, so a backfilled
        # older post cannot overwrite last_seen_post_id.
        created_ms = int(msg.get("created_at_ms") or 0)
        if created_ms >= int(entry.get("last_seen_create_at") or 0):
            entry["last_seen_create_at"] = created_ms
            entry["last_seen_post_id"] = msg.get("post_id")
        self.state["users"] = self.users
        self.state["channel_meta"] = self.channel_meta
        self.state["updated_at"] = datetime.now(timezone.utc).isoformat()
        self._atomic_write_text(self.state_path, json.dumps(self.state, ensure_ascii=False, indent=2, sort_keys=True) + "\n")
        self._write_refs()

    def _read_jsonl(self, path: Path) -> list[dict[str, Any]]:
        if not path.exists():
            return []
        records: list[dict[str, Any]] = []
        try:
            with path.open("r", encoding="utf-8") as handle:
                for line in handle:
                    if line.strip():
                        records.append(json.loads(line))
        except Exception:
            return []
        return records

    def _update_latest(self, msg: dict[str, Any]) -> None:
        records = self._read_jsonl(self.latest_jsonl_path)
        by_id: dict[str, dict[str, Any]] = {item.get("post_id"): item for item in records if item.get("post_id")}
        by_id[msg["post_id"]] = msg
        latest = sorted(by_id.values(), key=lambda item: int(item.get("created_at_ms") or 0))[-self.latest_limit :]
        jsonl = "".join(json.dumps(item, ensure_ascii=False, sort_keys=True) + "\n" for item in latest)
        self._atomic_write_text(self.latest_jsonl_path, jsonl)
        self._atomic_write_text(self.latest_md_path, self._render_latest_md(latest))

    def _render_latest_md(self, records: list[dict[str, Any]]) -> str:
        lines = ["# Latest Mattermost Mirror", "", "Generated from local proxy mirror evidence.", ""]
        current_channel = None
        for item in records:
            channel = item.get("channel_name") or item.get("channel_id") or "unknown-channel"
            if channel != current_channel:
                lines.extend([f"## {channel}", ""])
                current_channel = channel
            author = item.get("username") or item.get("user_id") or "unknown-user"
            created = item.get("created_at") or "unknown-time"
            prefix = "reply" if item.get("type") == "thread_reply" else "post"
            text = (item.get("message") or "").strip()
            lines.append(f"- {created} {author} ({prefix} `{item.get('post_id')}`): {text}")
        lines.append("")
        return "\n".join(lines)

    def _update_index(self, dt: datetime, msg: dict[str, Any]) -> None:
        index: dict[str, Any] = {"dates": [], "channels": {}, "updated_at": None}
        if self.index_path.exists():
            try:
                index = json.loads(self.index_path.read_text(encoding="utf-8"))
            except Exception:
                pass
        date_key = f"{dt:%Y-%m-%d}"
        channel_path = self._daily_channel_path(dt, msg.get("channel_name"), msg.get("channel_id"))
        by_date_path = self._daily_by_date_path(dt)
        thread_path = self._thread_path(msg.get("thread_id"))
        channel_rel_path = str(channel_path.relative_to(self.out_dir))
        by_date_rel_path = str(by_date_path.relative_to(self.out_dir))
        dates = set(index.get("dates") or [])
        dates.add(date_key)
        index["dates"] = sorted(dates)
        by_date = index.setdefault("by_date", {})
        by_date[date_key] = by_date_rel_path

        channel_key = msg.get("channel_name") or msg.get("channel_id") or "unknown-channel"
        channels = index.setdefault("channels", {})
        channel_entry = channels.setdefault(channel_key, {"channel_id": msg.get("channel_id"), "files": []})
        channel_entry["channel_id"] = msg.get("channel_id")
        files = set(channel_entry.get("files") or [])
        files.add(channel_rel_path)
        channel_entry["files"] = sorted(files)
        if thread_path:
            threads = index.setdefault("threads", {})
            threads[msg.get("thread_id")] = str(thread_path.relative_to(self.out_dir))
        index["updated_at"] = datetime.now(timezone.utc).isoformat()
        self._atomic_write_text(self.index_path, json.dumps(index, ensure_ascii=False, indent=2, sort_keys=True) + "\n")

    def _write_raw(self, suffix: str, obj: dict[str, Any]) -> None:
        if not self.write_raw:
            return
        self._append_jsonl(self._daily_raw_path(datetime.now(timezone.utc), suffix), obj)

    def response(self, flow: http.HTTPFlow) -> None:
        if not self._capture_flow(flow) or not flow.response:
            return
        content_type = flow.response.headers.get("content-type", "")
        if "json" not in content_type:
            return
        try:
            payload = flow.response.json()
        except Exception:
            return

        self._ingest_reference_payload(payload)

        path = flow.request.path
        raw_record = {
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "method": flow.request.method,
            "url": self._safe_url(flow.request.pretty_url),
            "path": path,
            "status_code": flow.response.status_code,
            "request_headers": self._redact_headers(flow.request.headers),
            "response": payload,
        }
        self._write_raw("rest-flows", raw_record)

        # Mattermost post-list shape: { order: [...], posts: {post_id: {...}} }
        if isinstance(payload, dict) and isinstance(payload.get("posts"), dict):
            for post in payload["posts"].values():
                if isinstance(post, dict):
                    normalized = self._normalize_post(post, source="rest", raw_event="posts")
                    if normalized:
                        self._write_message(normalized)
        elif isinstance(payload, dict) and payload.get("id") and payload.get("message") is not None:
            normalized = self._normalize_post(payload, source="rest", raw_event="post")
            if normalized:
                self._write_message(normalized)

    def websocket_message(self, flow: http.HTTPFlow) -> None:
        if not self._is_allowed_host(flow.request.pretty_host):
            return
        if "/api/v4/websocket" not in flow.request.path:
            return
        if not flow.websocket or not flow.websocket.messages:
            return
        message = flow.websocket.messages[-1]
        if message.from_client:
            return
        try:
            text = message.content.decode("utf-8") if isinstance(message.content, bytes) else str(message.content)
            payload = json.loads(text)
        except Exception:
            return
        # Ignore frames that decode to something other than a JSON object.
        if not isinstance(payload, dict):
            return

        self._write_raw("websocket", {
            "captured_at": datetime.now(timezone.utc).isoformat(),
            "url": self._safe_url(flow.request.pretty_url),
            "event": payload.get("event"),
            "seq": payload.get("seq"),
            "data": payload.get("data"),
            "broadcast": payload.get("broadcast"),
        })

        event = payload.get("event")
        if event != "posted":
            return
        data = payload.get("data") or {}
        post_raw = data.get("post")
        if not post_raw:
            return
        try:
            # The "post" field is itself a JSON-encoded string, so decode it again.
            post = json.loads(post_raw)
        except Exception:
            return
        normalized = self._normalize_post(post, source="websocket", raw_event=event)
        if normalized:
            self._write_message(normalized)


addons = [MattermostMirror()]
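Review note: the `websocket_message` handler above decodes each frame twice, because the handler treats the `post` field of a `posted` event as a JSON-encoded string rather than a nested object. The sketch below isolates that step so it can be exercised without mitmproxy. The function name `extract_posted_post` and the sample frame are hypothetical and not part of this change; the frame only mimics the shape the handler expects.

```python
from __future__ import annotations

import json
from typing import Any


def extract_posted_post(frame_text: str) -> dict[str, Any] | None:
    """Return the post dict from a "posted" WebSocket frame, or None."""
    try:
        payload = json.loads(frame_text)  # first decode: the frame envelope
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or payload.get("event") != "posted":
        return None
    data = payload.get("data")
    post_raw = data.get("post") if isinstance(data, dict) else None
    if not isinstance(post_raw, str):
        return None
    try:
        post = json.loads(post_raw)  # second decode: the embedded post string
    except json.JSONDecodeError:
        return None
    return post if isinstance(post, dict) else None


# Hypothetical sample frame, shaped like the events the handler processes.
sample_post = {"id": "a" * 26, "channel_id": "c" * 26, "message": "hello"}
sample_frame = json.dumps({"event": "posted", "data": {"post": json.dumps(sample_post)}})
```

Calling `extract_posted_post(sample_frame)` returns the inner post dict, while non-`posted` events and malformed frames return `None`, which matches the early returns in the handler.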
30
scripts/mattermost-proxy/run-mirror.sh
Executable file
@@ -0,0 +1,30 @@
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
WORKSPACE_ROOT="$(cd "$SCRIPT_DIR/../.." && pwd)"

if [ -f "$SCRIPT_DIR/.env" ]; then
  set -a
  # shellcheck source=/dev/null
  source "$SCRIPT_DIR/.env"
  set +a
fi

export MATTERMOST_MIRROR_DIR="${MATTERMOST_MIRROR_DIR:-$WORKSPACE_ROOT/ai/inbox/mattermost-mirror}"
export MATTERMOST_MIRROR_LISTEN_HOST="${MATTERMOST_MIRROR_LISTEN_HOST:-127.0.0.1}"
export MATTERMOST_MIRROR_LISTEN_PORT="${MATTERMOST_MIRROR_LISTEN_PORT:-8080}"

mkdir -p "$MATTERMOST_MIRROR_DIR"

echo "Mattermost proxy mirror output: $MATTERMOST_MIRROR_DIR"
echo "Listening on ${MATTERMOST_MIRROR_LISTEN_HOST}:${MATTERMOST_MIRROR_LISTEN_PORT}"
echo "Launch Mattermost Desktop with: scripts/mattermost-proxy/launch-mattermost.sh"
if [ -z "${MATTERMOST_MIRROR_HOST_ALLOW:-}" ]; then
  echo "MATTERMOST_MIRROR_HOST_ALLOW is not set; capturing /api/v4 traffic from the proxied app."
fi

exec mitmdump \
  --listen-host "$MATTERMOST_MIRROR_LISTEN_HOST" \
  --listen-port "$MATTERMOST_MIRROR_LISTEN_PORT" \
  -s "$SCRIPT_DIR/mattermost_mirror.py"
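Review note: once the mirror is running, a consumer will probably want to read the JSONL files under `$MATTERMOST_MIRROR_DIR`. The following is a minimal sketch of such a reader, assuming the record fields written by `_normalize_post` (`channel_name`, `channel_id`, `created_at_ms`, `message`). The helper names `read_mirror_jsonl` and `messages_by_channel` are hypothetical and not part of this change.

```python
from __future__ import annotations

import json
from collections import defaultdict
from pathlib import Path
from typing import Any


def read_mirror_jsonl(path: Path) -> list[dict[str, Any]]:
    """Read one mirror JSONL file, skipping blank or malformed lines."""
    records: list[dict[str, Any]] = []
    if not path.exists():
        return records
    with path.open("r", encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate a partially written or corrupt line
            if isinstance(obj, dict):
                records.append(obj)
    return records


def messages_by_channel(records: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group message texts by channel label, oldest first."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for item in sorted(records, key=lambda r: int(r.get("created_at_ms") or 0)):
        channel = item.get("channel_name") or item.get("channel_id") or "unknown-channel"
        grouped[channel].append(item.get("message") or "")
    return dict(grouped)
```

For example, `messages_by_channel(read_mirror_jsonl(Path("ai/inbox/mattermost-mirror/latest.jsonl")))` would give a per-channel view of the hot window. Unlike the addon's `_read_jsonl`, which discards the whole file on a bad line, this reader skips only the bad line; that is a design choice of the sketch, not of the change under review.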