# Charles Session File Format (.chlsx)

Reference for AI agents that need to parse Charles Proxy session files.

---

## File Format

`.chlsx` is a **ZIP archive** containing numbered XML files, each representing one HTTP request/response pair.

### Structure inside the ZIP

```
session.chlsx
├── 00001.xml
├── 00002.xml
├── 00003.xml
├── ...
└── 00/
    ├── 00001.xml
    ├── 00002.xml
    └── ...
```

Files may be flat at the root or grouped in two-digit subdirectories (`00/`, `01/`, etc.) depending on session size.

### XML Structure Per File

Each XML file contains:

- **Request**: method, URL, protocol, headers, body
- **Response**: status, protocol, headers, body
- **Timing**: start time, duration

Key XML elements:

```xml
GET https://discourse.example.com/t/123.json HTTP/1.1
application/json
_t=abc123
200 HTTP/1.1
application/json; charset=utf-8
{"id": 123, "title": "...", "post_stream": {...}}
2026-01-15T10:30:00.000Z 450
```

### .chls vs .chlsx vs .chlsj

| Extension | Format | Notes |
|---|---|---|
| `.chls` | Binary | Legacy format, harder to parse |
| `.chlsx` | ZIP + XML | **Prefer this**. Most common modern format |
| `.chlsj` | JSON | Newer, less common; each session is one JSON file with an array of request/response objects |

**Recommendation**: Configure Charles to save as `.chlsx` (File → Save Session As... → choose `.chlsx`).

---

## Discourse API Endpoints to Look For

These are the endpoints worth extracting from a Charles session:

| Purpose | URL pattern | Parsing target |
|---|---|---|
| Topic feed | `/latest.json` | `topic_list.topics[]` |
| Category topics | `/c/{slug}.json` | `topic_list.topics[]` |
| Single topic | `/t/{id}.json` | The full topic with posts |
| Posts in topic | `/t/{id}/{page}.json` | Paginated posts |
| Search | `/search.json?q=...` | `topics[]`, `posts[]` |
| User activity | `/u/{username}/activity.json` | User posts/topics |

---

## Extraction Strategy for AI

1. **Open the `.chlsx` as a ZIP** (it is not encrypted)
2. **Iterate over all XML files** inside, both at the root and in subdirectories
3. For each XML file, check whether the request URL matches a Discourse API endpoint
4. Extract the JSON response body from the response body element
5. Parse the JSON and convert it to Markdown
6. Organize by topic ID + title for easy search

---

## Common Pitfalls

- Some responses are paginated (`/t/{id}.json?page=1`). Collect all pages for completeness.
- Binary responses (images, JS bundles) should be skipped.
- The same topic may appear multiple times in different Charles sessions; deduplicate by topic ID + last updated timestamp.
- Session cookies captured in Charles will most likely have expired by the time the AI reads them; only the response data matters.
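
The extraction strategy above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive parser: it assumes the ZIP layout described in this document, and because the exact XML element names are not given here, it does not rely on them. Instead it walks every text node, takes the first one that contains an `http(s)` URL as the request URL, and takes the last one that parses as a JSON object or array as the response body. The function names (`classify_url`, `extract_pair`, `collect_discourse_responses`) and the endpoint labels are hypothetical, introduced only for illustration.

```python
import json
import re
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# URL patterns taken from the endpoint table; the labels are hypothetical.
ENDPOINTS = [
    ("latest", re.compile(r"^/latest\.json$")),
    ("category", re.compile(r"^/c/.+\.json$")),
    ("topic_page", re.compile(r"^/t/\d+/\d+\.json$")),
    ("topic", re.compile(r"^/t/\d+\.json$")),
    ("search", re.compile(r"^/search\.json$")),
    ("user_activity", re.compile(r"^/u/[^/]+/activity\.json$")),
]

URL_RE = re.compile(r"https?://\S+")


def classify_url(url):
    """Return the endpoint label for a URL, or None if it is not of interest."""
    path = urlsplit(url).path  # drops the query string, e.g. ?q=... or ?page=1
    for label, pattern in ENDPOINTS:
        if pattern.match(path):
            return label
    return None


def extract_pair(xml_bytes):
    """Heuristically pull (url, parsed_json) out of one XML file.

    Assumption: the URL and the body appear as element text, not attributes.
    Returns None for unparseable XML or for files without a JSON body,
    which also skips binary responses such as images.
    """
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        return None
    url, body = None, None
    for text in root.itertext():
        text = text.strip()
        if not text:
            continue
        if url is None:
            match = URL_RE.search(text)
            if match:
                url = match.group(0)
                continue
        if text[0] in "{[":
            try:
                body = json.loads(text)  # keep the last JSON-looking node
            except ValueError:
                pass
    if url is None or body is None:
        return None
    return url, body


def collect_discourse_responses(source):
    """Scan a .chlsx (path or file object) and return matching responses."""
    results = []
    with zipfile.ZipFile(source) as zf:
        # namelist() includes members in subdirectories such as 00/00001.xml
        names = sorted(n for n in zf.namelist() if n.lower().endswith(".xml"))
        for name in names:
            pair = extract_pair(zf.read(name))
            if pair is None:
                continue
            url, data = pair
            label = classify_url(url)
            if label is not None:
                results.append(
                    {"file": name, "endpoint": label, "url": url, "json": data}
                )
    return results
```

Calling `collect_discourse_responses("session.chlsx")` would return a list of dictionaries, one per matching file, which can then be grouped by topic ID and converted to Markdown. If the real element names are known, replacing the text-node heuristic in `extract_pair` with explicit element lookups would be more reliable.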