Your private AI notebook.
Research, draft, and organize across everything you know — your notes, PDFs, audio, and the live web — all on local models that never leave your Mac.

One workspace, ten core workflows.
A chat-first home for everything in your vault.
Open Canto and you land in a chat composer that speaks fluent vault. Ask anything across your notes, PDFs, audio, video, and the web. Streaming answers come back with citations, scope controls, and a timeline rail so you always know what the model looked at.
- Welcome screen is now a chat — no blank-note tax
- Scope to all notes, a folder, a single note, or attached files
- Citations [1][2] resolve to the exact source line you asked about
- Works the same on a 2.1 GB model as on a 77 GB flagship
Eleven built-in models, plus room to bring your own.
A curated Unsloth GGUF lineup with Granite 4.1 3B as the default, new Gemma 4 and vision-capable Qwen options in the middle, and a 122B flagship MoE at the top end. Choose by size, run via Metal, and connect your own through Ollama or LM Studio when you need to.
- Granite 4.1 3B (2.1 GB): Default model for tool use, RAG, and quick rewrites. Any 8 GB Mac
- Qwen 3.5 4B (2.9 GB): Everyday writing, coding, tool calls + vision. 8 GB Mac · smooth on 16 GB
- Gemma 4 E4B (5.1 GB): Compact reasoning, tools, and image input. 12 GB Mac · daily on 16 GB+
- Qwen 3.5 9B (6.0 GB): Sharper reasoning, planning, and vision. 16 GB Mac · daily on 24 GB+
- Gemma 4 12B Unified (7.4 GB): Stronger reasoning, 256K context, vision. 16 GB Mac · smooth on 24 GB+
- GPT-OSS 20B (11.9 GB): Tool-first reasoning and structured outputs. 24 GB Mac · smooth on 32 GB+
“Comfortable on” assumes the model is your main workload. If you're running heavy apps alongside (browser with many tabs, Xcode, Logic), pick one tier down — Canto warns you in-app when free RAM is too low.
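As a rough illustration of the sizing guidance above, the sketch below picks the largest listed model whose minimum RAM fits a given Mac, then steps one entry down when heavy apps are running alongside. The `Model` class and `suggest_model` helper are hypothetical names invented for this example, and reading "one tier down" as "one entry down the list" is an assumption. It is not how Canto itself chooses or warns about models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    size_gb: float   # download size quoted in the list above
    min_ram_gb: int  # first RAM figure quoted for the model


# Ordered smallest to largest; figures copied from the list above.
LINEUP = [
    Model("Granite 4.1 3B", 2.1, 8),
    Model("Qwen 3.5 4B", 2.9, 8),
    Model("Gemma 4 E4B", 5.1, 12),
    Model("Qwen 3.5 9B", 6.0, 16),
    Model("Gemma 4 12B Unified", 7.4, 16),
    Model("GPT-OSS 20B", 11.9, 24),
]


def suggest_model(ram_gb: int, heavy_apps: bool = False) -> Optional[Model]:
    """Hypothetical helper: largest listed model that fits in ram_gb.

    With heavy_apps=True, step one entry down the list (an assumed
    reading of "pick one tier down"). Returns None if nothing fits.
    """
    fitting = [m for m in LINEUP if m.min_ram_gb <= ram_gb]
    if not fitting:
        return None
    index = len(fitting) - 1
    if heavy_apps and index > 0:
        index -= 1
    return fitting[index]


if __name__ == "__main__":
    print(suggest_model(16).name)                   # Gemma 4 12B Unified
    print(suggest_model(16, heavy_apps=True).name)  # Qwen 3.5 9B
```

The lookup only covers the six models listed here; the larger builds would need their own entries.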
Safe Mode is a degraded but usable launch path for low-memory or post-crash boots. Canto skips the chat model, vision worker, voice transcription, and link search so the editor and your notes always come up. One click brings AI back online, and if your last model is too big, Canto suggests a smaller one you already have.
How Canto compares — and what stays on your Mac.
| Feature | Canto | Notion AI | NotebookLM | Obsidian |
|---|---|---|---|---|
| Chat-first AI home | ● Canto Chat: vault, web, files | ◐ AI sidebar in pages | ● Chat with sources (cloud) | ◐ Community plugins |
| Private, on-device AI | ● 11 local + vision models | ◐ Cloud AI only | ◐ Cloud Gemini only | ◐ Plugins / BYO models |
| Planned, cited deep research | ● Plan → sources → cited doc | ◐ AI search, no planned runs | ◐ Cloud chat with sources | ◐ Community plugins |
| Seamless chat ↔ editor handoff | ● Citation → companion pane | ◐ Inline AI, no chat handoff | ○ Read-only sources | ◐ Plugins |
| Agent edits your workspace | ● Notes + notebooks, diff review | ◐ Cloud agent (Business+) | ○ Read-only chat | ◐ Community plugins |
| AI organizes & files your vault | ● Inbox Mode: review-first actions | ◐ Manual / AI search | ○ No vault organizing | ◐ Community plugins |
| Real-time encrypted multi-Mac sync | ● iCloud Drive, full vault | ◐ Cloud sync, offline pages | ◐ Google account (cloud) | ● Paid E2E Sync |
| Notes + executable notebooks | ● Rich notes + Python/JS/TS | ◐ Code blocks only | ○ No code execution | ◐ Plugins / scripts |
| Chat with PDFs, audio, video, books | ● Local, cited, multimodal + EPUB | ◐ Cloud AI on uploads | ● Cloud + audio overviews | ◐ Plugins |
| Voice transcription | ● Native Whisper + Metal | ◐ Cloud Meeting Notes | ◐ Audio overviews (output) | ◐ Plugins |
| Semantic links and graph | ● Memory Links + graph | ◐ Databases / AI search | ○ No graph or backlinks | ● Graph; AI via plugins |
| MCP / external AI access | ● Built-in MCP server | ◐ Enterprise MCP | ○ No MCP | ◐ Community plugins |
| Offline AI workflows | ● After model download | ◐ Offline pages, no AI | ○ Cloud-only | ◐ Plugins / local setup |
| Pricing | One-time $29.99 | $20 / user / mo | Free + Plus $19.99 / mo | Free; paid sync |
Free for note-taking. Pay once for unlimited AI.
Just download.
Download Free · v0.8.10
- Unlimited notes & nested folders
- WikiLinks + Backlinks
- Code Notebooks (Python, JS, TS)
- Real-time, full-vault iCloud Drive Sync
- Sync Doctor repair + clean-slate reset
- Encrypted exports & backups
- Split panes & daily notes
- Mermaid diagrams + LaTeX math
- Knowledge graph
- Full-text search
- Safe Mode + Low Memory Mode for tight Macs
- MCP Server — connect Claude, Cursor, Windsurf & more
- Send Diagnostics — redacted, reviewable support bundles
- 25 free AI queries to try
1 device · Lifetime updates
- Everything in Free, plus
- Unlimited AI queries (no quotas)
- Canto Chat — chat-first home for your vault, web & attachments
- Canto Research — planned, cited deep research saved as a real document
- Research follow-ups — turn open questions into a checklist Canto runs
- Inbox Mode — let Canto file, tag, link & cite your vault with approval-gated actions
- Create notes & notebooks from chat, straight into the folder you choose
- Seamless handoff — citations open the source note in a companion pane
- Agent Chat — multi-step tool calls with diff review
- Edit the open note from chat with a live preview before you approve
- Multimodal attachments — PDFs, audio, video, images & EPUB books with citations
- Vision on local models (vision-capable Qwen + Gemma 4)
- 11 built-in local models, from 2.1 GB to 77 GB
- External endpoints — Ollama, LM Studio, or any OpenAI-compatible API
- @ mentions for notes, cells, attachments, sessions & selections
- Vault Manager for bulk organization
- Memory Links + Related Notes + semantic search
Frequently asked questions
- Canto Chat is the new chat-first home introduced in v0.8.0. When you open Canto, you land in a chat composer that can answer across your entire vault, attached PDFs, audio, video, and the web — streaming back citations you can click. It pairs with Seamless Handoff: every citation [1][2] opens the source note in a companion pane, and you can promote any chat answer into a fresh draft note in one keystroke. Think of it as a private NotebookLM that runs on your Mac, with full read/write into your notes when you want it.
- Three surfaces, one workspace. Canto Chat is for asking questions and exploring — your chat-first home. Notes are for writing and thinking — essays, research, knowledge bases with inline AI assistance. Notebooks are for code — Python, JavaScript, and TypeScript with instant execution and inline output. They share state: chat answers can hand off into notes, notes can be cited back inside chat, and Memory Links connect everything by meaning.
- Canto Research is a dedicated Research Mode inside Canto Chat, introduced in v0.8.3 and expanded in v0.8.4. Flip the Chat / Research switch and a normal question becomes a guided run: choose your sources (vault notes, Library files, the web, or a mix), pick how deep to go, and review an editable plan before any deep web work spends a credit. As it runs, a live source trail shows exactly what Canto is reading from your vault, Library, and the web. When it finishes, you save the result as a research note, brief, comparison matrix, or notebook — with a real table of contents, clean clickable citations, and a follow-up checklist Canto can work through one task at a time.
- Canto runs a native Python environment locally on your Mac — no WebAssembly, no cloud. Popular packages like numpy, pandas, and matplotlib install automatically when you import them. Variables are shared across Python, JavaScript, and TypeScript cells in the same notebook.
- Memory Links is Canto’s semantic linking system that automatically finds related notes as you write. Unlike manual tagging or keyword search, it uses AI embeddings to understand meaning — surfacing relevant insights from thousands of notes instantly, including code examples from your notebooks.
- Yes. After the first model download, Canto works completely offline. On planes, trains, or anywhere without internet — unlimited AI assistance, Memory Links, and semantic search are always available.
- Canto ships with 11 built-in local models, all Unsloth-hosted GGUF builds: Granite 4.1 3B (2.1 GB, default), Qwen 3.5 4B (2.9 GB), Gemma 4 E4B (5.1 GB), Qwen 3.5 9B (6.0 GB), Gemma 4 12B Unified (7.4 GB), GPT-OSS 20B (11.9 GB), GLM-4.7 Flash (17.5 GB, 30B MoE), Qwen 3.6 27B (17.6 GB), Qwen 3.6 35B A3B (22.4 GB), GPT-OSS 120B A5B (63 GB), and Qwen 3.5 122B A10B (77 GB). All support tool calling and run locally via Metal GPU acceleration, and the Qwen 3.5/3.6 and Gemma 4 builds add direct image (vision) input when their projector is downloaded. Larger multi-file models download in shards in parallel and resume cleanly if you cancel midway. You can also connect Ollama or LM Studio as external endpoints to use any additional model you’ve downloaded — including cloud-hosted providers like OpenAI and Anthropic through Ollama.
- Absolutely. Your notes, code, and Memory Links embeddings are stored locally in an AES-256 encrypted SQLite database. When you use iCloud Drive Sync, Canto writes encrypted sync data to your own iCloud Drive folder using a Sync Passphrase you choose. Canto does not upload readable note content to a Canto server.
- macOS 14 (Sonoma) or later. Requires Apple Silicon (M1 or later) for Metal GPU inference. 8 GB RAM is the minimum — enough for the default Granite 4.1 3B (2.1 GB) and Qwen 3.5 4B (2.9 GB). Enable Low Memory Mode in Settings to attempt larger models on smaller machines (may reduce context or cause instability). Recommended: 12–16 GB for Gemma 4 E4B and Qwen 3.5 9B, 16–24 GB for Gemma 4 12B and GPT-OSS 20B, 24–32 GB for GLM-4.7 Flash and Qwen 3.6 27B, 32–48 GB for Qwen 3.6 35B A3B, and 96 GB+ for the GPT-OSS 120B and Qwen 3.5 122B flagship MoE builds. Canto also cold-starts local models on demand, so the app stays light at launch even with a large model selected. Models download once and cache locally.
- Yes. Canto ships a built-in MCP Server that lets Claude Desktop, Cursor, Claude Code, Windsurf, OpenClaw, and any other Model Context Protocol client read and write your vault locally. Toggle it on in Settings → MCP Server. The server only accepts connections from your own Mac, an optional bearer token adds a second lock, and tools are split into always-on vault tools plus a separate opt-in automation dispatcher for scripted workflows. Per-tool approvals from your client still apply, so you stay in the loop.
- Yes. Canto includes a Safe Mode launch path designed for tight machines. If free memory is low at boot, or if the previous launch crashed before the window finished loading, Canto starts in a stripped-down mode where your notes, editor, search, and UI still work normally. AI extras are deferred until you say go, and Canto can suggest a smaller model if your last one was too large.
- The Memory dashboard and status bar aggregate Canto’s related processes, including heavier helpers like vision workers. The Memory modal can also break the total down by process so you can see which subsystem is using the most RAM. Canto also includes a “Reclaim now” button in Settings → Memory that drops idle weights and frees RAM on demand without restarting the app.
- Open Help → Send Diagnostics (also available from Settings → About). Canto assembles a redacted JSON bundle covering memory, AI runtime, updater state, database health, recent crash metadata, and (optionally) a screenshot and recent log excerpts. You see the full payload — with secrets and personal content scrubbed — before anything leaves your Mac, and can either save it locally for your own records or upload it directly to the LonelyDuck diagnostics endpoint. Renderer and helper-process crashes are captured locally so the next report can explain what actually happened.
- Canto is a one-time $29.99 purchase — not a subscription. The license activates on 1 Mac and includes lifetime updates within the current major version. Canto v0.8.0 replaces what people typically pay $120–$240 per year for across Notion AI, NotebookLM Plus, Reflect, or ChatGPT Plus. To view activated devices, deactivate an old Mac, or transfer your license, create a LonelyDuck account using the same email from your purchase receipt and open your Account Dashboard.
Start free today.
Keep the note-taking app for free. Unlock the heavy AI workflows only when you actually want them.
Free forever for note-taking · 25 AI queries included · Unlimited AI for $29.99 one-time






