Your private AI and notebook.

Chat with your vault, PDFs, and the web — all on local models, all on your Mac.

Download Freev0.8.2
macOS 14+ · Apple Silicon · Metal GPU required
Canto welcome screen greeting the user with a clean Ask Canto chat composer and recent notes list.
9
local models
From 2.1 GB to 76.5 GB, all on-device
100%
on-device AI
No readable cloud uploads, ever
$29.99
one-time
Lifetime updates · 1 device
Inside
Inside Canto

One workspace, eight core workflows.

Canto Chat

A chat-first home for everything in your vault.

Open Canto and you land in a chat composer that speaks fluent vault. Ask anything across your notes, PDFs, audio, video, and the web. Streaming answers come back with citations, scope controls, and a timeline rail so you always know what the model looked at.

  • Welcome screen is now a chat — no blank-note tax
  • Scope to all notes, a folder, a single note, or attached files
  • Citations [1][2] resolve to the exact source line you asked about
  • Works the same on a 2.1 GB model as on a 76 GB flagship
AI Models

Nine built-in models, plus room to bring your own.

A curated Unsloth GGUF lineup with Granite 4.1 3B as the default and GPT-OSS 120B at the top end. Choose by size, run via Metal, and connect your own through Ollama or LM Studio when you need to.

Model
Size
  • Granite 4.1 3B
    Default — tool use, RAG, and quick rewrites
    Any 8 GB Mac
    2.1 GB
  • Qwen 3.5 4B
    Everyday writing, coding, and tool calls
    8 GB Mac · smooth on 16 GB
    2.7 GB
  • Qwen 3.5 9B
    Sharper everyday reasoning and planning
    16 GB Mac · daily on 24 GB+
    5.7 GB
  • GPT-OSS 20B
    Tool-first reasoning and structured outputs
    24 GB Mac · smooth on 32 GB+
    11.6 GB
  • Qwen 3.6 27B
    Long-form reasoning and careful editing
    32 GB Mac (tight on 24 GB)
    16.8 GB
  • GLM-4.7 Flash
    MoE speed at near-dense quality
    32 GB Mac / 36 GB M3 Pro
    18.3 GB

“Comfortable on” assumes the model is your main workload. If you're running heavy apps alongside (browser with many tabs, Xcode, Logic), pick one tier down — Canto warns you in-app when free RAM is too low.

A degraded but usable launch path for low-memory or post-crash boots. Canto skips the chat model, vision worker, voice transcription, and link search so the editor and your notes always come up. One click brings AI back online — and if your last model is too big, Canto suggests a smaller one you already have.

Proof

How Canto compares — and what stays on your Mac.

FeatureCantoNotion AINotebookLMObsidian
Chat-first AI home
Canto Chat: vault, web, files
AI sidebar in pages
Chat with sources (cloud)
Community plugins
Private, on-device AI
9 local models + external
Cloud AI only
Cloud Gemini only
Plugins / BYO models
Seamless chat ↔ editor handoff
Citation → companion pane
Inline AI, no chat handoff
Read-only sources
Plugins
Agent edits your workspace
Notes + notebooks, diff review
Cloud agent (Business+)
Read-only chat
Community plugins
Encrypted multi-Mac sync
iCloud Drive + selective
Cloud sync, offline pages
Google account (cloud)
Paid E2E Sync
Notes + executable notebooks
Rich notes + Python/JS/TS
Code blocks only
No code execution
Plugins / scripts
Chat with PDFs, audio, video
Local, cited, multimodal
Cloud AI on uploads
Cloud + audio overviews
Plugins
Voice transcription
Native Whisper + Metal
Cloud Meeting Notes
Audio overviews (output)
Plugins
Semantic links and graph
Memory Links + graph
Databases / AI search
No graph or backlinks
Graph; AI via plugins
MCP / external AI access
Built-in MCP server
Enterprise MCP
No MCP
Community plugins
Offline AI workflows
After model download
Offline pages, no AI
Cloud-only
Plugins / local setup
Pricing
·One-time $29.99
·$20 / user / mo
·Free + Plus $19.99 / mo
·Free; paid sync
Built-in Partial / via plugin Not really
Nothing leaves your device.
All AI runs locally. Notes encrypted at rest.
No readable cloud uploads
Encrypted at rest (AES-256)
Offline AI inference
Encrypted iCloud Drive Sync
Local SQLite database
Your data, your rules
Pricing

Free for note-taking. Pay once for unlimited AI.

Free
$0

Just download.

Download Freev0.8.2
  • Unlimited notes & nested folders
  • WikiLinks + Backlinks
  • Code Notebooks (Python, JS, TS)
  • Privacy-first iCloud Drive Sync
  • Selective folder sync per Mac
  • Encrypted exports & backups
  • Split panes & daily notes
  • Mermaid diagrams + LaTeX math
  • Knowledge graph
  • Full-text search
  • Safe Mode + Low Memory Mode for tight Macs
  • MCP Server — connect Claude, Cursor, Windsurf & more
  • Send Diagnostics — redacted, reviewable support bundles
  • 25 free AI queries to try
Unlimited AI
$29.99one-time

1 device · Lifetime updates

  • Everything in Free, plus
  • Unlimited AI queries (no quotas)
  • Canto Chat — chat-first home for your vault, web & attachments
  • Seamless handoff — citations open the source note in a companion pane
  • Agent Chat — multi-step tool calls with diff review
  • Web research with deep-research approval gates
  • Multimodal attachments — PDFs, audio, image, and video with citations
  • Vision on local models (vision-capable Qwen + fallback)
  • 9 built-in local models, from 2.1 GB to 76.5 GB
  • External endpoints — Ollama, LM Studio, or any OpenAI-compatible API
  • @ mentions for notes, cells, attachments, sessions & selections
  • Vault Manager for bulk organization
  • Memory Links + Related Notes + semantic search

vs the alternatives

Notion AI (Business)$240/yr·NotebookLM Plus$240/yr·ChatGPT Plus$240/yr·Reflect$120/yr
FAQ

Frequently asked questions

  • Canto Chat is the new chat-first home introduced in v0.8.0. When you open Canto, you land in a chat composer that can answer across your entire vault, attached PDFs, audio, video, and the web — streaming back citations you can click. It pairs with Seamless Handoff: every citation [1][2] opens the source note in a companion pane, and you can promote any chat answer into a fresh draft note in one keystroke. Think of it as a private NotebookLM that runs on your Mac, with full read/write into your notes when you want it.
  • Three surfaces, one workspace. Canto Chat is for asking questions and exploring — your chat-first home. Notes are for writing and thinking — essays, research, knowledge bases with inline AI assistance. Notebooks are for code — Python, JavaScript, and TypeScript with instant execution and inline output. They share state: chat answers can hand off into notes, notes can be cited back inside chat, and Memory Links connect everything by meaning.
  • Canto runs a native Python environment locally on your Mac — no WebAssembly, no cloud. Popular packages like numpy, pandas, and matplotlib install automatically when you import them. Variables are shared across Python, JavaScript, and TypeScript cells in the same notebook.
  • Yes. After the first model download, Canto works completely offline. On planes, trains, or anywhere without internet — unlimited AI assistance, Memory Links, and semantic search are always available.
  • Canto ships with 9 built-in local models, all Unsloth-hosted GGUF builds: Granite 4.1 3B (2.1 GB, default), Qwen 3.5 4B (2.7 GB), Qwen 3.5 9B (5.7 GB), GPT-OSS 20B (11.6 GB), Qwen 3.6 27B (16.8 GB), GLM-4.7 Flash (18.3 GB, 30B MoE), Qwen 3.6 35B A3B (22.1 GB), GPT-OSS 120B A5B (62.8 GB), and Qwen 3.5 122B A10B (76.5 GB). All support tool calling and run locally via Metal GPU acceleration. Larger multi-file models download in shards in parallel and resume cleanly if you cancel midway. You can also connect Ollama or LM Studio as external endpoints to use any additional model you’ve downloaded — including cloud-hosted providers like OpenAI and Anthropic through Ollama.
  • Absolutely. Your notes, code, and Memory Links embeddings are stored locally in an AES-256 encrypted SQLite database. When you use iCloud Drive Sync, Canto writes encrypted sync data to your own iCloud Drive folder using a Sync Passphrase you choose. Canto does not upload readable note content to a Canto server.
  • macOS 14 (Sonoma) or later. Requires Apple Silicon (M1 or later) for Metal GPU inference. 8 GB RAM is the minimum — enough for the default Granite 4.1 3B (2.1 GB) and Qwen 3.5 4B (2.7 GB). Enable Low Memory Mode in Settings to attempt larger models on smaller machines (may reduce context or cause instability). Recommended: 16 GB+ for Qwen 3.5 9B, 24–32 GB+ for GPT-OSS 20B and Qwen 3.6 27B, 32–48 GB for GLM-4.7 Flash and Qwen 3.6 35B A3B, and 96 GB+ for the GPT-OSS 120B and Qwen 3.5 122B flagship MoE builds. Canto also cold-starts local models on demand, so the app stays light at launch even with a large model selected. Models download once and cache locally.
  • Yes. Canto ships a built-in MCP Server that lets Claude Desktop, Cursor, Claude Code, Windsurf, OpenClaw, and any other Model Context Protocol client read and write your vault locally. Toggle it on in Settings → MCP Server. The server only accepts connections from your own Mac, an optional bearer token adds a second lock, and tools are split into always-on vault tools plus a separate opt-in automation dispatcher for scripted workflows. Per-tool approvals from your client still apply, so you stay in the loop.
  • Yes. Canto includes a Safe Mode launch path designed for tight machines. If free memory is low at boot, or if the previous launch crashed before the window finished loading, Canto starts in a stripped-down mode where your notes, editor, search, and UI still work normally. AI extras are deferred until you say go, and Canto can suggest a smaller model if your last one was too large.
  • The Memory dashboard and status bar aggregate Canto’s related processes, including heavier helpers like vision workers. The Memory modal can also break the total down by process so you can see which subsystem is using the most RAM. Canto also includes a “Reclaim now” button in Settings → Memory that drops idle weights and frees RAM on demand without restarting the app.
  • Open Help → Send Diagnostics (also available from Settings → About). Canto assembles a redacted JSON bundle covering memory, AI runtime, updater state, database health, recent crash metadata, and (optionally) a screenshot and recent log excerpts. You see the full payload — with secrets and personal content scrubbed — before anything leaves your Mac, and can either save it locally for your own records or upload it directly to the LonelyDuck diagnostics endpoint. Renderer and helper-process crashes are captured locally so the next report can explain what actually happened.
  • Canto is a one-time $29.99 purchase — not a subscription. The license activates on 1 Mac and includes lifetime updates within the current major version. Canto v0.8.0 replaces what people typically pay $120–$240 per year for across Notion AI, NotebookLM Plus, Reflect, or ChatGPT Plus. To view activated devices, deactivate an old Mac, or transfer your license, create a LonelyDuck account using the same email from your purchase receipt and open your Account Dashboard.

Start free today.

Keep the note-taking app for free. Unlock the heavy AI workflows only when you actually want them.

Download Freev0.8.2

Free forever for note-taking · 25 AI queries included · Unlimited AI for $29.99 one-time

Join the Canto Discord
Support, ideas, workflows.
Join →