AI-Powered Document Classification with paperless-ai and Ollama
This post is a complete runbook for integrating AI-powered auto-tagging and classification into paperless-ngx using paperless-ai and a locally running Ollama instance. The setup uses a local LLM to read document text and automatically populate metadata fields: title, document type, tags, correspondent, date, and custom fields.
Hardware and Architecture

- NAS (Synology DS1621+, 10.0.10.10): runs paperless-ngx on port 5656
- Desktop PC: Windows with WSL2, Docker Desktop, RTX 4090
- Goal: AI auto-tagging/classification using a local LLM, zero cloud dependency

The key architecture decision is a pull model: paperless-ai runs in WSL2 Docker, polls the paperless-ngx API for documents tagged ai-pending, processes them with Ollama, and writes metadata back. This is the correct approach for a desktop that is not on 24/7: the NAS holds the queue and the desktop drains it when available.
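To make the pull model concrete, here is a minimal Python sketch of one drain pass. It is not how paperless-ai is implemented internally; it only illustrates the poll, classify, write-back cycle described above. The function names (`pending_documents_url`, `drain_queue`) are hypothetical, and the `tags__name__iexact` query filter and the `id`/`content` document fields are assumptions about the paperless-ngx REST API that should be checked against your installed version. The network calls are passed in as plain callables so the control flow can be read (and tested) without a running NAS.

```python
from typing import Callable, Iterable
from urllib.parse import urlencode

PAPERLESS_URL = "http://10.0.10.10:5656"  # NAS address and port from this post
PENDING_TAG = "ai-pending"                # queue tag from this post


def pending_documents_url(base_url: str = PAPERLESS_URL,
                          tag: str = PENDING_TAG) -> str:
    """Build the URL the desktop would poll for queued documents.

    Assumption: the documents endpoint accepts a tag-name filter
    called `tags__name__iexact`; verify against your API version.
    """
    query = urlencode({"tags__name__iexact": tag})
    return f"{base_url}/api/documents/?{query}"


def drain_queue(fetch_pending: Callable[[], Iterable[dict]],
                classify: Callable[[str], dict],
                write_back: Callable[[int, dict], None]) -> int:
    """Run one pass over the queue and return how many documents were handled.

    fetch_pending: returns documents currently tagged ai-pending
    classify:      sends document text to the local LLM, returns metadata
    write_back:    saves the metadata to paperless-ngx for a document id
    """
    processed = 0
    for doc in fetch_pending():
        # Assumption: each document dict carries "id" and OCR text in "content".
        metadata = classify(doc["content"])
        write_back(doc["id"], metadata)
        processed += 1
    return processed


# Usage with in-memory stand-ins instead of real HTTP and Ollama calls.
queue = [{"id": 1, "content": "Invoice from ACME"},
         {"id": 2, "content": "Electricity bill"}]
written = {}
handled = drain_queue(
    fetch_pending=lambda: queue,
    classify=lambda text: {"title": text.title()},
    write_back=lambda doc_id, meta: written.update({doc_id: meta}),
)
```

Because the queue lives on the NAS as a tag, a pass that never runs (desktop powered off) loses nothing: the documents simply stay tagged until the next pass picks them up.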