Tutorials¶

zhi is an open-source Python CLI tool powered by Zhipu GLM models. It provides an intelligent terminal assistant with chat, file processing, OCR, and custom skills.

Table of Contents¶

5-Minute Quickstart
Interactive Chat
File Processing
Skill System
Shell Commands
Web Content
Composite Skills

5-Minute Quickstart¶

Get from zero to your first conversation in under 5 minutes.

Prerequisites¶

Python 3.10 or later
pip package manager
A Zhipu AI account for your API key

Step 1: Install¶

pip install zhicli

Verify the installation:

zhi --version

Step 2: Configure¶

Run the setup wizard:

zhi --setup

The wizard walks you through three steps:

Welcome to zhi (v0.1.0)

Let's get you set up. This takes about 30 seconds.

Step 1/3: API Key
  Paste your Zhipu API key (get one at open.bigmodel.cn):
  > your-api-key

Step 2/3: Defaults
  Default model for chat [glm-5]:
  Default model for skills [glm-4-flash]:
  Output directory [zhi-output]:

Step 3/3: Quick Demo
  Want to try a sample skill? [Y/n]:

Setup complete. Type /help to see available commands.

Alternatively, set the API key via environment variable:

# Linux / macOS
export ZHI_API_KEY="your-api-key"

# Windows (PowerShell)
$env:ZHI_API_KEY = "your-api-key"

Step 3: Start Chatting¶

Interactive mode -- launch the REPL:

zhi

One-shot mode -- ask a single question and exit:

zhi -c "What is machine learning?"

Run a skill -- process a file with a built-in skill:

zhi run summarize document.txt

That's it. You are ready to go.

Interactive Chat¶

Entering the REPL¶

zhi

Welcome to zhi. Type /help for commands.
You [approve]:

The [approve] label in the prompt shows the current permission mode.

Slash Commands¶

Type /help to see all available commands:

Available commands:
  /help              Show this help message
  /auto              Switch to auto mode (no permission prompts)
  /approve           Switch to approve mode (default)
  /model <name>      Switch model (glm-5, glm-4-flash, glm-4-air)
  /think             Enable thinking mode
  /fast              Disable thinking mode
  /run <skill> [args]  Run a skill
  /skill list|new|show|edit|delete  Manage skills
  /reset             Clear conversation history
  /undo              Remove last exchange
  /usage             Show token/cost stats
  /verbose           Toggle verbose output
  /exit              Exit zhi

Permission Modes¶

zhi has two permission modes:

approve (default): You must confirm before zhi writes files or runs shell commands.
auto: Skips confirmation prompts and executes operations automatically.

You [approve]: /auto
Mode switched to auto

You [auto]: /approve
Mode switched to approve

Warning

Shell commands always require confirmation, even in auto mode. This is a safety feature that cannot be bypassed.
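
The interaction between the two modes and the shell rule can be sketched in a few lines of Python. This is an illustration only: `needs_confirmation` is a hypothetical helper, and it assumes that prompting follows the risk levels listed in the Available Tools table, which may not match zhi's actual implementation.

```python
# Risk levels copied from the "Available Tools" table in this tutorial.
TOOL_RISK = {
    "file_read": "low",
    "file_write": "high",
    "file_list": "low",
    "ocr": "low",
    "shell": "high",
    "web_fetch": "low",
    "skill_create": "high",
}


def needs_confirmation(tool: str, mode: str) -> bool:
    """Return True if the user should be prompted before `tool` runs.

    Hypothetical helper: mapping risk level to prompting is an assumption.
    """
    if mode not in ("approve", "auto"):
        raise ValueError(f"unknown permission mode: {mode!r}")
    if tool == "shell":
        return True  # shell commands are confirmed in every mode
    if mode == "auto":
        return False  # auto mode skips all other prompts
    # Approve mode: prompt for high-risk tools; unknown tools count as high risk.
    return TOOL_RISK.get(tool, "high") == "high"
```

With this sketch, `needs_confirmation("shell", "auto")` is still `True`, while `needs_confirmation("file_write", "auto")` is `False`.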

Model Switching¶

zhi supports three models:

Model	Type	Use Case
`glm-5`	Premium	Default chat model -- most capable, higher cost
`glm-4-flash`	Economy	Skill execution -- fast and inexpensive
`glm-4-air`	Economy	Lightweight alternative

Check and switch models:

You [approve]: /model
Current model: glm-5. Available: glm-5, glm-4-flash, glm-4-air

You [approve]: /model glm-4-flash
Model switched to glm-4-flash

Thinking Mode¶

Enable thinking mode to see the model's reasoning process (GLM-5 only):

You [approve]: /think
Thinking mode enabled

You [approve]: Explain why the sky is blue

You [approve]: /fast
Thinking mode disabled

Multi-line Input¶

Add \ at the end of a line to continue on the next line:

You [approve]: Write a Python function that \
...  accepts a list parameter and \
...  returns the maximum value
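
For readers curious how backslash continuation can work, here is a minimal sketch. The function name `join_continued_lines` is hypothetical, and joining the pieces with a single space is an assumption; zhi may join them differently.

```python
def join_continued_lines(lines):
    """Merge lines that end in a backslash with the line that follows them."""
    parts = []
    for line in lines:
        text = line.rstrip()
        continued = text.endswith("\\")
        if continued:
            text = text[:-1]  # drop the trailing backslash
        parts.append(text.strip())
        if not continued:
            break  # the first line without a backslash ends the input
    return " ".join(parts)


message = join_continued_lines([
    "Write a Python function that \\",
    " accepts a list parameter and \\",
    " returns the maximum value",
])
print(message)
```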

Managing Conversation History¶

Undo the last exchange:

You [approve]: /undo
Last exchange removed

Clear all history:

You [approve]: /reset
Conversation cleared

Usage Statistics¶

You [approve]: /usage

Shows token counts and estimated cost for the current session.

Exiting¶

You [approve]: /exit
Goodbye!

You can also press Ctrl+D to exit, or Ctrl+C to cancel the current input.

Tip

Input history is saved automatically. Use the up/down arrow keys to browse previous inputs.
Lines containing sensitive keywords (api_key, password, token) are excluded from history.
Tab completion works for slash commands and model names.
Switching models only affects the current session -- it does not modify the config file.
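
The history exclusion rule can be pictured as a simple keyword filter. The sketch below is not zhi's code: `should_save_to_history` is a hypothetical name, and case-insensitive substring matching is an assumption.

```python
# Keywords named in the tip above.
SENSITIVE_KEYWORDS = ("api_key", "password", "token")


def should_save_to_history(line: str) -> bool:
    """Return False for input lines that mention a sensitive keyword."""
    lowered = line.lower()  # assumption: matching ignores case
    return not any(keyword in lowered for keyword in SENSITIVE_KEYWORDS)
```

Under these assumptions a line such as `export ZHI_API_KEY="..."` would be skipped, while an ordinary question would be stored.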

File Processing¶

Reading Text Files¶

Ask zhi to read and work with files in conversation:

You [approve]: Read README.md and summarize its contents

zhi calls the file_read tool to load the file, then generates a summary.

file_read characteristics:

Only reads files within the current working directory (relative paths)
Maximum file size: 100KB (larger files are truncated)
Auto-detects encoding, defaults to UTF-8
Cannot read binary files
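
The constraints above can be approximated in plain Python. This is a sketch under stated assumptions, not the tool's implementation: `read_text_file` is a hypothetical name, a NUL byte is used as a crude binary check, and encoding detection is left out in favor of UTF-8.

```python
from pathlib import Path

MAX_READ_BYTES = 100 * 1024  # 100KB limit from the list above


def read_text_file(relative_path: str, base_dir: str = ".") -> str:
    """Read a text file under base_dir, truncated to MAX_READ_BYTES."""
    base = Path(base_dir).resolve()
    target = (base / relative_path).resolve()
    # Reject absolute paths and ".." tricks that leave the working directory.
    if not target.is_relative_to(base):
        raise PermissionError(f"{relative_path} is outside {base}")
    with target.open("rb") as handle:
        data = handle.read(MAX_READ_BYTES)  # anything past the limit is dropped
    if b"\x00" in data:
        # Assumption: a NUL byte marks the file as binary.
        raise ValueError(f"{relative_path} looks like a binary file")
    # Truncation may split a multi-byte character, so replace undecodable bytes.
    return data.decode("utf-8", errors="replace")
```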

Listing Directory Contents¶

You [approve]: List all files in the current directory

zhi calls the file_list tool and displays filenames, sizes, and modification times.

OCR: Extracting Text from Images and PDFs¶

Extract text from scanned documents or images:

You [approve]: Extract the text from invoice.png

You [approve]: Extract text from report.pdf and summarize the key points

Supported formats:

Images: PNG, JPG, JPEG, GIF, WEBP
Documents: PDF
Maximum file size: 20MB

Writing Files¶

All file output is saved to the zhi-output/ directory:

You [approve]: Write that summary as a Markdown file

In approve mode, zhi asks for confirmation before creating the file.

Supported output formats:

Format	Extension	Content
Plain text	`.md`, `.txt`	Direct text
JSON	`.json`	Any JSON structure
CSV	`.csv`	Headers + rows
Excel	`.xlsx`	Sheet data
Word	`.docx`	Markdown text

Combining Read + Process + Write¶

You [approve]: Read data.csv, analyze the trends, then write an analysis report to report.md

zhi chains file_read, analysis, and file_write calls automatically.

Info

file_write creates new files only -- it cannot overwrite existing files.
All output goes to zhi-output/, keeping your original files safe.
Paths cannot contain .., preventing writes outside the working directory.
Excel (.xlsx) and Word (.docx) formats are supported out of the box.
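
The "new files only" and "no .." rules map naturally onto Python's exclusive-create file mode. The following is a minimal sketch for plain text output; `write_new_file` is a hypothetical helper, not the actual file_write tool.

```python
from pathlib import Path


def write_new_file(relative_path: str, content: str,
                   output_dir: str = "zhi-output") -> Path:
    """Create a new text file under output_dir; never overwrite."""
    requested = Path(relative_path)
    if requested.is_absolute() or ".." in requested.parts:
        raise ValueError(f"unsafe output path: {relative_path}")
    target = Path(output_dir) / requested
    target.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" raises FileExistsError instead of replacing an existing file.
    with target.open("x", encoding="utf-8") as handle:
        handle.write(content)
    return target
```

Calling it twice with the same path raises `FileExistsError` on the second call, which mirrors the "File already exists" behavior described in the FAQ.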

Skill System¶

What Are Skills?¶

Skills are predefined YAML configurations that bundle a system prompt, allowed tools, and model settings into a reusable workflow. Run them with a single command instead of typing instructions each time. Skills use the glm-4-flash model by default, costing roughly 10% of what GLM-5 chat costs.

Built-in Skills¶

zhi ships with 15 built-in skills -- 9 single-purpose skills and 6 composite skills that chain multiple single-purpose skills together.

Single-Purpose Skills (9)¶

Skill	Description	Usage
`summarize`	Summarize a document	`zhi run summarize report.txt`
`translate`	Translate a document (default: to Chinese)	`zhi run translate readme-en.md`
`extract-text`	OCR text from PDF/images	`zhi run extract-text scan.pdf`
`extract-table`	Extract tables from documents	`zhi run extract-table invoice.pdf`
`analyze`	Deep structural analysis of a document	`zhi run analyze proposal.md`
`proofread`	Grammar, spelling, style corrections	`zhi run proofread draft.md`
`reformat`	Convert between document formats	`zhi run reformat notes.txt`
`meeting-notes`	Structure raw notes into formal minutes	`zhi run meeting-notes notes.txt`
`compare`	Diff two documents and highlight changes	`zhi run compare v1.md v2.md`

Composite Skills (6)¶

These chain multiple single-purpose skills into multi-step workflows. See the Composite Skills section below for details.

Skill	Pipeline	Usage
`contract-review`	analyze + compare + proofread	`zhi run contract-review contract.pdf`
`daily-digest`	file_list + summarize (batch)	`zhi run daily-digest ./reports/`
`invoice-to-excel`	extract-table + reformat	`zhi run invoice-to-excel invoices/`
`meeting-followup`	meeting-notes + summarize + translate	`zhi run meeting-followup notes.txt`
`report-polish`	proofread + analyze + reformat	`zhi run report-polish draft.md`
`translate-proofread`	translate + proofread	`zhi run translate-proofread doc.md`

Listing Installed Skills¶

You [approve]: /skill list

Running Skills¶

From the command line:

zhi run summarize report.txt

From interactive mode:

You [approve]: /run summarize report.txt

Creating Custom Skills¶

Option 1: Let zhi create it for you

You [approve]: Create a code review skill called code-review that reads source code and suggests improvements

zhi generates the YAML configuration file automatically.

Option 2: Write YAML manually

Create a .yaml file in your config directory's skills/ folder:

name: code-review
description: Review source code and suggest improvements
model: glm-4-flash
system_prompt: |
  You are an experienced code reviewer. Read the provided source code
  and give actionable suggestions for improvement. Focus on:
  - Code quality and readability
  - Potential bugs
  - Performance issues
  Output your review as structured markdown.
tools:
  - file_read
  - file_write
max_turns: 10
input:
  description: Source code file to review
  args:
    - name: file
      type: file
      required: true
output:
  description: Code review report in markdown
  directory: zhi-output

YAML Field Reference¶

Field	Required	Description
`name`	Yes	Skill name. Letters, digits, hyphens, underscores only. Max 64 chars.
`description`	Yes	Brief description of the skill
`system_prompt`	Yes	System prompt that guides model behavior
`tools`	Yes	List of tools the skill can access
`model`	No	Model to use. Default: `glm-4-flash`
`max_turns`	No	Maximum execution turns. Default: 15
`input`	No	Input parameter definitions
`output`	No	Output config (description and directory)

Available Tools¶

Tool	Function	Risk Level
`file_read`	Read text files	Low
`file_write`	Write new files to zhi-output/	High
`file_list`	List directory contents	Low
`ocr`	OCR for images and PDFs	Low
`shell`	Execute shell commands	High
`web_fetch`	Fetch web page content	Low
`skill_create`	Create new skills	High

Managing Skills¶

You [approve]: /skill show code-review   # View skill details
You [approve]: /skill delete code-review # Delete a custom skill

Info

Skill names must match ^[a-zA-Z0-9][a-zA-Z0-9_-]*$
Each skill can only access tools declared in its tools list
Skill output defaults to the zhi-output/ directory
Only user-created skill YAML files can be deleted; built-in skills are protected
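
To check a skill definition before saving it, the rules from the field reference can be applied to the parsed YAML. The sketch below works on a plain dictionary so that it needs no YAML library; `validate_skill` is a hypothetical helper, and rejecting unknown tool names is an assumption.

```python
import re

NAME_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_-]*$")
REQUIRED_FIELDS = ("name", "description", "system_prompt", "tools")
DEFAULTS = {"model": "glm-4-flash", "max_turns": 15}
KNOWN_TOOLS = {"file_read", "file_write", "file_list", "ocr",
               "shell", "web_fetch", "skill_create"}


def validate_skill(config: dict) -> dict:
    """Validate a parsed skill definition and fill in documented defaults."""
    missing = [field for field in REQUIRED_FIELDS if not config.get(field)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    name = config["name"]
    if len(name) > 64 or not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid skill name: {name!r}")
    unknown = set(config["tools"]) - KNOWN_TOOLS
    if unknown:
        # Assumption: tool names outside the documented list are rejected.
        raise ValueError(f"unknown tools: {sorted(unknown)}")
    return {**DEFAULTS, **config}  # explicit values override the defaults
```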

Shell Commands¶

Basic Usage¶

Ask zhi to run commands in conversation:

You [approve]: Run ls -la to see the current directory

zhi displays the command and waits for confirmation:

zhi wants to run: ls -la
Allow? [y/n]:

Type y to allow, n to deny.

Three-Layer Safety Model¶

1. Always Requires Confirmation¶

Shell commands require user confirmation regardless of permission mode. Even in auto mode, you must approve every command.

2. Destructive Command Warnings¶

These commands trigger an extra warning:

File deletion: rm, del, rmdir
File moves: mv
Permission changes: chmod, chown
Disk operations: mkfs, dd, shred, truncate
In-place edits: sed -i
Git danger zone: git reset --hard, git clean

3. Catastrophic Commands Blocked¶

These commands are permanently blocked and cannot be executed:

rm -rf / or rm -rf ~
rm -rf /* or rm -rf ~/
mkfs /dev/...
Fork bombs
dd if=/dev/zero of=/dev/...
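
Layers 2 and 3 amount to classifying a command before it is shown for confirmation. The sketch below illustrates one way to do that; the regular expressions and the `classify_command` name are assumptions made for this example and are much simpler than a production filter would need to be.

```python
import re
import shlex

# Commands that trigger the extra warning (layer 2).
DESTRUCTIVE_COMMANDS = {"rm", "del", "rmdir", "mv", "chmod", "chown",
                        "mkfs", "dd", "shred", "truncate"}

# Hypothetical patterns for the permanently blocked commands (layer 3).
BLOCKED_PATTERNS = [
    re.compile(r"^rm\s+-rf\s+(/|/\*|~|~/)$"),
    re.compile(r"^mkfs\S*\s.*/dev/"),
    re.compile(r"^dd\s+.*if=/dev/zero\s+.*of=/dev/"),
    re.compile(r":\(\)\s*\{.*\}\s*;\s*:"),  # classic shell fork bomb
]


def classify_command(command: str) -> str:
    """Return "blocked", "warn", or "ok" for a shell command string."""
    normalized = " ".join(command.split())
    if any(pattern.search(normalized) for pattern in BLOCKED_PATTERNS):
        return "blocked"
    try:
        tokens = shlex.split(normalized)
    except ValueError:  # unbalanced quotes: fall back to a plain split
        tokens = normalized.split()
    if not tokens:
        return "ok"
    program = tokens[0]
    if program in DESTRUCTIVE_COMMANDS:
        return "warn"
    if program == "sed" and "-i" in tokens:
        return "warn"
    if program == "git" and (tokens[1:3] == ["reset", "--hard"]
                             or tokens[1:2] == ["clean"]):
        return "warn"
    return "ok"
```

Even a command classified as "ok" still goes through the confirmation prompt from layer 1.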

Timeout¶

Shell commands have a default timeout of 30 seconds, with a maximum of 300 seconds (5 minutes). When a command times out, zhi kills the entire process group to prevent leftover processes.
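
Killing a whole process group on timeout can be done with the standard `subprocess` module. The following is a sketch of the general technique, not zhi's source: `run_with_timeout` is a hypothetical function, and the Windows branch is simplified to killing only the direct child.

```python
import os
import signal
import subprocess
import sys

DEFAULT_TIMEOUT = 30   # seconds
MAX_TIMEOUT = 300      # seconds


def run_with_timeout(args, timeout=DEFAULT_TIMEOUT):
    """Run a command; return (returncode, output, timed_out)."""
    timeout = min(timeout, MAX_TIMEOUT)
    if sys.platform == "win32":
        group_kwargs = {"creationflags": subprocess.CREATE_NEW_PROCESS_GROUP}
    else:
        # The child becomes leader of a new session and process group.
        group_kwargs = {"start_new_session": True}
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True,
                            **group_kwargs)
    try:
        output, _ = proc.communicate(timeout=timeout)
        return proc.returncode, output, False
    except subprocess.TimeoutExpired:
        if sys.platform == "win32":
            proc.kill()  # simplification: kills only the direct child
        else:
            try:
                # The group id equals the child's pid because it leads the group.
                os.killpg(proc.pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the process exited on its own in the meantime
        output, _ = proc.communicate()  # reap the child, collect partial output
        return proc.returncode, output, True
```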

Output Limits¶

Command output is capped at 100KB. Anything beyond that is truncated.

Practical Examples¶

You [approve]: Check the system's Python version

You [approve]: Count lines of code in the src/ directory

You [approve]: Run pytest and tell me the results

Warning

Never let the AI run commands you do not understand, even though zhi always asks for confirmation.
Cross-platform support: process groups are created with CREATE_NEW_PROCESS_GROUP on Windows and start_new_session on Unix.

Web Content¶

Fetching Web Pages¶

You [approve]: Fetch the content from https://example.com

zhi calls the web_fetch tool to retrieve the page content, automatically converting HTML to plain text.

Analyzing Web Content¶

You [approve]: Fetch this page and summarize the key points: https://example.com/article

zhi fetches the content first, then uses the GLM model to analyze it.

Fetching and Saving¶

You [approve]: Scrape https://example.com/data, extract the key data, and save it as a CSV file

zhi combines web_fetch and file_write to complete the task.

Using Web Fetch in Skills¶

Create a skill that includes web_fetch for automated web content processing:

name: web-summary
description: Fetch and summarize web pages
model: glm-4-flash
system_prompt: |
  Fetch the given URL, read the content, and produce a concise
  summary. Save the summary as a markdown file.
tools:
  - web_fetch
  - file_write
max_turns: 5

Then run it:

zhi run web-summary https://example.com/article

Info

URLs must start with http:// or https://
Request timeout: 30 seconds
Response content limit: 50KB (excess is truncated)
HTML pages are automatically stripped of tags and converted to plain text
web_fetch follows redirects automatically
SSRF protection is built in
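
Three of these rules (URL scheme check, tag stripping, and the 50KB cap) are easy to picture with the standard library alone. The sketch below makes no network request and leaves out SSRF protection; the names `is_allowed_url` and `html_to_text` are hypothetical and the code is not taken from zhi.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

MAX_CONTENT_BYTES = 50 * 1024  # 50KB limit from the list above


def is_allowed_url(url: str) -> bool:
    """Accept only http:// and https:// URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str, limit: int = MAX_CONTENT_BYTES) -> str:
    """Strip tags and truncate the result to `limit` bytes of UTF-8."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    encoded = "\n".join(parser.chunks).encode("utf-8")[:limit]
    # Ignore a multi-byte character that the byte limit may have cut in half.
    return encoded.decode("utf-8", errors="ignore")
```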

Composite Skills¶

Composite skills chain multiple single-purpose skills into automated multi-step workflows. They still run on glm-4-flash, keeping costs low.

contract-review¶

Pipeline: analyze + compare + proofread

Reviews a contract document through three lenses: structural analysis to identify key clauses and risks, optional comparison with a previous version to highlight changes, and proofreading to catch ambiguous wording.

# Review a single contract
zhi run contract-review contract.pdf

# Compare two versions
zhi run contract-review contract-v2.pdf contract-v1.pdf

Output: A comprehensive review report with executive summary, structural analysis, version changes (if applicable), language issues, risk assessment, and negotiation recommendations.

daily-digest¶

Pipeline: file_list + summarize (batch)

Scans all documents in a folder and produces a single combined digest report with individual summaries and cross-document insights.

zhi run daily-digest ./inbox/

Output: A digest report listing each document's summary, common themes, contradictions, and suggested follow-up actions.

Tip

Supported file types include .txt, .md, .pdf, .csv, .docx, .xlsx, .png, and .jpg. Binary and system files are skipped automatically.

invoice-to-excel¶

Pipeline: extract-table + reformat

Processes invoice files (PDF, image, or text) through OCR table extraction, then consolidates all line items into a structured Excel spreadsheet.

# Single invoice
zhi run invoice-to-excel invoice.pdf

# Batch process a folder of invoices
zhi run invoice-to-excel ./invoices/

Output: An Excel file with two sheets -- "Line Items" (one row per item across all invoices) and "Invoice Summary" (one row per invoice with totals). Dates are normalized to YYYY-MM-DD, currencies are cleaned, and totals are validated.
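
The clean-up steps mentioned in the output description can be illustrated with a few small helpers. This is a sketch only: the accepted input date formats and the function names are assumptions, and in zhi these steps are carried out by the skill itself rather than by code like this.

```python
import re
from datetime import datetime
from decimal import Decimal

# Assumed input formats; real invoices may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")


def normalize_date(raw: str) -> str:
    """Convert a date in one of the assumed formats to YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def clean_amount(raw: str) -> Decimal:
    """Strip currency symbols and thousands separators from an amount."""
    return Decimal(re.sub(r"[^\d.\-]", "", raw))


def totals_match(line_amounts, stated_total: Decimal) -> bool:
    """Check that the line items add up to the total printed on the invoice."""
    return sum(line_amounts, Decimal("0")) == stated_total
```

`Decimal` is used instead of `float` so that sums of currency amounts compare exactly.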

meeting-followup¶

Pipeline: meeting-notes + summarize + translate

Takes raw meeting notes and produces a complete follow-up package: structured minutes, an executive summary for leadership, and an optional translated summary.

# Basic follow-up
zhi run meeting-followup raw-notes.txt

# With translation
zhi run meeting-followup raw-notes.txt --to english

Output: Three files -- full meeting minutes, a 1-page executive summary, and (optionally) a translated summary. Action items appear in both the full minutes and the summary.

report-polish¶

Pipeline: proofread + analyze + reformat

Takes a draft document and produces a publication-ready version by proofreading for language issues, analyzing structure and flow, and producing a clean final version.

zhi run report-polish draft-report.md

# Specify output format
zhi run report-polish draft.md --format docx

Output: Two files -- the polished document and a change log showing a before/after quality score, all corrections made, structural improvements, and remaining suggestions.

translate-proofread¶

Pipeline: translate + proofread

Translates a document and then proofreads the translation to ensure it reads naturally in the target language.

# Default: translate to Chinese
zhi run translate-proofread article-en.md

# Specify target language
zhi run translate-proofread article.md --to english

Output: Two files -- the polished translation and a quality report with the detected source language, translation quality score, issues found and fixed, and passages that may need human review.

Appendix¶

Pipe Mode¶

Pipe text directly into zhi:

echo "Translate to English: hello world" | zhi

cat article.txt | zhi

Debug Mode¶

Enable debug logging to troubleshoot issues:

zhi --debug

Disabling Color Output¶

For terminals that do not support color:

zhi --no-color

Or set the environment variable:

export NO_COLOR=1

Configuration Reference¶

The config file is located in your system config directory as config.yaml:

api_key: "your-api-key"
default_model: "glm-5"
skill_model: "glm-4-flash"
output_dir: "zhi-output"
max_turns: 30
log_level: "INFO"

Config file locations:

macOS: ~/Library/Application Support/zhi/config.yaml
Linux: ~/.config/zhi/config.yaml
Windows: %APPDATA%\zhi\config.yaml
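
If a script needs to locate the config file, the three locations above can be computed as follows. `config_path` is a hypothetical helper written for this tutorial; it ignores overrides such as `XDG_CONFIG_HOME`, which zhi may or may not honor.

```python
import os
import sys
from pathlib import Path
from typing import Optional


def config_path(platform: str = sys.platform,
                home: Optional[Path] = None,
                appdata: Optional[str] = None) -> Path:
    """Return the documented config.yaml location for a platform."""
    home = Path.home() if home is None else home
    if platform == "darwin":
        return home / "Library" / "Application Support" / "zhi" / "config.yaml"
    if platform == "win32":
        # APPDATA is normally set on Windows; a KeyError here means it is not.
        appdata = appdata or os.environ["APPDATA"]
        return Path(appdata) / "zhi" / "config.yaml"
    # Everything else is treated like Linux.
    return home / ".config" / "zhi" / "config.yaml"
```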

Environment variable overrides:

Variable	Config Field
`ZHI_API_KEY`	`api_key`
`ZHI_DEFAULT_MODEL`	`default_model`
`ZHI_OUTPUT_DIR`	`output_dir`
`ZHI_LOG_LEVEL`	`log_level`

The environment variable ZHI_API_KEY takes priority over the config file.
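
The override behavior can be sketched as a merge in which environment values win over file values. `apply_env_overrides` is a hypothetical function, and treating an empty variable as unset is an assumption.

```python
import os

# Mapping from the table above.
ENV_OVERRIDES = {
    "ZHI_API_KEY": "api_key",
    "ZHI_DEFAULT_MODEL": "default_model",
    "ZHI_OUTPUT_DIR": "output_dir",
    "ZHI_LOG_LEVEL": "log_level",
}


def apply_env_overrides(config: dict, environ=None) -> dict:
    """Return a copy of config with environment variables applied on top."""
    environ = os.environ if environ is None else environ
    merged = dict(config)
    for variable, field in ENV_OVERRIDES.items():
        value = environ.get(variable)
        if value:  # assumption: empty values are ignored
            merged[field] = value
    return merged


file_config = {"api_key": "from-file", "default_model": "glm-5"}
effective = apply_env_overrides(file_config, {"ZHI_API_KEY": "from-env"})
print(effective["api_key"])
```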

FAQ¶

Q: "No API key configured" error

Run zhi --setup or set the ZHI_API_KEY environment variable.

Q: File write fails with "File already exists"

file_write cannot overwrite existing files. Delete or rename the file in zhi-output/ and try again.

Q: OCR returns empty results

Confirm the file format is supported (PDF, PNG, JPG, JPEG, GIF, WEBP) and the file is under 20MB. Image clarity affects recognition quality.

Q: Shell command blocked

Certain dangerous commands (like rm -rf /) are permanently blocked. This is a safety feature that cannot be bypassed.

Q: Excel/Word output fails

Excel (.xlsx) and Word (.docx) are included in the default install. Make sure you're on the latest version: pip install --upgrade zhicli.