The best developer workflow is the one where you describe what you want, walk away, and come back to a working product. GStack brings that goal closer than ever.
最好的開發工作流程是:你描述需求、離開、回來時產品已經完成。GStack 讓這個目標比以往更近。
Why This Matters
Claude Code is powerful, but running it through a full development cycle (planning, building, reviewing, testing, shipping) still requires you to sit there and type commands at every step. That is a lot of manual intervention for what should be an automated pipeline.
GStack Skills solve this by chaining those steps together. Each skill handles one phase and automatically passes its output to the next. When combined with Conductor for parallel sessions and the right Claude Code flags for headless execution, you get a system that can run multiple sprints with minimal human intervention.
This guide covers exactly how to set that up.
為什麼這很重要
Claude Code 很強大,但要跑完整個開發週期——規劃、建構、審查、測試、發布——你仍然需要坐在那裡,每一步都手動輸入指令。對於一個應該自動化的流水線來說,這實在有太多人工介入了。
GStack Skills 透過把這些步驟串聯起來解決這個問題。每個 skill 處理一個階段,並自動將產出傳給下一個階段。搭配 Conductor 的平行 session 和正確的 Claude Code headless 旗標,你就能得到一個只需最少人工介入就能跑完多個 Sprint 的系統。
這篇指南會完整說明如何設定。
The GStack Sprint Pipeline
GStack organizes development into a 7-phase pipeline. Each phase maps to one or more skills:
| Phase | Skill | What It Does |
|---|---|---|
| Think | /office-hours | Product thinking — clarifies requirements, surfaces edge cases |
| Plan | /autoplan | Chains CEO, Design, and Eng reviews into one automated sequence |
| Build | (manual or automated) | Implementation phase — writes the actual code |
| Review | /review | Auto-fixes obvious issues, only escalates ambiguous judgment calls |
| Test | /qa | Diff-aware browser testing, auto-generates regression tests |
| Ship | /ship | Syncs main, runs tests, audits coverage, pushes, opens PR |
| Reflect | /retro | Retrospective metrics — what went well, what to improve |
The key insight: these skills are designed to chain. The output of /autoplan feeds into the build phase. The test plan from /plan-eng-review (part of autoplan) feeds into /qa. The review results feed back into the build. Each handoff is automatic.
GStack Sprint 流水線
GStack 將開發組織成 7 個階段的流水線。每個階段對應一個或多個 skill:
| 階段 | Skill | 功能 |
|---|---|---|
| Think | /office-hours | 產品思考——釐清需求、發現邊界案例 |
| Plan | /autoplan | 將 CEO、Design、Eng 三輪審查串成一個自動化流程 |
| Build | (手動或自動) | 實作階段——撰寫實際程式碼 |
| Review | /review | 自動修復明顯問題,僅將模糊的判斷性問題升級給人類 |
| Test | /qa | Diff 感知的瀏覽器測試,自動產生迴歸測試 |
| Ship | /ship | 同步 main、執行測試、審核覆蓋率、推送、開 PR |
| Reflect | /retro | 回顧指標——哪些做得好、哪些需要改進 |
關鍵洞察:這些 skill 天生就是為了串聯而設計的。/autoplan 的產出餵進 build 階段。/plan-eng-review(autoplan 的一部分)產生的測試計劃餵進 /qa。審查結果回饋到 build。每一次交接都是自動的。
/autoplan: The Auto-Pilot for Planning
/autoplan is the single most important skill for minimizing human intervention. Here is what happens when you invoke it:
- CEO Review: evaluates the plan from a product perspective. Is this solving the right problem? Are priorities correct?
- Design Review: checks UX implications, consistency, accessibility concerns.
- Eng Review: produces a detailed technical plan, including a test plan artifact that /qa will later consume.
All three reviews happen in sequence, automatically. The skill only surfaces taste decisions — choices where there is no objectively correct answer and human judgment is needed.
If you are working solo and the decisions are straightforward, you can skip the interruptions entirely:
gstack-config set skip_eng_review true
This tells the pipeline to proceed without waiting for human approval at the engineering review gate. The plan is still generated; it just does not pause for your sign-off.
Completion Status Protocol
Every GStack skill reports its exit status using a standard protocol:
- DONE — completed successfully, no issues
- DONE_WITH_CONCERNS — completed but flagged items for your attention
- BLOCKED — cannot proceed without human input
- NEEDS_CONTEXT — missing information, will ask a specific question
When running automated pipelines, you want to see DONE or DONE_WITH_CONCERNS. If the pipeline hits BLOCKED, that is your signal to step in.
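A wrapper script can act on these statuses. The sketch below assumes, hypothetically, that a skill's final status shows up as a bare token somewhere in its text output; the exact output format is not specified here, and the helper names `extract_status` and `should_continue` are illustrative, not part of GStack.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; GStack does not document these names.

# Print the last status token found on stdin (assumes the token appears
# as a whole word in the skill's output).
extract_status() {
  grep -owE 'DONE_WITH_CONCERNS|NEEDS_CONTEXT|BLOCKED|DONE' | tail -n 1
}

# Succeed when an automated pipeline may proceed to the next phase.
should_continue() {
  case "$1" in
    DONE|DONE_WITH_CONCERNS) return 0 ;;
    *) return 1 ;;   # BLOCKED, NEEDS_CONTEXT, or no status found
  esac
}

status="$(printf 'review finished\nSTATUS: DONE_WITH_CONCERNS\n' | extract_status)"
if should_continue "$status"; then
  echo "continue ($status)"
else
  echo "stop for human input ($status)"
fi
```

Treating a missing status the same as BLOCKED is a deliberately conservative choice: if the script cannot tell what happened, it stops.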
/autoplan:規劃的自動駕駛
/autoplan 是最小化人工介入中最重要的一個 skill。當你呼叫它時,會發生以下事情:
- CEO Review:從產品角度評估計劃。這是在解決正確的問題嗎?優先順序對嗎?
- Design Review:檢查 UX 影響、一致性、無障礙性問題。
- Eng Review:產生詳細的技術計劃,包含一份測試計劃 artifact,之後會被 /qa 消費。
三輪審查自動依序執行。這個 skill 只會浮出品味決策——那些沒有客觀正確答案、需要人類判斷的選擇。
如果你是獨立作業,而且決策都很直覺,你可以完全跳過這些中斷:
gstack-config set skip_eng_review true
這告訴流水線在工程審查閘門不需等待人類核准就繼續進行。計劃仍然會被產生,只是不會暫停等你簽核。
完成狀態協定
每個 GStack skill 都使用標準協定回報退出狀態:
- DONE ——成功完成,沒有問題
- DONE_WITH_CONCERNS ——完成但標記了需要注意的項目
- BLOCKED ——無法繼續,需要人類輸入
- NEEDS_CONTEXT ——缺少資訊,會問一個具體的問題
在跑自動化流水線時,你希望看到 DONE 或 DONE_WITH_CONCERNS。如果流水線碰到 BLOCKED,那就是你需要介入的信號。
/review + /qa: Automated Quality Gates
/review — The Auto-Fixer
/review does not just find problems — it fixes them. When invoked:
- It scans the codebase changes for issues.
- Obvious fixes (formatting, naming, missing imports) are applied automatically.
- Ambiguous judgment calls (architecture decisions, API design trade-offs) are escalated to the human.
The skill has a built-in risk budget: it caps at 30 auto-fixes per run and stops entirely if the cumulative risk exceeds 20%. This prevents runaway "fix cascades" where one change triggers a chain of increasingly questionable modifications.
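To make the two limits concrete, here is a minimal sketch of a budget check. The thresholds come from the text above; everything else is an assumption. In particular, how /review actually scores risk is not described, so the sketch simply takes cumulative risk as an integer percentage, and `can_apply_fix` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Illustrative only: the real /review internals are not documented here.
MAX_FIXES=30        # cap on auto-fixes per run (from the text)
MAX_RISK_PCT=20     # cumulative risk ceiling in percent (from the text)

# can_apply_fix APPLIED_SO_FAR CUMULATIVE_RISK_PCT
# Succeeds while one more auto-fix still fits inside the budget.
can_apply_fix() {
  local applied="$1" risk_pct="$2"
  [ "$applied" -lt "$MAX_FIXES" ] && [ "$risk_pct" -le "$MAX_RISK_PCT" ]
}

if can_apply_fix 12 8; then
  echo "apply next fix"
else
  echo "stop and report remaining issues"
fi
```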
/qa — Diff-Aware Testing
/qa is smarter than "run all tests." It:
- Picks up the test plan artifact generated by /plan-eng-review during the autoplan phase.
- Analyzes the actual diff to determine which tests are relevant.
- Runs targeted browser tests using its own Chromium instance.
- Auto-generates regression tests for new functionality.
The diff-awareness is critical for speed. In a large project, running the full test suite after every change is slow. /qa runs only what matters.
How They Chain Together
In a typical automated sprint:
/autoplan → build → /review → /qa → /ship
If /review finds and fixes issues, those fixes feed back into /qa. If /qa discovers regressions, they feed back into the build phase. The loop continues until both gates pass cleanly, or until a BLOCKED status requires human attention.
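The stop-on-BLOCKED behavior can be sketched as a small driver loop. This is a simplified, linear pass under stated assumptions: `run_phase` is a hypothetical stand-in that always reports DONE (a real version might shell out to `claude -p` and parse the status), and the feedback loops between /review, /qa, and build are not modeled.

```shell
#!/usr/bin/env bash
# run_phase is a hypothetical stand-in. A real version might call
# `claude -p "Run /$1"` and extract the status from its output.
run_phase() {
  echo "DONE"
}

# Run phases in order; stop at the first status that needs a human.
run_pipeline() {
  local phase status
  for phase in "$@"; do
    status="$(run_phase "$phase")"
    case "$status" in
      DONE|DONE_WITH_CONCERNS)
        echo "$phase: $status" ;;
      *)
        echo "$phase: $status, waiting for human input"
        return 1 ;;
    esac
  done
}

run_pipeline autoplan build review qa ship
```

Returning a non-zero exit code on the first blocking status lets an outer script (or a CI job) decide whether to notify someone or retry.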
/review + /qa:自動化品質閘門
/review——自動修復器
/review 不只是找問題——它會修復問題。當被呼叫時:
- 掃描程式碼變更中的問題。
- 明顯的修復(格式化、命名、缺少 import)會自動套用。
- 模糊的判斷性問題(架構決策、API 設計取捨)才會升級給人類。
這個 skill 有內建的風險預算:每次執行最多 30 個自動修復,如果累積風險超過 20% 就完全停止。這能防止失控的「修復連鎖反應」——一個變更觸發一連串越來越可疑的修改。
/qa——Diff 感知測試
/qa 比「跑所有測試」聰明得多。它會:
- 取得 autoplan 階段中 /plan-eng-review 產生的測試計劃 artifact。
- 分析實際的 diff 來決定哪些測試是相關的。
- 使用自己的 Chromium 實例執行針對性的瀏覽器測試。
- 為新功能自動產生迴歸測試。
Diff 感知對速度至關重要。在大型專案中,每次變更後跑完整測試套件很慢。/qa 只跑相關的部分。
它們如何串聯
在典型的自動化 Sprint 中:
/autoplan → build → /review → /qa → /ship
如果 /review 發現並修復了問題,這些修復會回饋給 /qa。如果 /qa 發現迴歸,會回饋到 build 階段。這個循環持續進行,直到兩個閘門都乾淨通過,或直到 BLOCKED 狀態需要人類注意。
/ship: One-Command Release
/ship handles the entire release process in a single invocation:
- Syncs with main — pulls latest changes, rebases if needed
- Runs the full test suite — not just diff-aware tests, but everything
- Audits test coverage — flags if coverage dropped below threshold
- Pushes to remote — commits and pushes to the feature branch
- Opens a Pull Request — with auto-generated description based on the sprint's changes
This is the "last mile" skill. By the time you reach /ship, the code has already passed /review and /qa. Ship is the final sanity check before the PR goes up.
The entire Think-to-Ship pipeline can run without human intervention if all status codes come back as DONE.
/ship:一鍵發布
/ship 在一次呼叫中處理整個發布流程:
- 同步 main ——拉取最新變更,必要時 rebase
- 執行完整測試套件 ——不只是 diff 感知測試,而是全部
- 審核測試覆蓋率 ——如果覆蓋率低於閾值就標記
- 推送到遠端 ——commit 並 push 到 feature branch
- 開啟 Pull Request ——根據 Sprint 的變更自動產生描述
這是「最後一哩路」的 skill。當你到達 /ship 時,程式碼已經通過了 /review 和 /qa。Ship 是 PR 上去之前的最終健全性檢查。
如果所有狀態碼都回傳 DONE,整個 Think 到 Ship 的流水線可以在沒有人工介入的情況下完成。
Conductor: Running Sprints in Parallel
A single GStack sprint is useful. Running multiple sprints in parallel is transformative. That is what Conductor does.
What Conductor Provides
- Up to 10 parallel Claude Code sessions simultaneously
- Each session gets its own Chromium instance, cookies, and workspace
- Git worktree isolation per session, so parallel sessions never touch each other's working files
- Random port allocation (10000-60000) eliminates port collisions
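How Conductor picks ports internally is not described here. As a rough illustration of drawing from the stated 10000-60000 range, the following sketch uses bash's `$RANDOM`; the function name `random_port` and the `PORT` variable are hypothetical, and the sketch does not check whether the chosen port is actually free.

```shell
#!/usr/bin/env bash
# Pick a pseudo-random port in [10000, 60000]. $RANDOM only spans 0..32767,
# so two draws are combined to cover all 50001 possible values.
# The result is stored in PORT rather than echoed, which avoids running
# the random draw inside a command-substitution subshell.
random_port() {
  PORT=$(( 10000 + (RANDOM * 32768 + RANDOM) % 50001 ))
}

for session in 1 2 3; do
  random_port
  echo "session $session -> port $PORT"
done
```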
How Worktree Isolation Works
Each Conductor session operates in its own git worktree. A worktree is an additional working directory attached to the same repository, with its own checked-out branch:
project/
.git/ # shared git database
worktree-session-1/ # Sprint 1: new auth flow
worktree-session-2/ # Sprint 2: dashboard redesign
worktree-session-3/ # Sprint 3: API refactor
Each worktree has its own working directory and index. Sessions can commit, branch, and modify files without interfering with each other. When a sprint completes and passes /ship, its changes merge back to main through a PR, just like a human developer working on a feature branch.
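You can reproduce this layout by hand with plain git, which is a useful way to see what the isolation amounts to. The sketch below builds a throwaway repository in a temporary directory; the branch names (`sprint-1` and so on) are illustrative assumptions, and Conductor may name and place its worktrees differently.

```shell
#!/usr/bin/env bash
set -eu

# Create a scratch repository with one commit to branch from.
repo="$(mktemp -d)/project"
mkdir -p "$repo"
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "initial commit"

# One worktree per session, each on its own new branch.
for n in 1 2 3; do
  git worktree add -b "sprint-$n" "worktree-session-$n"
done

# Lists the main checkout plus the three session worktrees.
git worktree list
```

All four checkouts share one object database under `.git/`, so creating a worktree is much cheaper than cloning the repository again.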
Practical Example: Three Parallel Sprints
Session 1: /autoplan → build auth flow → /review → /qa → /ship
Session 2: /autoplan → build dashboard → /review → /qa → /ship
Session 3: /autoplan → build API layer → /review → /qa → /ship
All three run simultaneously. Session 1 might finish in 20 minutes while Session 3 takes 45 minutes. Each operates independently. If Session 2 hits a BLOCKED status, it waits for human input while the others continue.
What You Can Parallelize
The most effective Conductor patterns:
- One session running /qa while another does /review while a third implements the next feature
- Multiple independent features each running the full pipeline in their own session
- Competitive builds — same spec, multiple approaches, pick the best result (this is what gstack-auto formalizes)
Conductor:平行跑 Sprint
單一 GStack Sprint 已經很有用。平行跑多個 Sprint 則是質的飛躍。這就是 Conductor 的作用。
Conductor 提供什麼
- 同時最多 10 個平行 Claude Code session
- 每個 session 擁有自己的 Chromium 實例、cookies 和工作空間
- 每個 session 有 Git worktree 隔離:平行 session 不會動到彼此的工作檔案
- 隨機 port 分配(10000-60000)消除 port 碰撞
Worktree 隔離如何運作
每個 Conductor session 在自己的 git worktree 中運作。Worktree 是附加在同一個 repository 上的額外工作目錄,各自 checkout 自己的分支:
project/
.git/ # 共享的 git 資料庫
worktree-session-1/ # Sprint 1:新的認證流程
worktree-session-2/ # Sprint 2:儀表板重設計
worktree-session-3/ # Sprint 3:API 重構
每個 worktree 有自己的工作目錄和 index。Session 之間可以 commit、建立分支、修改檔案而不會互相干擾。當一個 Sprint 完成並通過 /ship 時,它的變更透過 PR 合併回 main,就像一個人類開發者在 feature branch 上工作一樣。
實際範例:三個平行 Sprint
Session 1: /autoplan → build 認證流程 → /review → /qa → /ship
Session 2: /autoplan → build 儀表板 → /review → /qa → /ship
Session 3: /autoplan → build API 層 → /review → /qa → /ship
三個同時運行。Session 1 可能 20 分鐘完成,Session 3 可能需要 45 分鐘。各自獨立運作。如果 Session 2 碰到 BLOCKED 狀態,它會等待人類輸入,而其他的繼續進行。
什麼可以平行化
最有效的 Conductor 模式:
- 一個 session 跑 /qa,同時另一個做 /review,第三個實作下一個功能
- 多個獨立功能各自在自己的 session 中跑完整流水線
- 競爭式建構 ——相同規格、多種方式、選最好的結果(這就是 gstack-auto 正式化的做法)
Going Full Auto: gstack-auto Competitive Builds
gstack-auto (community tool by loperanger7) takes the parallel concept to its logical extreme: instead of running different features in parallel, it runs the same feature multiple times and picks the best result.
How It Works
- You provide a product spec as input.
- gstack-auto spawns N parallel builds (default: 3), each in its own git worktree.
- Each build runs the full GStack pipeline independently.
- Each result is scored on 5 quality dimensions.
- The highest-scoring version wins and becomes the baseline for the next round.
The Math Behind It
This is where competitive builds get interesting. If a single AI agent has a 25% chance of producing a viable solution:
- 1 agent: 25% success probability
- 4 parallel agents: 68% probability of at least one viable solution
- 10 parallel agents: 94% probability
The formula: 1 - (1 - p)^n where p is per-agent success rate and n is number of agents.
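The figures above can be checked with a few lines of shell. Note that the formula assumes the attempts are independent of each other, which parallel agents working from the same spec only approximate. The function name `at_least_one` is illustrative.

```shell
#!/usr/bin/env bash
# at_least_one P N: percent chance that at least one of N independent
# attempts succeeds, given per-attempt success probability P (0..1).
at_least_one() {
  awk -v p="$1" -v n="$2" 'BEGIN { printf "%.0f\n", (1 - (1 - p) ^ n) * 100 }'
}

for n in 1 4 10; do
  echo "$n agent(s): $(at_least_one 0.25 "$n")%"
done
```

With p = 0.25 this prints 25%, 68%, and 94% for 1, 4, and 10 agents, matching the list above.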
You are not betting on one agent being perfect. You are betting on at least one out of many being good enough.
Progressive Quality Improvement
gstack-auto runs in rounds. Each round's winner becomes the starting point for the next:
| Round | Best Score | What Happened |
|---|---|---|
| 1 | 7.2/10 | Three parallel builds from scratch |
| 2 | 8.4/10 | Three builds starting from Round 1 winner |
| 3 | 9.1/10 | Three builds starting from Round 2 winner |
Quality converges upward with each round.
Zero-Intervention Configuration
The key setting for fully automated operation:
auto_accept_winner: true
With this flag, gstack-auto does not pause between rounds to ask if you want to proceed. It scores, picks the winner, and immediately starts the next round. You provide the spec, walk away, and come back to the best result from however many rounds you configured.
全自動:gstack-auto 競爭式建構
gstack-auto(loperanger7 的社群工具)將平行概念推向邏輯上的極致:不是平行跑不同功能,而是同一個功能跑多次,然後選最好的結果。
運作方式
- 你提供一份產品規格作為輸入。
- gstack-auto 產生 N 個平行建構(預設:3),每個在自己的 git worktree 中。
- 每個建構獨立跑完整的 GStack 流水線。
- 每個結果在 5 個品質維度上被評分。
- 最高分的版本勝出,成為下一輪的基線。
背後的數學
這就是競爭式建構有趣的地方。如果單一 AI agent 有 25% 的機率產出可行方案:
- 1 個 agent:25% 成功機率
- 4 個平行 agent:68% 的機率至少有一個可行方案
- 10 個平行 agent:94% 的機率
公式:1 - (1 - p)^n,其中 p 是單一 agent 成功率,n 是 agent 數量。
你不是在賭某一個 agent 完美。你是在賭多個中至少有一個夠好。
漸進式品質提升
gstack-auto 以輪次運行。每一輪的贏家成為下一輪的起點:
| 輪次 | 最佳分數 | 發生了什麼 |
|---|---|---|
| 1 | 7.2/10 | 三個平行建構從零開始 |
| 2 | 8.4/10 | 三個建構從第 1 輪贏家開始 |
| 3 | 9.1/10 | 三個建構從第 2 輪贏家開始 |
品質隨著每一輪向上收斂。
零介入設定
全自動運作的關鍵設定:
auto_accept_winner: true
有了這個旗標,gstack-auto 不會在輪次之間暫停詢問你是否要繼續。它評分、選出贏家、立即開始下一輪。你提供規格、離開、回來時得到的就是你設定的輪數中最好的結果。
Claude Code Headless Configuration
To run GStack pipelines without human interaction, you need to configure Claude Code for headless (non-interactive) operation.
Essential Flags
-p (print mode): the most important flag. Runs Claude Code non-interactively with a single prompt:
claude -p "Run /autoplan for the auth feature spec in specs/auth.md"
--allowedTools: granular tool allowlist. This is the recommended approach for controlling what Claude can do:
claude -p "Run /review" --allowedTools "Bash(git*),Read,Write,Edit"
--max-turns: prevents runaway loops. Set this to a reasonable number based on the complexity of the task:
claude -p --max-turns 50 "Run /qa on the latest changes"
--continue: resumes the most recent conversation in the current directory. Useful for multi-step pipelines where you want to chain commands:
claude -p --continue "Now run /ship"
Project-Level Permission Configuration
Instead of passing flags every time, configure permissions in .claude/settings.json:
{
"permissions": {
"allow": [
"Bash(npm test*)",
"Bash(npm run*)",
"Bash(git add*)",
"Bash(git commit*)",
"Bash(git push*)",
"Read",
"Write",
"Edit"
],
"deny": [
"Bash(rm -rf*)",
"Bash(sudo*)",
"Bash(curl*|*bash)"
]
}
}
This gives the agent enough permissions to run the full GStack pipeline (build, test, commit, push) while blocking obviously dangerous operations.
The Nuclear Option: --dangerously-skip-permissions
claude -p --dangerously-skip-permissions "Run full pipeline"
This bypasses ALL permission prompts. Only use this inside containers or VMs, never on your development machine. One wrong command and your entire filesystem is fair game.
The Safe Middle Ground: --bare
claude -p --bare "Run /review"
--bare skips auto-discovery of hooks, skills, and CLAUDE.md. Useful when you want a clean environment without any project-specific configuration interfering.
Claude Code Headless 設定
要讓 GStack 流水線在無人互動的情況下運行,你需要將 Claude Code 設定為 headless(非互動式)模式。
必要旗標
-p(print mode):最重要的旗標。以非互動方式執行 Claude Code,使用單一 prompt:
claude -p "Run /autoplan for the auth feature spec in specs/auth.md"
--allowedTools:細粒度的工具允許清單。這是控制 Claude 能做什麼的推薦方式:
claude -p "Run /review" --allowedTools "Bash(git*),Read,Write,Edit"
--max-turns:防止失控迴圈。根據任務複雜度設定合理的數字:
claude -p --max-turns 50 "Run /qa on the latest changes"
--continue:恢復目前目錄中最近一次的對話。適合多步驟流水線中你想串聯指令的情境:
claude -p --continue "Now run /ship"
專案級權限設定
與其每次都傳旗標,不如在 .claude/settings.json 中設定權限:
{
"permissions": {
"allow": [
"Bash(npm test*)",
"Bash(npm run*)",
"Bash(git add*)",
"Bash(git commit*)",
"Bash(git push*)",
"Read",
"Write",
"Edit"
],
"deny": [
"Bash(rm -rf*)",
"Bash(sudo*)",
"Bash(curl*|*bash)"
]
}
}
這給了 agent 足夠的權限來跑完整的 GStack 流水線(建構、測試、commit、push),同時阻擋明顯危險的操作。
核彈選項:--dangerously-skip-permissions
claude -p --dangerously-skip-permissions "Run full pipeline"
這會繞過所有權限提示。只在容器或虛擬機中使用,絕對不要在你的開發機器上用。一個錯誤的指令,你整個檔案系統就任人宰割了。
安全的中間方案:--bare
claude -p --bare "Run /review"
--bare 跳過 hooks、skills 和 CLAUDE.md 的自動探索。當你想要一個乾淨的環境、不受任何專案特定設定干擾時很有用。
Safety Guardrails
Automation without guardrails is a disaster waiting to happen. GStack provides multiple layers of protection.
Skill-Level Guards
/careful: warns before any destructive operation (file deletion, force push, database writes):
/careful
/freeze: restricts all edits to a single directory. Everything outside that directory is read-only:
/freeze src/auth/
/guard: combines both /careful and /freeze into one command:
/guard src/auth/
/review Risk Budget
The /review skill has built-in limits:
- Maximum 30 auto-fixes per run
- Stops if cumulative risk exceeds 20% — prevents cascading changes
If the risk budget is exceeded, /review reports DONE_WITH_CONCERNS and lists the remaining issues for human review instead of attempting to fix them.
PreToolUse Hooks
For fine-grained control, configure PreToolUse hooks in your project to intercept and block specific operations before they execute. This is the lowest-level safety net — it catches things that even /careful might miss.
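As a rough idea of what such a hook might look like, here is a minimal sketch of a script that rejects obviously destructive shell commands. It rests on several assumptions you should verify against the current Claude Code hooks documentation: that the hook receives the tool call as JSON on stdin with the command under `tool_input.command`, that exiting with code 2 blocks the call, and that the script is registered for the Bash tool under the PreToolUse hooks in .claude/settings.json. The pattern list and the `--hook` argument are illustrative, and hook mode needs `jq` installed.

```shell
#!/usr/bin/env bash
# Sketch of a PreToolUse hook for Bash tool calls. The pattern list is an
# illustrative assumption, not a complete safety policy.

# Succeed when the command string looks destructive.
is_destructive() {
  case "$1" in
    *"rm -rf"*|*"git push --force"*|*"git push -f"*|*"sudo "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hook mode: read the tool-call JSON from stdin (needs jq) and block on a match.
if [ "${1:-}" = "--hook" ]; then
  cmd="$(jq -r '.tool_input.command // empty')"
  if is_destructive "$cmd"; then
    echo "Blocked by PreToolUse hook: $cmd" >&2
    exit 2   # assumed contract: exit code 2 rejects the tool call
  fi
  exit 0
fi
```

Substring matching like this is easy to evade, so treat it as a last line of defense alongside the permission rules and container isolation rather than a replacement for them.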
Container Isolation
For fully automated pipelines (especially with --dangerously-skip-permissions), always run inside a container:
- Docker containers with mounted volumes for the project
- Ephemeral VMs that get destroyed after each pipeline run
- CI/CD environments with restricted network access
The rule is simple: the more autonomous the agent, the more isolated the environment.
安全防護機制
沒有防護機制的自動化就是等著出事。GStack 提供多層保護。
Skill 級別的防護
/careful:在任何破壞性操作(檔案刪除、force push、資料庫寫入)之前發出警告:
/careful
/freeze:限制所有編輯到單一目錄。該目錄以外的所有內容都是唯讀的:
/freeze src/auth/
/guard:將 /careful 和 /freeze 合併為一個指令:
/guard src/auth/
/review 風險預算
/review skill 有內建限制:
- 每次執行最多 30 個自動修復
- 累積風險超過 20% 就停止 ——防止連鎖變更
如果風險預算被超出,/review 會回報 DONE_WITH_CONCERNS 並列出剩餘問題讓人類審查,而不是嘗試修復。
PreToolUse Hooks
如需更細粒度的控制,在專案中設定 PreToolUse hooks,在特定操作執行前攔截並阻擋。這是最底層的安全網——它能捕捉到連 /careful 都可能遺漏的東西。
容器隔離
對於全自動化流水線(特別是使用 --dangerously-skip-permissions 時),務必在容器中運行:
- Docker 容器,掛載專案的 volume
- 臨時 VM,每次流水線跑完後銷毀
- CI/CD 環境,限制網路存取
規則很簡單:agent 越自主,環境就越需要隔離。
Practical Recipes
Here are three concrete automation recipes at different autonomy levels.
Recipe 1: Semi-Automated Sprint (Low Autonomy)
Human stays in the loop for planning and review. Agent handles build and testing.
# Step 1: Human-guided planning
claude -p "Run /office-hours for the feature described in specs/search.md"
# Human reviews output, makes decisions
# Step 2: Automated planning with human approval
claude -p --continue "Run /autoplan"
# Human reviews plan, approves or adjusts
# Step 3: Automated build + test loop
claude -p --continue --max-turns 100 \
"Build the feature, then run /review and /qa. Fix any issues found."
# Step 4: Human reviews, then ships
claude -p --continue "Run /ship"
Human touchpoints: 3 (after office-hours, after autoplan, before ship)
Recipe 2: Automated Sprint with Quality Gates (Medium Autonomy)
Agent runs the full pipeline. Human only intervenes if something is BLOCKED.
claude -p "Run the full GStack sprint for specs/search.md:
1. /autoplan (skip eng review)
2. Build the feature
3. /review
4. /qa
5. If review and qa pass, /ship
Report BLOCKED if you need human input at any point." \
--max-turns 200 \
--allowedTools "Bash(npm*),Bash(git*),Read,Write,Edit"
Pre-configure for solo mode:
gstack-config set skip_eng_review true
gstack-config set proactive false
Human touchpoints: 0-1 (only if BLOCKED)
Recipe 3: Competitive Parallel Builds (Maximum Autonomy)
Let gstack-auto handle everything. Multiple parallel builds, automatic scoring, progressive improvement.
# gstack-auto configuration
cat > gstack-auto.config.yaml << 'EOF'
spec_file: specs/search.md
parallel_builds: 3
rounds: 3
auto_accept_winner: true
pipeline:
- autoplan
- build
- review
- qa
- ship
scoring:
- correctness
- performance
- code_quality
- test_coverage
- ux_polish
EOF
# Launch
gstack-auto run --config gstack-auto.config.yaml
Human touchpoints: 0 (spec provided upfront, everything else is automated)
Choosing the Right Level
| Situation | Recipe |
|---|---|
| High-stakes production feature | Recipe 1 — keep human in the loop |
| Internal tool or low-risk feature | Recipe 2 — automated with safety gates |
| Prototype or exploratory work | Recipe 3 — let the machines compete |
Start with Recipe 1. As you build trust in the system, graduate to Recipe 2. Use Recipe 3 for greenfield projects where exploration matters more than control.
實戰食譜
以下是三個不同自主程度的具體自動化食譜。
食譜 1:半自動 Sprint(低自主度)
人類在規劃和審查環節保持介入。Agent 處理建構和測試。
# 步驟 1:人類引導的規劃
claude -p "Run /office-hours for the feature described in specs/search.md"
# 人類審查輸出,做決策
# 步驟 2:自動規劃,人類核准
claude -p --continue "Run /autoplan"
# 人類審查計劃,核准或調整
# 步驟 3:自動建構 + 測試循環
claude -p --continue --max-turns 100 \
"Build the feature, then run /review and /qa. Fix any issues found."
# 步驟 4:人類審查後發布
claude -p --continue "Run /ship"
人類觸及點:3 個(office-hours 之後、autoplan 之後、ship 之前)
食譜 2:帶品質閘門的自動 Sprint(中自主度)
Agent 跑完整個流水線。人類只在 BLOCKED 時介入。
claude -p "Run the full GStack sprint for specs/search.md:
1. /autoplan (skip eng review)
2. Build the feature
3. /review
4. /qa
5. If review and qa pass, /ship
Report BLOCKED if you need human input at any point." \
--max-turns 200 \
--allowedTools "Bash(npm*),Bash(git*),Read,Write,Edit"
預先設定 solo 模式:
gstack-config set skip_eng_review true
gstack-config set proactive false
人類觸及點:0-1 個(只在 BLOCKED 時)
食譜 3:競爭式平行建構(最大自主度)
讓 gstack-auto 處理一切。多個平行建構、自動評分、漸進式改善。
# gstack-auto 設定
cat > gstack-auto.config.yaml << 'EOF'
spec_file: specs/search.md
parallel_builds: 3
rounds: 3
auto_accept_winner: true
pipeline:
- autoplan
- build
- review
- qa
- ship
scoring:
- correctness
- performance
- code_quality
- test_coverage
- ux_polish
EOF
# 啟動
gstack-auto run --config gstack-auto.config.yaml
人類觸及點:0 個(規格預先提供,其他一切自動化)
選擇正確的層級
| 情境 | 食譜 |
|---|---|
| 高風險的生產功能 | 食譜 1 ——保持人類在迴圈中 |
| 內部工具或低風險功能 | 食譜 2 ——帶安全閘門的自動化 |
| 原型或探索性工作 | 食譜 3 ——讓機器競爭 |
從食譜 1 開始。隨著你對系統建立信任,逐漸升級到食譜 2。食譜 3 用於探索比控制更重要的全新專案。
Summary
The path from "typing every command manually" to "providing a spec and walking away" is not a single leap — it is a gradient with clear steps:
- Learn the GStack pipeline: Think, Plan, Build, Review, Test, Ship, Reflect. Understand what each skill does and how they chain.
- Use /autoplan to collapse three review rounds into one automated sequence. Configure skip_eng_review for solo work.
- Let /review and /qa be your automated quality gates. They fix what they can and escalate what they cannot.
- Use /ship as your one-command release. The pipeline handles sync, test, coverage, push, and PR.
- Scale with Conductor when you need parallel sprints. Worktree isolation keeps sessions from interfering with each other.
- Go full auto with gstack-auto for competitive parallel builds. Multiple attempts, automatic scoring, progressive improvement.
- Configure Claude Code headless flags (-p, --allowedTools, --max-turns) to enable non-interactive execution.
- Never skip safety: /guard for skill-level protection, risk budgets for /review, containers for --dangerously-skip-permissions.
The goal is not to remove humans from development entirely. It is to move humans from doing the work to setting the direction and reviewing the results. GStack makes that shift practical today.
總結
從「手動輸入每個指令」到「提供規格然後走開」不是一步到位——而是一個有明確步驟的漸進過程:
- 學會 GStack 流水線:Think、Plan、Build、Review、Test、Ship、Reflect。理解每個 skill 做什麼以及它們如何串聯。
- 使用 /autoplan 將三輪審查壓縮成一個自動化流程。獨立作業時設定 skip_eng_review。
- 讓 /review 和 /qa 成為你的自動化品質閘門。它們修復能修的,升級不能修的。
- 使用 /ship 作為一鍵發布。流水線處理同步、測試、覆蓋率、推送和 PR。
- 需要平行 Sprint 時,用 Conductor 擴展。Worktree 隔離讓各 session 互不干擾。
- 用 gstack-auto 全自動化,進行競爭式平行建構。多次嘗試、自動評分、漸進式改善。
- 設定 Claude Code headless 旗標(-p、--allowedTools、--max-turns),啟用非互動式執行。
- 永遠不要跳過安全機制:/guard 用於 skill 級別保護、/review 的風險預算、--dangerously-skip-permissions 搭配容器。
目標不是把人類從開發中完全移除。而是把人類從做工作移到設定方向和審查結果。GStack 讓這個轉變在今天就能實現。