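feat(article): add GPT-5.4 deep-dive article and supporting diagrams
- Add the GPT-5.4 deep-dive article, covering its six core capabilities in detail
- Add an SVG overview of GPT-5.4 capabilities
- Add a Mermaid diagram of the model family
- Add a mind map of the six GPT-5.4 capabilities
- Add a Computer Use workflow diagram
- Add an OSWorld desktop-operation benchmark chart
- Add a context-window evolution chart
- Add a context-compaction diagram
- Add a Tool Search mechanism comparison diagram
- Add a configurable reasoning-depth diagram
- Add a GDPval comparison chart
- Add a three-way model comparison diagram
- Add an API pricing comparison chart
- Add the Mermaid config file and style files
- Add the model selection guide SVG image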
299
articles/003-GPT-5.4 深度解析:OpenAI 的全能战士来了.md
Normal file
@ -0,0 +1,299 @@
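# GPT-5.4 Deep Dive: OpenAI's All-Rounder Has Arrived

> Published: 2026-03-16
> Category: Technical explainer / in-depth analysis
> Author: 老邓唠AI

![GPT-5.4 Deep Dive](003/cover.png)

## Intro: This Time It's Not Just "Stronger", It Actually Gets Work Done

Late on the night of March 5, OpenAI dropped a bombshell: **GPT-5.4**.

If you think this is just another routine "higher scores, more accurate answers" upgrade, you're underestimating it. GPT-5.4 didn't just get smarter; for the first time, it **learned to operate a computer**.

Yes, you read that right. It looks at screenshots of your screen and then, like a real person, moves the mouse, clicks buttons, and types on the keyboard: booking flights, filling in forms, sending email, working in Excel. On the desktop-operation benchmark, **it scored above the human baseline**.

This isn't a concept demo. The API is already live, and any developer can call it today.

Today 老邓 takes you through a full teardown of GPT-5.4's six core capabilities, its benchmark numbers, and its pricing strategy, plus its head-to-head with Claude Opus 4.6 and Gemini 3.1 Pro.

---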

## 1. What Is GPT-5.4?

GPT-5.4 is OpenAI's latest flagship model, released on March 5, 2026. OpenAI officially positions it as **"the most powerful and efficient frontier model for professional work"**.

It isn't a single model but **a model family**:

| Version | Positioning | Available to |
|------|------|---------|
| **GPT-5.4** | Standard version for everyday professional work | ChatGPT Plus / Team / API developers |
| **GPT-5.4 Thinking** | Reasoning-enhanced version that shows its thinking process | ChatGPT Plus / Team / Pro |
| **GPT-5.4 Pro** | Highest-performance version with maximum reasoning depth | ChatGPT Pro / Enterprise / API |

All three versions share the same base model; they differ in **reasoning depth** and **compute allocation**.

![GPT-5.4 model family](003/diagram-01-model-family.png)

![GPT-5.4 capability overview](003/capability-stack.png)

---
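
## 2. The Six Core Capabilities, Unpacked

![Mind map of GPT-5.4's six capabilities](003/diagram-02-capabilities.png)

### 2.1 Native Computer Use

This is GPT-5.4's most explosive capability: it is **OpenAI's first general-purpose model with native computer-use support**.

![GPT-5.4 Computer Use](003/computer-use.png)

How it works is intuitive:

![Computer Use workflow](003/diagram-03-computer-use.png)

1. **See the screen**: the model receives a screenshot of the desktop or browser
2. **Understand the interface**: it identifies buttons, input fields, menus, and other UI elements
3. **Issue instructions**: it returns structured mouse-move, click, and keyboard-input actions
4. **Your program executes**: your code (the harness) applies those actions to the real environment

Put simply, GPT-5.4 is like **a remote assistant sitting at your computer**, watching the screen and telling you "click here, type that".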
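
The four steps above can be sketched as a small harness loop. This is a minimal illustration under stated assumptions, not the real Computer Use API: the `Action` shape, `run_harness`, and the scripted stand-in model are all hypothetical names invented here, and a real harness would call the model over the API and drive an actual desktop or browser. The point is only the shape of the loop: screenshot in, structured actions out, your code executes them, repeat.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One structured UI action (hypothetical shape, not the real API schema)."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_harness(
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes, str], List[Action]],
    execute: Callable[[Action], None],
    goal: str,
    max_steps: int = 20,
) -> int:
    """Loop screenshot -> model -> actions -> execute until the model says done.

    Returns the number of actions that were executed.
    """
    executed = 0
    for _ in range(max_steps):
        screenshot = take_screenshot()          # step 1: see the screen
        actions = ask_model(screenshot, goal)   # steps 2-3: model returns actions
        for action in actions:
            if action.kind == "done":
                return executed
            execute(action)                     # step 4: the harness applies it
            executed += 1
    return executed


def fake_model_factory() -> Callable[[bytes, str], List[Action]]:
    """Scripted stand-in for the model: click a field, type, then finish."""
    script = iter([
        [Action("click", x=120, y=240)],
        [Action("type", text="hello")],
        [Action("done")],
    ])
    return lambda screenshot, goal: next(script)


if __name__ == "__main__":
    log: List[Action] = []
    count = run_harness(lambda: b"fake-png", fake_model_factory(), log.append, "fill the form")
    print(count, [a.kind for a in log])
```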

**What can it actually do?**

- Automatically fill in complex web forms
- Run cross-application workflows (open an email → read its content → create a calendar event)
- Operate enterprise systems such as ERP and CRM
- Run automated tests against web applications

**How strong are the scores?**

| Benchmark | GPT-5.4 | GPT-5.2 | Human performance |
|---------|---------|---------|---------|
| OSWorld-Verified (desktop operation) | **75.0%** | 47.3% | 72.4% |
| WebArena-Verified (browser operation) | **67.3%** | - | - |
| Online-Mind2Web (screenshot recognition) | **92.8%** | - | - |

**OSWorld: 75.0%, versus 72.4% for humans. On desktop-operation tasks, AI has beaten the human baseline for the first time.**

![OSWorld benchmark](003/diagram-04-osworld.png)

There are limitations, of course: transferring screenshots adds latency, and accuracy on dense UI elements (such as very large tables) is still not perfect. For a v1, though, this is an impressive starting point.

---

### 2.2 Million-Token Context Window

GPT-5.4's standard context window is **272K tokens** (36% larger than GPT-5.3 Codex's 200K), and through Codex configuration you can unlock an extra-large context of up to **1 million tokens**.

What does 1 million tokens look like?

| Content type | Approximate capacity |
|---------|---------|
| Plain Chinese text | About 1.5 million characters |
| Code | About 750,000 lines |
| PDF documents | About 3,000 pages |

That means you can feed the model **an entire code repository**, **a complete technical manual**, or **several months of chat logs** in one go, and it can understand and cite all of it.

This matters especially for agents: an agent running a long chain of tasks won't crash because it "forgot what it did earlier".
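
To make the numbers in this section tangible, here is a small back-of-the-envelope sketch in Python. It only restates the article's own figures: the 36% growth from 200K to 272K, and the rough ratio of 1.5 Chinese characters per token implied by the capacity table. The function names are made up for this example, and real token counts depend on the tokenizer, so treat the estimate as a rough guide rather than a measurement.

```python
from typing import Optional

STANDARD_WINDOW = 272_000    # tokens, GPT-5.4 standard window per the article
MAX_WINDOW = 1_000_000       # tokens, unlocked via Codex configuration per the article
PREVIOUS_WINDOW = 200_000    # tokens, GPT-5.3 Codex per the article

# Assumption: ratio implied by the table (about 1.5M Chinese characters per 1M tokens).
CHARS_PER_TOKEN_ZH = 1.5


def relative_growth(new: int, old: int) -> float:
    """Fractional growth from old to new, e.g. 0.36 means 36% larger."""
    return (new - old) / old


def estimate_tokens(char_count: int, chars_per_token: float = CHARS_PER_TOKEN_ZH) -> int:
    """Rough token estimate from a character count."""
    return round(char_count / chars_per_token)


def smallest_window_that_fits(tokens: int) -> Optional[str]:
    """Return which window is needed, or None if the text exceeds even the 1M window."""
    if tokens <= STANDARD_WINDOW:
        return "standard"
    if tokens <= MAX_WINDOW:
        return "max"
    return None


if __name__ == "__main__":
    print(f"growth: {relative_growth(STANDARD_WINDOW, PREVIOUS_WINDOW):.0%}")
    manual_chars = 900_000  # hypothetical technical manual
    needed = estimate_tokens(manual_chars)
    print(needed, smallest_window_that_fits(needed))
```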
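
![Context window evolution](003/diagram-05-context-window.png)

---

### 2.3 Context Compaction

The problem with a large context is that it's **expensive**. Every request is billed for the tokens it carries, so sending 1 million tokens each time makes costs soar.

GPT-5.4 introduces a clever solution: **Compaction**. It is the first compaction capability that OpenAI has trained into a mainline model.

![Context compaction](003/diagram-06-compaction.png)

Here is how it works: during a long conversation or agent run, the model **automatically summarizes and compresses earlier context**, keeping the key information and discarding redundant detail. Even across many turns, the context window doesn't overflow.

Developers control it with two parameters:
- `model_context_window`: sets the maximum context window
- `model_auto_compact_token_limit`: sets the threshold that triggers automatic compaction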
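
The idea behind the threshold can be sketched in a few lines of Python. This is a simplified illustration, not how GPT-5.4 implements compaction internally: in the real feature the model itself produces the summary, whereas here `fake_summarize` is a stand-in that returns a fixed-size summary, and `Turn` and `maybe_compact` are hypothetical names. The per-turn token counts (1.2K, 3.5K, 2.8K, 4.1K, summary 0.6K) are taken from the article's compaction diagram, and the argument name `auto_compact_token_limit` simply echoes the parameter listed above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    label: str
    tokens: int


def total_tokens(history: List[Turn]) -> int:
    return sum(turn.tokens for turn in history)


def maybe_compact(
    history: List[Turn],
    auto_compact_token_limit: int,
    summarize: Callable[[List[Turn]], Turn],
    keep_recent: int = 1,
) -> List[Turn]:
    """Replace older turns with one summary once the history crosses the limit."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if total_tokens(history) <= auto_compact_token_limit or len(history) <= keep_recent:
        return list(history)  # under the threshold: nothing to do
    older, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(older)] + recent


def fake_summarize(turns: List[Turn]) -> Turn:
    """Stand-in summarizer; the 600-token size mirrors the article's diagram."""
    return Turn(label=f"summary of {len(turns)} turns", tokens=600)


if __name__ == "__main__":
    history = [
        Turn("turn 1", 1200),
        Turn("turn 2", 3500),
        Turn("turn 3", 2800),
        Turn("turn 4", 4100),
    ]
    compacted = maybe_compact(history, auto_compact_token_limit=10_000, summarize=fake_summarize)
    print(total_tokens(history), "->", total_tokens(compacted))
```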

---

### 2.4 Tool Search

This is a heavyweight feature for API developers.

The traditional approach stuffs every tool definition into the prompt, and the schemas for 100 tools easily eat tens of thousands of tokens. **GPT-5.4's Tool Search changes that completely.**

The new approach:
1. The model receives only a **lightweight tool list** (name plus short description)
2. When it needs a tool, it **loads that tool's full definition on demand**
3. The definition is dropped after use and takes up no tokens in later requests

The result? **Token usage drops by 47% with no change in accuracy.**

For teams building large-scale agent systems, that comes close to halving token costs.
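
Here is a rough Python sketch of the deferred-loading pattern described above. It is an illustration of the idea, not the actual Tool Search API: the registry, the three example tools, and the functions `lightweight_catalog`, `search_tools`, and `load_definition` are all hypothetical, and the keyword match stands in for whatever search the real feature performs. What it shows is the split between a cheap catalog that always travels with the request and full definitions that are fetched only when needed.

```python
from typing import Dict, List

# Hypothetical registry: full definitions live outside the prompt.
FULL_DEFINITIONS: Dict[str, dict] = {
    "create_calendar_event": {
        "description": "Create a calendar event",
        "parameters": {"title": "string", "start": "string", "end": "string"},
    },
    "send_email": {
        "description": "Send an email message",
        "parameters": {"to": "string", "subject": "string", "body": "string"},
    },
    "lookup_invoice": {
        "description": "Look up an invoice by number",
        "parameters": {"invoice_number": "string"},
    },
}


def lightweight_catalog(definitions: Dict[str, dict]) -> List[dict]:
    """Step 1: only names and short descriptions go into the request."""
    return [{"name": name, "description": d["description"]} for name, d in definitions.items()]


def search_tools(catalog: List[dict], query: str) -> List[str]:
    """Step 2a: naive keyword search over the catalog (a stand-in for real search)."""
    words = query.lower().split()
    matches = []
    for entry in catalog:
        haystack = f'{entry["name"]} {entry["description"]}'.lower()
        if any(word in haystack for word in words):
            matches.append(entry["name"])
    return matches


def load_definition(name: str) -> dict:
    """Step 2b: fetch the full definition only for the tool that was picked."""
    return FULL_DEFINITIONS[name]


if __name__ == "__main__":
    catalog = lightweight_catalog(FULL_DEFINITIONS)
    for name in search_tools(catalog, "calendar"):
        print(name, load_definition(name)["parameters"])
```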

![Tool Search mechanism](003/diagram-07-tool-search.png)

---

### 2.5 Configurable Reasoning Depth

GPT-5.4 offers **five reasoning-depth levels**, giving developers fine-grained control over how hard the model "thinks":

| Level | Use case | Cost |
|------|------|------|
| `none` | Answer directly, no reasoning | Lowest |
| `low` | Simple logic, summaries | Low |
| `medium` | General-purpose, balanced cost and quality | Medium |
| `high` | Multi-step analysis, self-correction | High |
| `xhigh` | Maximum reasoning, research-grade | Highest |

Use different levels for different scenarios: simple questions don't waste compute, and complex ones get full effort. It's a very practical way to optimize cost.
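
One way to apply this in practice is a small routing table in your own code. The sketch below is an assumption-laden example: the five level names come from the table above, but the task categories (borrowed from the article's reasoning-depth diagram), the `ROUTES` mapping, and `pick_effort` are hypothetical, and how the chosen level is actually passed to the API is not covered in this article. Unknown categories fall back to `medium` as a cautious default.

```python
from enum import Enum


class Effort(str, Enum):
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    XHIGH = "xhigh"


# Hypothetical task categories mapped to the five levels from the table.
ROUTES = {
    "simple_lookup": Effort.NONE,
    "everyday_question": Effort.LOW,
    "general_task": Effort.MEDIUM,
    "complex_analysis": Effort.HIGH,
    "research_or_finance": Effort.XHIGH,
}


def pick_effort(task_category: str) -> Effort:
    """Choose a reasoning level for a task; unknown categories get the balanced default."""
    return ROUTES.get(task_category, Effort.MEDIUM)


if __name__ == "__main__":
    for category in ("simple_lookup", "complex_analysis", "something_else"):
        print(category, "->", pick_effort(category).value)
```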

![Configurable reasoning depth](003/diagram-08-reasoning.png)

---

### 2.6 A Major Upgrade in Coding

GPT-5.4 folds in the coding capabilities of GPT-5.3 Codex and posts striking results on coding tasks:

| Benchmark | GPT-5.4 | GPT-5.3 Codex | Claude Opus 4.6 |
|---------|---------|---------------|-----------------|
| SWE-Bench Verified | **~80.0%** | 75.2% | 80.8% |
| HumanEval | **95.1%** | 93.8% | 94.6% |
| Terminal-Bench 2.0 | **75.1%** | - | 65.4% |
| SWE-Bench Pro | **57.7%** | - | - |

On SWE-Bench Verified (fixing real GitHub issues), GPT-5.4 trails Claude Opus 4.6 by only about 0.8 percentage points, a near tie. On Terminal-Bench 2.0 (terminal operation), GPT-5.4's 75.1% puts it well ahead, nearly 10 points above Claude Opus 4.6's 65.4%.

---

## 3. Professional Knowledge Work: Closing In on Human Experts

The GPT-5.4 numbers that shook the industry most come from the **GDPval benchmark**, which spans 44 occupations and measures how models perform on "work with real economic value".

| Metric | GPT-5.4 | GPT-5.2 | Change |
|------|---------|---------|------|
| GDPval overall | **83.0%** | 70.9% | +12.1 pp |
| Investment-banking spreadsheet modeling | **87.3%** | 68.4% | +18.9 pp |
| Presentation preference rate | **68.0%** | 32.0% | - |
| Error-rate reduction, individual claims | **-33%** | - | - |
| Error-rate reduction, full responses | **-18%** | - | - |

**What does an 83% GDPval score mean?** Across 44 occupations, the quality of GPT-5.4's work output is already **close to the average level of industry practitioners**. Investment-banking modeling hits 87.3%, nearly 19 percentage points above GPT-5.2. That's not a tweak; it's a qualitative leap.

![GPT-5.4 vs GPT-5.2 key metrics](003/diagram-09-gdpval.png)

![GPT-5.4 vs GPT-5.2 key metric gains](003/gdpval-chart.png)
> Light = GPT-5.2, dark = GPT-5.4

---

## 4. Three-Way Showdown: GPT-5.4 vs Claude Opus 4.6 vs Gemini 3.1 Pro

In March 2026, the flagship models of the three AI giants are, unusually, competing on the same stage at the same time. 老邓 has pulled together a full comparison table for you:

### 4.1 Benchmark Comparison

| Benchmark | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro | Winner |
|---------|---------|-----------------|----------------|--------|
| GDPval (knowledge work) | **83.0%** | 78.0% | - | GPT-5.4 |
| GPQA Diamond (scientific reasoning) | 92.8% | 91.3% | **94.3%** | Gemini |
| ARC-AGI-2 (abstract reasoning) | 73.3% | 75.2% | **77.1%** | Gemini |
| MMMU Pro (visual understanding) | 81.2% | **85.1%** | 80.5% | Claude |
| SWE-Bench Verified (code fixes) | ~80.0% | **80.8%** | 80.6% | Claude (narrowly) |
| Terminal-Bench 2.0 (terminal operation) | **75.1%** | 65.4% | 68.5% | GPT-5.4 |
| OSWorld (desktop control) | **75.0%** | - | - | GPT-5.4 |
| BrowseComp (web browsing) | 82.7% | 84.0% | **85.9%** | Gemini |

### 4.2 Pricing Comparison

| Model | Input price (per million tokens) | Output price (per million tokens) | Context window |
|------|----------------------|----------------------|-----------|
| Gemini 3.1 Pro | **$2** | **$12** | 2M |
| GPT-5.4 | $2.50 | $15 | 272K (up to 1M) |
| Claude Opus 4.6 | $5 | $25 | 200K |
| GPT-5.4 Pro | $30 | $180 | 272K (up to 1M) |

### 4.3 Each Model's Strengths at a Glance

![Three-way model comparison](003/diagram-10-comparison.png)

### 4.4 How to Choose?

**In one sentence: there is no all-around champion, only a king of each scenario.**

- **Choose GPT-5.4** if you need **desktop automation, knowledge work, or tool orchestration**: it is the only model whose Computer Use beats the human baseline
- **Choose Claude Opus 4.6** if your core scenario is **code development, multi-file refactoring, or visual understanding**: it still leads on SWE-Bench and MMMU Pro
- **Choose Gemini 3.1 Pro** if you are **on a limited budget but need high-quality reasoning**: at one-fifteenth the price of GPT-5.4 Pro, it reaches the same tier of scientific reasoning

---

## 5. Pricing and Availability

### 5.1 API Pricing

| Model | Input | Output | Notes |
|------|------|------|------|
| GPT-5.4 | $2.50/M | $15/M | Standard tier |
| GPT-5.4 Pro | $30/M | $180/M | Maximum performance |
| Batch mode | 50% of standard | 50% of standard | Asynchronous batch processing |
| Flex mode | 50% of standard | 50% of standard | Flexible pricing |
| Priority mode | 200% of standard | 200% of standard | Priority response |
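
The table translates directly into a small cost calculator. The sketch below uses only the prices and multipliers listed above; the function name `request_cost` and the mode keys are made up for this example, and it assumes the mode multiplier applies equally to input and output tokens, which is how the table reads.

```python
# Prices in USD per million tokens, as listed in the table: (input, output).
PRICES_PER_MILLION = {
    "gpt-5.4": (2.50, 15.0),
    "gpt-5.4-pro": (30.0, 180.0),
}

# Multipliers relative to the standard price, as listed in the table.
MODE_MULTIPLIER = {
    "standard": 1.0,
    "batch": 0.5,
    "flex": 0.5,
    "priority": 2.0,
}


def request_cost(model: str, input_tokens: int, output_tokens: int, mode: str = "standard") -> float:
    """Estimated cost in USD for one request under the given pricing mode."""
    input_price, output_price = PRICES_PER_MILLION[model]
    base = input_tokens / 1_000_000 * input_price + output_tokens / 1_000_000 * output_price
    return base * MODE_MULTIPLIER[mode]


if __name__ == "__main__":
    # Hypothetical workload: 200K input tokens and 20K output tokens per request.
    for mode in ("standard", "batch", "priority"):
        print(mode, round(request_cost("gpt-5.4", 200_000, 20_000, mode), 4))
```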

### 5.2 Who Can Use It?

| Channel | Available versions |
|------|---------|
| ChatGPT Plus / Team | GPT-5.4 Thinking |
| ChatGPT Pro / Enterprise | GPT-5.4 Thinking + GPT-5.4 Pro |
| API | gpt-5.4, gpt-5.4-pro |

GPT-5.2 Thinking will remain available until **June 5, 2026**, after which it will be retired. If you're still on the older version, remember to migrate ahead of time.

![API pricing comparison](003/diagram-11-pricing.png)

![Model selection guide](003/model-selection-map.png)

---
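
## 6. 老邓's Take

A few blunt truths.

**GPT-5.4's biggest significance isn't the benchmarks; it's Computer Use.**

On benchmarks, GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro are only 2-3 percentage points apart on most evaluations, and honestly, ordinary users will barely feel the difference. What really sets the models apart is **the expansion into new capability dimensions**.

Computer Use lets AI genuinely "use a computer" for the first time. This isn't a gimmick; it's a paradigm shift for productivity tools. Imagine:

- Finance staff having AI operate the SAP system to produce reports
- Operations staff having AI bulk-list products in the admin console
- HR having AI post job openings across multiple recruiting platforms

These scenarios used to require RPA (robotic process automation) tools and a pile of brittle rule scripts. Now? Give GPT-5.4 a screenshot and it watches the screen and does the work itself.

Of course, the v1 has clear limitations: latency, accuracy, and safety boundaries all need polishing. But the direction is right, and OpenAI has taken the first-mover advantage in this round.

**Another underrated feature is Tool Search.** A 47% token saving is a huge cost optimization for large-scale agent systems, and the design idea is worth studying for every team building AI applications.

**Finally, price.** Gemini 3.1 Pro reaches the same tier of reasoning at one-fifteenth the price of GPT-5.4 Pro, so Google really is competing hardest on value. But OpenAI's half-price Batch and Flex modes are attractive too, and for asynchronous workloads they can push costs very low.

In short, the 2026 AI model market is no longer about "who is strongest" but about **"who fits your scenario best"**.

---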

## References

- [Introducing GPT-5.4 | OpenAI](https://openai.com/index/introducing-gpt-5-4/)
- [OpenAI launches GPT-5.4 with Pro and Thinking versions | TechCrunch](https://techcrunch.com/2026/03/05/openai-launches-gpt-5-4-with-pro-and-thinking-versions/)
- [GPT-5.4: Native Computer Use, 1M Context Window, Tool Search | DataCamp](https://www.datacamp.com/blog/gpt-5-4)
- [GPT-5.4 vs Opus 4.6 vs Gemini 3.1 Pro: Best AI Model? | DigitalApplied](https://www.digitalapplied.com/blog/gpt-5-4-vs-opus-4-6-vs-gemini-3-1-pro-best-frontier-model)
- [GPT-5.4 Release Date, Features & Pricing | NxCode](https://www.nxcode.io/resources/news/gpt-5-4-release-date-features-pricing-2026)
- [OpenAI GPT-5.4 正式登场 | IT之家](https://www.ithome.com/0/926/344.htm)
- [Computer Use API | OpenAI](https://developers.openai.com/api/docs/guides/tools-computer-use/)
- [GPT-5.4 API Developer Guide | NxCode](https://www.nxcode.io/resources/news/gpt-5-4-api-developer-guide-reasoning-computer-use-2026)
BIN
articles/003/capability-stack.png
Normal file
Size: 579 KiB
54
articles/003/capability-stack.svg
Normal file
@ -0,0 +1,54 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 700" font-family="'Inter', 'SF Pro', system-ui, sans-serif">
  <title>GPT-5.4 capability overview: professional work, coding, computer use, tool use, and long context</title>
  <defs>
    <linearGradient id="bg" x1="0%" y1="0%" x2="100%" y2="100%">
      <stop offset="0%" stop-color="#040816"/>
      <stop offset="100%" stop-color="#111827"/>
    </linearGradient>
  </defs>

  <rect width="1200" height="700" fill="url(#bg)"/>

  <text x="600" y="56" fill="#F8FAFC" font-size="32" font-weight="800" text-anchor="middle">GPT-5.4 Capability Overview</text>
  <text x="600" y="84" fill="#94A3B8" font-size="15" text-anchor="middle">Not a routine upgrade, but a reset of the model product line</text>

  <g transform="translate(90,130)">
    <rect width="1020" height="90" rx="16" fill="#0B172A" stroke="#22D3EE" stroke-opacity="0.45"/>
    <text x="28" y="38" fill="#67E8F9" font-size="20" font-weight="800">1. Professional knowledge work</text>
    <text x="28" y="64" fill="#E0F2FE" font-size="14">GDPval 83.0%, investment-banking modeling 87.3%; presentations, documents, and factual accuracy all far ahead of GPT-5.2.</text>
    <text x="946" y="56" fill="#67E8F9" font-size="28" font-weight="800" text-anchor="end">83.0%</text>
  </g>

  <g transform="translate(130,250)">
    <rect width="940" height="82" rx="16" fill="#0B1A16" stroke="#10B981" stroke-opacity="0.45"/>
    <text x="28" y="35" fill="#6EE7B7" font-size="19" font-weight="800">2. Coding and agent loops</text>
    <text x="28" y="60" fill="#D1FAE5" font-size="14">SWE-Bench Pro 57.7%, on par with GPT-5.3-Codex, with broader coverage in research and tool use.</text>
    <text x="866" y="53" fill="#6EE7B7" font-size="26" font-weight="800" text-anchor="end">57.7%</text>
  </g>

  <g transform="translate(170,360)">
    <rect width="860" height="82" rx="16" fill="#1A1205" stroke="#F59E0B" stroke-opacity="0.45"/>
    <text x="28" y="35" fill="#FCD34D" font-size="19" font-weight="800">3. Native computer use</text>
    <text x="28" y="60" fill="#FEF3C7" font-size="14">OSWorld-Verified 75.0%; screenshot-based browser and desktop control via the Computer Use API.</text>
    <text x="786" y="53" fill="#FCD34D" font-size="26" font-weight="800" text-anchor="end">75.0%</text>
  </g>

  <g transform="translate(210,470)">
    <rect width="780" height="82" rx="16" fill="#111827" stroke="#818CF8" stroke-opacity="0.45"/>
    <text x="28" y="35" fill="#C7D2FE" font-size="19" font-weight="800">4. Tool use and MCP</text>
    <text x="28" y="60" fill="#E0E7FF" font-size="14">BrowseComp 82.7%, MCP Atlas 67.2%; Tool Search for large deferred tool catalogs.</text>
    <text x="706" y="53" fill="#C7D2FE" font-size="26" font-weight="800" text-anchor="end">82.7%</text>
  </g>

  <g transform="translate(250,580)">
    <rect width="700" height="82" rx="16" fill="#1A1020" stroke="#F472B6" stroke-opacity="0.45"/>
    <text x="28" y="35" fill="#F9A8D4" font-size="19" font-weight="800">5. Long context and reasoning</text>
    <text x="28" y="60" fill="#FBCFE8" font-size="14">1.05M-token context, 128K-token output; far-end retrieval still degrades.</text>
    <text x="626" y="53" fill="#F9A8D4" font-size="26" font-weight="800" text-anchor="end">1.05M</text>
  </g>

  <path d="M600 220 L600 250" stroke="#22D3EE" stroke-width="2" stroke-dasharray="6 6" opacity="0.5"/>
  <path d="M600 332 L600 360" stroke="#10B981" stroke-width="2" stroke-dasharray="6 6" opacity="0.5"/>
  <path d="M600 442 L600 470" stroke="#F59E0B" stroke-width="2" stroke-dasharray="6 6" opacity="0.5"/>
  <path d="M600 552 L600 580" stroke="#818CF8" stroke-width="2" stroke-dasharray="6 6" opacity="0.5"/>
</svg>
Size: 3.6 KiB
BIN
articles/003/cover-compressed.png
Normal file
Size: 1.2 MiB
BIN
articles/003/cover.png
Normal file
Size: 3.9 MiB
18
articles/003/diagram-01-model-family.mmd
Normal file
@ -0,0 +1,18 @@
graph TD
    A["GPT-5.4 base model"] --> B["GPT-5.4 Standard"]
    A --> C["GPT-5.4 Thinking"]
    A --> D["GPT-5.4 Pro"]

    B --> B1["Everyday professional work"]
    B --> B2["API calls"]

    C --> C1["Shows its reasoning process"]
    C --> C2["Complex problem solving"]

    D --> D1["Maximum reasoning depth"]
    D --> D2["Research/finance-grade tasks"]

    style A fill:#10a37f,stroke:#fff,color:#fff
    style B fill:#1a7f64,stroke:#fff,color:#fff
    style C fill:#1a7f64,stroke:#fff,color:#fff
    style D fill:#1a7f64,stroke:#fff,color:#fff
BIN
articles/003/diagram-01-model-family.png
Normal file
Size: 114 KiB
21
articles/003/diagram-02-capabilities.mmd
Normal file
@ -0,0 +1,21 @@
mindmap
  root((GPT-5.4<br/>Six capabilities))
    Native computer use
      Screenshot understanding
      Mouse and keyboard actions
      Cross-application workflows
    Million-token context
      Standard 272K
      Up to 1M tokens
    Context compaction
      Automatic summarization
      Keeps key information
    Tool Search
      On-demand tool loading
      47% fewer tokens
    Configurable reasoning
      5 depth levels
      Fine-grained cost control
    Coding upgrade
      SWE-Bench ~80%
      Terminal-Bench 75.1%
BIN
articles/003/diagram-02-capabilities.png
Normal file
Size: 245 KiB
10
articles/003/diagram-03-computer-use.mmd
Normal file
@ -0,0 +1,10 @@
graph LR
    A["🖥️ Capture the screen"] --> B["🧠 GPT-5.4<br/>understands the UI"]
    B --> C["📋 Returns action instructions<br/>click/type/scroll"]
    C --> D["⚙️ Your program<br/>executes the actions"]
    D --> A

    style A fill:#f9f,stroke:#333
    style B fill:#10a37f,stroke:#fff,color:#fff
    style C fill:#bbf,stroke:#333
    style D fill:#fbb,stroke:#333
BIN
articles/003/diagram-03-computer-use.png
Normal file
Size: 60 KiB
5
articles/003/diagram-04-osworld.mmd
Normal file
@ -0,0 +1,5 @@
xychart-beta
    title "OSWorld-Verified desktop-operation benchmark"
    x-axis ["GPT-5.2", "Human performance", "GPT-5.4"]
    y-axis "Success rate (%)" 0 --> 100
    bar [47.3, 72.4, 75.0]
BIN
articles/003/diagram-04-osworld.png
Normal file
Size: 19 KiB
5
articles/003/diagram-05-context-window.mmd
Normal file
@ -0,0 +1,5 @@
xychart-beta
    title "Context window evolution (unit: K tokens)"
    x-axis ["GPT-5.2", "GPT-5.3 Codex", "GPT-5.4 standard", "GPT-5.4 max"]
    y-axis "K tokens" 0 --> 1100
    bar [128, 200, 272, 1000]
BIN
articles/003/diagram-05-context-window.png
Normal file
Size: 21 KiB
17
articles/003/diagram-06-compaction.mmd
Normal file
@ -0,0 +1,17 @@
graph LR
    subgraph 压缩前["❌ Before compaction: full load"]
        A1["Turn 1<br/>1.2K tokens"] --> A2["Turn 2<br/>3.5K tokens"]
        A2 --> A3["Turn 3<br/>2.8K tokens"]
        A3 --> A4["Turn 4<br/>4.1K tokens"]
        A4 --> A5["...<br/>keeps growing 💥"]
    end

    A5 --> C["🗜️ Compaction<br/>automatic compression"]

    subgraph 压缩后["✅ After compaction: smart summary"]
        C --> B1["Summary<br/>0.6K tokens"]
        B1 --> B2["Turn 4<br/>4.1K tokens"]
        B2 --> B3["New turn<br/>conversation continues ✅"]
    end

    style C fill:#10a37f,stroke:#fff,color:#fff
BIN
articles/003/diagram-06-compaction.png
Normal file
Size: 118 KiB
15
articles/003/diagram-07-tool-search.mmd
Normal file
@ -0,0 +1,15 @@
graph TB
    subgraph 传统方式["❌ Traditional approach: load everything"]
        T1["Request starts"] --> T2["Load all 100 tool definitions<br/>~40K tokens"]
        T2 --> T3["Model picks 1 tool"]
        T3 --> T4["Definitions for 99 tools are wasted"]
    end

    subgraph 新方式["✅ Tool Search: load on demand"]
        N1["Request starts"] --> N2["Load lightweight tool list<br/>~2K tokens"]
        N2 --> N3["Model searches for the tool it needs"]
        N3 --> N4["Load 1 tool definition on demand<br/>~400 tokens"]
    end

    style T4 fill:#c62828,stroke:#fff,color:#fff
    style N4 fill:#10a37f,stroke:#fff,color:#fff
BIN
articles/003/diagram-07-tool-search.png
Normal file
Size: 141 KiB
13
articles/003/diagram-08-reasoning.mmd
Normal file
@ -0,0 +1,13 @@
graph LR
    A["User request"] --> B{"Assess task complexity"}
    B -->|"Simple lookup"| C["none<br/>⚡ fastest · cheapest"]
    B -->|"Everyday questions"| D["low<br/>💬 light reasoning"]
    B -->|"General tasks"| E["medium<br/>⚖️ balanced"]
    B -->|"Complex analysis"| F["high<br/>🔬 deep thinking"]
    B -->|"Research/finance"| G["xhigh<br/>🧠 maximum reasoning"]

    style C fill:#4caf50,stroke:#fff,color:#fff
    style D fill:#8bc34a,stroke:#fff,color:#fff
    style E fill:#ff9800,stroke:#fff,color:#fff
    style F fill:#f44336,stroke:#fff,color:#fff
    style G fill:#9c27b0,stroke:#fff,color:#fff
BIN
articles/003/diagram-08-reasoning.png
Normal file
Size: 138 KiB
6
articles/003/diagram-09-gdpval.mmd
Normal file
@ -0,0 +1,6 @@
xychart-beta
    title "GPT-5.4 vs GPT-5.2: gains on key metrics"
    x-axis ["GDPval overall", "IB modeling", "OSWorld desktop", "BrowseComp web"]
    y-axis "Score (%)" 0 --> 100
    bar [70.9, 68.4, 47.3, 65.7]
    bar [83.0, 87.3, 75.0, 82.7]
BIN
articles/003/diagram-09-gdpval.png
Normal file
Size: 21 KiB
22
articles/003/diagram-10-comparison.mmd
Normal file
@ -0,0 +1,22 @@
graph TB
    subgraph GPT54["🟢 GPT-5.4 leads"]
        G1["GDPval knowledge work 83.0%"]
        G2["OSWorld desktop control 75.0%"]
        G3["Terminal-Bench terminal 75.1%"]
    end

    subgraph Claude["🔵 Claude Opus 4.6 leads"]
        C1["SWE-Bench code fixes 80.8%"]
        C2["MMMU Pro visual understanding 85.1%"]
        C3["Multi-file refactoring: best experience"]
    end

    subgraph Gemini["🟡 Gemini 3.1 Pro leads"]
        GE1["GPQA Diamond scientific reasoning 94.3%"]
        GE2["ARC-AGI-2 abstract reasoning 77.1%"]
        GE3["BrowseComp web browsing 85.9%"]
    end

    style GPT54 fill:#0d2137,stroke:#10a37f,color:#e0f7fa
    style Claude fill:#0d2137,stroke:#d97706,color:#e0f7fa
    style Gemini fill:#0d2137,stroke:#4285f4,color:#e0f7fa
BIN
articles/003/diagram-10-comparison.png
Normal file
Size: 189 KiB
5
articles/003/diagram-11-pricing.mmd
Normal file
@ -0,0 +1,5 @@
xychart-beta
    title "API output price comparison ($ per million tokens)"
    x-axis ["Gemini 3.1 Pro", "GPT-5.4", "Claude Opus 4.6", "GPT-5.4 Pro"]
    y-axis "Price ($)" 0 --> 200
    bar [12, 15, 25, 180]
BIN
articles/003/diagram-11-pricing.png
Normal file
Size: 20 KiB
BIN
articles/003/gdpval-chart.png
Normal file
Size: 26 KiB
37
articles/003/mermaid-config.json
Normal file
@ -0,0 +1,37 @@
{
  "theme": "base",
  "themeVariables": {
    "primaryColor": "#0d2137",
    "primaryTextColor": "#e0f7fa",
    "primaryBorderColor": "#00e5ff",
    "lineColor": "#00b8d4",
    "secondaryColor": "#0a1628",
    "secondaryTextColor": "#b2ebf2",
    "secondaryBorderColor": "#00bcd4",
    "tertiaryColor": "#112240",
    "tertiaryTextColor": "#e0f7fa",
    "tertiaryBorderColor": "#26c6da",
    "noteBkgColor": "#0d2137",
    "noteTextColor": "#e0f7fa",
    "noteBorderColor": "#00e5ff",
    "edgeLabelBackground": "#0a1628",
    "clusterBkg": "#0a1a2e",
    "clusterBorder": "#1a5276",
    "titleColor": "#00e5ff",
    "actorBkg": "#0d2137",
    "actorBorder": "#00e5ff",
    "actorTextColor": "#e0f7fa",
    "actorLineColor": "#00b8d4",
    "signalColor": "#00e5ff",
    "signalTextColor": "#e0f7fa",
    "labelBoxBkgColor": "#0d2137",
    "labelBoxBorderColor": "#00e5ff",
    "labelTextColor": "#e0f7fa",
    "loopTextColor": "#80deea",
    "activationBorderColor": "#00e5ff",
    "activationBkgColor": "#112240",
    "sequenceNumberColor": "#00e5ff",
    "fontFamily": "SF Pro Display, -apple-system, BlinkMacSystemFont, Segoe UI, Helvetica Neue, Arial, sans-serif",
    "fontSize": "15px"
  }
}
7
articles/003/mermaid-fix.css
Normal file
@ -0,0 +1,7 @@
.mindmap-node text, .mindmap-node tspan,
.node text, .node tspan,
.label text, .label tspan,
text, tspan {
  fill: #ffffff !important;
  color: #ffffff !important;
}
54
articles/003/mermaid-mindmap.css
Normal file
@ -0,0 +1,54 @@
/* Mindmap sci-fi style - force visible node backgrounds */

text, tspan {
  fill: #e0f7fa !important;
  font-family: 'SF Pro Display', -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Helvetica Neue', Arial, sans-serif !important;
}

/* Root node */
.mindmap-node:first-of-type circle {
  fill: #0d2137 !important;
  stroke: #00e5ff !important;
  stroke-width: 3px !important;
  filter: drop-shadow(0 0 12px rgba(0, 229, 255, 0.6)) !important;
}

/* All mindmap section/node shapes */
.mindmap-node rect,
.mindmap-node polygon,
.mindmap-node circle,
.mindmap-node ellipse,
.mindmap-node path {
  stroke: #00b8d4 !important;
  stroke-width: 2px !important;
  filter: drop-shadow(0 0 6px rgba(0, 229, 255, 0.35)) !important;
}

/* Force different section colors instead of black */
.section-0 rect, .section-0 path { fill: #0d3b66 !important; stroke: #00e5ff !important; }
.section-1 rect, .section-1 path { fill: #1a3a4a !important; stroke: #26c6da !important; }
.section-2 rect, .section-2 path { fill: #1b3044 !important; stroke: #4dd0e1 !important; }
.section-3 rect, .section-3 path { fill: #14293d !important; stroke: #00bcd4 !important; }
.section-4 rect, .section-4 path { fill: #0f2b3d !important; stroke: #80deea !important; }
.section-5 rect, .section-5 path { fill: #0d3352 !important; stroke: #4fc3f7 !important; }
.section-6 rect, .section-6 path { fill: #102a40 !important; stroke: #29b6f6 !important; }
.section-7 rect, .section-7 path { fill: #0e2d44 !important; stroke: #81d4fa !important; }
.section-8 rect, .section-8 path { fill: #113148 !important; stroke: #b3e5fc !important; }

/* Generic fallback: any rect/path that ends up black */
rect[fill="#000"], rect[fill="#000000"], rect[fill="black"],
path[fill="#000"], path[fill="#000000"], path[fill="black"] {
  fill: #0d3b66 !important;
}

/* Catch-all: any element with inline black-ish fill */
[style*="fill: rgb(0, 0, 0)"], [style*="fill:rgb(0,0,0)"],
[style*="fill:#000"], [style*="fill: #000"] {
  fill: #0d3b66 !important;
}

/* Lines between nodes */
line, path.edge {
  stroke: #00b8d4 !important;
  stroke-width: 2px !important;
}
115
articles/003/mermaid-tech.css
Normal file
@ -0,0 +1,115 @@
/* Sci-fi / Tech style for mermaid diagrams */

/* Node styling */
.node rect, .node polygon, .node circle, .node ellipse {
  stroke-width: 2px !important;
  filter: drop-shadow(0 0 6px rgba(0, 229, 255, 0.4)) !important;
  rx: 8 !important;
  ry: 8 !important;
}

/* All text white/cyan */
text, tspan {
  fill: #e0f7fa !important;
  font-family: 'SF Pro Display', -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Helvetica Neue', Arial, sans-serif !important;
}

/* Cluster/subgraph borders */
.cluster rect {
  stroke: #1a6a8a !important;
  stroke-width: 2px !important;
  stroke-dasharray: 6 3 !important;
  fill: rgba(10, 26, 46, 0.7) !important;
  rx: 12 !important;
  ry: 12 !important;
  filter: drop-shadow(0 0 8px rgba(0, 184, 212, 0.2)) !important;
}

/* Cluster labels */
.cluster text, .cluster tspan {
  fill: #4dd0e1 !important;
  font-weight: 600 !important;
  font-size: 14px !important;
}

/* Edge/arrow lines */
.edge-pattern-solid, .flowchart-link {
  stroke: #00b8d4 !important;
  stroke-width: 2px !important;
}

/* Arrow markers */
marker path {
  fill: #00e5ff !important;
}

/* Sequence diagram lines */
.messageLine0, .messageLine1 {
  stroke: #00b8d4 !important;
  stroke-width: 2px !important;
}

/* Sequence diagram actors */
.actor {
  stroke: #00e5ff !important;
  fill: #0d2137 !important;
  stroke-width: 2px !important;
  filter: drop-shadow(0 0 6px rgba(0, 229, 255, 0.3)) !important;
}

/* Labels on edges */
.edgeLabel rect {
  fill: #0a1628 !important;
  opacity: 0.9 !important;
}

.edgeLabel span, .edgeLabel text, .edgeLabel tspan {
  fill: #80deea !important;
  color: #80deea !important;
  font-size: 12px !important;
}

/* Note boxes */
.note {
  fill: #112240 !important;
  stroke: #00e5ff !important;
}

/* Mindmap specific */
.mindmap-node rect, .mindmap-node circle, .mindmap-node polygon {
  filter: drop-shadow(0 0 6px rgba(0, 229, 255, 0.4)) !important;
}

.mindmap-node text, .mindmap-node tspan {
  fill: #e0f7fa !important;
}

/* Activation bars in sequence diagrams */
.activation0, .activation1, .activation2 {
  fill: #112240 !important;
  stroke: #00e5ff !important;
}

/* Loop/alt boxes */
.loopLine {
  stroke: #1a6a8a !important;
  stroke-dasharray: 4 3 !important;
}

.loopText tspan, .loopText text {
  fill: #4dd0e1 !important;
}

/* Label styling */
.label text, .label tspan {
  fill: #e0f7fa !important;
}

/* Highlighted nodes with custom styles from mermaid */
[style*="fill:#4A90D9"], [style*="fill:#E74C3C"], [style*="fill:#27AE60"],
[style*="fill:#F39C12"], [style*="fill:#8E44AD"], [style*="fill:#9B59B6"],
[style*="fill:#2ECC71"], [style*="fill:#3498DB"], [style*="fill:#E67E22"] {
  filter: drop-shadow(0 0 10px rgba(0, 229, 255, 0.6)) !important;
  stroke: #00e5ff !important;
  stroke-width: 2px !important;
}
BIN
articles/003/model-selection-map.png
Normal file
Size: 747 KiB
51
articles/003/model-selection-map.svg
Normal file
@ -0,0 +1,51 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 720" font-family="'Inter', 'SF Pro', system-ui, sans-serif">
  <title>GPT-5.4 model selection guide: GPT-5.4, GPT-5.4 Pro, GPT-5.3-Codex, and GPT-5.2 compared</title>
  <defs>
    <linearGradient id="bg" x1="0%" y1="0%" x2="100%" y2="100%">
      <stop offset="0%" stop-color="#040816"/>
      <stop offset="100%" stop-color="#111827"/>
    </linearGradient>
  </defs>

  <rect width="1200" height="720" fill="url(#bg)"/>

  <text x="600" y="54" fill="#F8FAFC" font-size="30" font-weight="800" text-anchor="middle">Which OpenAI model should you pick?</text>
  <text x="600" y="82" fill="#94A3B8" font-size="15" text-anchor="middle">X axis: workflow breadth, Y axis: cost and latency</text>

  <line x1="130" y1="610" x2="1070" y2="610" stroke="#334155" stroke-width="2"/>
  <line x1="130" y1="610" x2="130" y2="130" stroke="#334155" stroke-width="2"/>
  <!-- The Y-axis labels are rotated -90 degrees, so "→" renders pointing up and "←" pointing down. -->
  <text x="1070" y="640" fill="#94A3B8" font-size="13" text-anchor="end">Broader mixed workflows →</text>
  <text x="130" y="640" fill="#94A3B8" font-size="13">← Narrow specialist workflows</text>
  <text x="76" y="150" fill="#94A3B8" font-size="13" transform="rotate(-90 76 150)">Costlier / slower →</text>
  <text x="102" y="610" fill="#94A3B8" font-size="13" transform="rotate(-90 102 610)">← Cheaper / faster</text>

  <rect x="500" y="270" width="260" height="150" rx="22" fill="#082032" stroke="#22D3EE" stroke-width="2.2"/>
  <text x="630" y="322" fill="#67E8F9" font-size="26" font-weight="800" text-anchor="middle">GPT-5.4</text>
  <text x="630" y="354" fill="#E0F2FE" font-size="15" text-anchor="middle">Best all-round default</text>
  <text x="630" y="378" fill="#E0F2FE" font-size="15" text-anchor="middle">Research + code + tools + browser</text>
  <text x="630" y="403" fill="#A5F3FC" font-size="13" text-anchor="middle">$2.50 input / $15 output</text>

  <rect x="800" y="150" width="250" height="145" rx="22" fill="#1E1025" stroke="#F472B6" stroke-width="2.2"/>
  <text x="925" y="202" fill="#F9A8D4" font-size="25" font-weight="800" text-anchor="middle">GPT-5.4 Pro</text>
  <text x="925" y="234" fill="#FCE7F3" font-size="15" text-anchor="middle">Highest performance ceiling</text>
  <text x="925" y="258" fill="#FCE7F3" font-size="15" text-anchor="middle">Use when accuracy beats latency</text>
  <text x="925" y="283" fill="#FBCFE8" font-size="13" text-anchor="middle">$30 input / $180 output</text>

  <rect x="190" y="310" width="250" height="145" rx="22" fill="#0C1A14" stroke="#10B981" stroke-width="2.2"/>
  <text x="315" y="362" fill="#6EE7B7" font-size="25" font-weight="800" text-anchor="middle">GPT-5.3-Codex</text>
  <text x="315" y="394" fill="#D1FAE5" font-size="15" text-anchor="middle">Coding specialist</text>
  <text x="315" y="418" fill="#D1FAE5" font-size="15" text-anchor="middle">Strong at terminal-heavy loops</text>
  <text x="315" y="443" fill="#BBF7D0" font-size="13" text-anchor="middle">$1.75 input / $14 output</text>

  <rect x="240" y="490" width="220" height="110" rx="18" fill="#151A22" stroke="#94A3B8" stroke-width="1.8"/>
  <text x="350" y="530" fill="#CBD5E1" font-size="23" font-weight="800" text-anchor="middle">GPT-5.2</text>
  <text x="350" y="556" fill="#CBD5E1" font-size="14" text-anchor="middle">Previous-gen reference</text>
  <text x="350" y="580" fill="#94A3B8" font-size="12" text-anchor="middle">Temporary migration baseline</text>

  <path d="M440 380 C470 360, 490 340, 500 330" fill="none" stroke="#10B981" stroke-width="2" stroke-dasharray="7 7" opacity="0.55"/>
  <path d="M760 310 C790 280, 820 250, 800 223" fill="none" stroke="#F472B6" stroke-width="2" stroke-dasharray="7 7" opacity="0.55"/>
  <path d="M460 540 C500 520, 530 500, 560 455" fill="none" stroke="#94A3B8" stroke-width="2" stroke-dasharray="7 7" opacity="0.45"/>

  <text x="635" y="675" fill="#64748B" font-size="12" text-anchor="middle">Guidance: pick GPT-5.4 for general work, GPT-5.4 Pro for maximum accuracy, and GPT-5.3-Codex for pure coding.</text>
</svg>
Size: 3.9 KiB
BIN
articles/003/openai-gpt54.png
Normal file
Size: 1013 KiB
227
scripts/render_mermaid.js
Normal file
@ -0,0 +1,227 @@
/**
 * Render a Mermaid .mmd file to a high-quality PNG with Playwright.
 * Usage: node render_mermaid.js <input.mmd> <output.png> [width]
 */

const { chromium } = require('playwright');
const fs = require('fs');
const path = require('path');

const mmdFile = process.argv[2];
const outFile = process.argv[3];
const width = parseInt(process.argv[4] || '2400', 10);

if (!mmdFile || !outFile) {
  console.error('Usage: node render_mermaid.js <input.mmd> <output.png> [width]');
  process.exit(1);
}

const mmdContent = fs.readFileSync(mmdFile, 'utf-8');

// Detect the diagram type (informational only; nothing in this script branches on these flags)
const isMindmap = mmdContent.trim().startsWith('mindmap');
const isXYChart = mmdContent.includes('xychart-beta');

const html = `<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<script src="https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.min.js"></script>
<style>
  body {
    margin: 0;
    padding: 40px;
    background: #080c18;
    display: flex;
    justify-content: center;
    align-items: center;
    min-height: 100vh;
  }
  #diagram {
    display: inline-block;
  }

  /* ===== Global text ===== */
  .mermaid text, .mermaid tspan, .mermaid span, .mermaid p {
    font-family: 'SF Pro Display', -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Helvetica Neue', Arial, sans-serif !important;
  }

  /* ===== Force near-white text in flowchart nodes ===== */
  .mermaid .node .label,
  .mermaid .node .label span,
  .mermaid .node .label p,
  .mermaid .nodeLabel,
  .mermaid .edgeLabel span,
  .mermaid .edgeLabel p,
  .mermaid .cluster-label span,
  .mermaid .cluster-label p {
    color: #e0f7fa !important;
  }
  .mermaid .node text, .mermaid .node tspan {
    fill: #e0f7fa !important;
  }

  /* ===== Flowchart node glow ===== */
  .mermaid .node rect, .mermaid .node polygon, .mermaid .node circle {
    filter: drop-shadow(0 0 6px rgba(0, 229, 255, 0.4));
    rx: 8;
    ry: 8;
  }

  /* ===== Flowchart edges ===== */
  .mermaid .edge-pattern-solid, .mermaid .flowchart-link {
    stroke: #00b8d4 !important;
    stroke-width: 2px !important;
  }
  .mermaid marker path {
    fill: #00e5ff !important;
  }

  /* ===== Cluster/subgraph ===== */
  .mermaid .cluster rect {
    stroke: #1a6a8a !important;
    stroke-width: 2px !important;
    stroke-dasharray: 6 3 !important;
    fill: rgba(10, 26, 46, 0.7) !important;
    rx: 12 !important;
    ry: 12 !important;
  }
  .mermaid .cluster-label text, .mermaid .cluster-label tspan {
    fill: #4dd0e1 !important;
  }
  .mermaid .cluster-label span, .mermaid .cluster-label p {
    color: #4dd0e1 !important;
  }

  /* ===== Edge labels ===== */
  .mermaid .edgeLabel rect {
    fill: #0a1628 !important;
    opacity: 0.9 !important;
  }
  .mermaid .edgeLabel text, .mermaid .edgeLabel tspan {
    fill: #80deea !important;
  }

  /* ===== Mindmap: fix black blocks ===== */
  .mermaid .mindmap-node rect,
  .mermaid .mindmap-node polygon,
  .mermaid .mindmap-node circle,
  .mermaid .mindmap-node ellipse,
  .mermaid .mindmap-node path {
    stroke: #00b8d4 !important;
    stroke-width: 2px !important;
    filter: drop-shadow(0 0 6px rgba(0, 229, 255, 0.35)) !important;
  }
  .mermaid .mindmap-node text, .mermaid .mindmap-node tspan {
    fill: #e0f7fa !important;
  }
  /* Mindmap section colors */
  .mermaid .section-0 rect, .mermaid .section-0 path { fill: #0d3b66 !important; stroke: #00e5ff !important; }
  .mermaid .section-1 rect, .mermaid .section-1 path { fill: #1a3a4a !important; stroke: #26c6da !important; }
  .mermaid .section-2 rect, .mermaid .section-2 path { fill: #1b3044 !important; stroke: #4dd0e1 !important; }
  .mermaid .section-3 rect, .mermaid .section-3 path { fill: #14293d !important; stroke: #00bcd4 !important; }
  .mermaid .section-4 rect, .mermaid .section-4 path { fill: #0f2b3d !important; stroke: #80deea !important; }
  .mermaid .section-5 rect, .mermaid .section-5 path { fill: #0d3352 !important; stroke: #4fc3f7 !important; }
  .mermaid .section-6 rect, .mermaid .section-6 path { fill: #102a40 !important; stroke: #29b6f6 !important; }
  .mermaid .section-7 rect, .mermaid .section-7 path { fill: #0e2d44 !important; stroke: #81d4fa !important; }
  /* Root node */
  .mermaid .mindmap-node:first-of-type circle {
    fill: #0d2137 !important;
    stroke: #00e5ff !important;
    stroke-width: 3px !important;
    filter: drop-shadow(0 0 12px rgba(0, 229, 255, 0.6)) !important;
  }
  /* Fallback for black fills */
  .mermaid rect[fill="#000"], .mermaid rect[fill="#000000"], .mermaid rect[fill="black"],
  .mermaid path[fill="#000"], .mermaid path[fill="#000000"], .mermaid path[fill="black"] {
    fill: #0d3b66 !important;
  }
</style>
</head>
<body>
<div id="diagram">
<pre class="mermaid">
${mmdContent}
</pre>
</div>
<script>
  mermaid.initialize({
    startOnLoad: true,
    theme: 'base',
    themeVariables: {
      primaryColor: '#0d2137',
      primaryTextColor: '#e0f7fa',
      primaryBorderColor: '#00e5ff',
      lineColor: '#00b8d4',
      secondaryColor: '#1a3a4a',
      secondaryTextColor: '#e0f7fa',
      secondaryBorderColor: '#00bcd4',
      tertiaryColor: '#112240',
      tertiaryTextColor: '#e0f7fa',
      tertiaryBorderColor: '#26c6da',
      noteBkgColor: '#0d2137',
      noteTextColor: '#e0f7fa',
      noteBorderColor: '#00e5ff',
      edgeLabelBackground: '#0a1628',
      clusterBkg: '#0a1a2e',
      clusterBorder: '#1a5276',
      titleColor: '#00e5ff',
      background: '#080c18',
      mainBkg: '#0d2137',
      nodeBorder: '#00e5ff',
      nodeTextColor: '#e0f7fa',
      fontFamily: 'SF Pro Display, -apple-system, BlinkMacSystemFont, Segoe UI, Helvetica Neue, Arial, sans-serif',
      fontSize: '16px',
      xyChart: {
        backgroundColor: '#080c18',
        titleColor: '#00e5ff',
        xAxisLabelColor: '#e0f7fa',
        yAxisLabelColor: '#e0f7fa',
        xAxisTitleColor: '#00e5ff',
        yAxisTitleColor: '#00e5ff',
        xAxisTickColor: '#00b8d4',
        yAxisTickColor: '#00b8d4',
        xAxisLineColor: '#00b8d4',
        yAxisLineColor: '#00b8d4',
        plotColorPalette: '#00e5ff,#4dd0e1,#26c6da,#00bcd4,#80deea,#b2ebf2'
      }
    },
    mindmap: { useMaxWidth: false },
    flowchart: { useMaxWidth: false, htmlLabels: true }
  });
</script>
</body>
</html>`;

(async () => {
  const browser = await chromium.launch();
  const page = await browser.newPage({
    viewport: { width: width, height: 1600 },
    deviceScaleFactor: 2
  });

  await page.setContent(html, { waitUntil: 'networkidle' });
  await page.waitForSelector('.mermaid svg', { timeout: 15000 });
  await page.waitForTimeout(1000);

  const box = await page.locator('#diagram').boundingBox();
  if (!box) {
    console.error('Failed to locate diagram');
    await browser.close();
    process.exit(1);
  }

  const padding = 40;
  await page.screenshot({
    path: outFile,
    clip: {
      x: Math.max(0, box.x - padding),
      y: Math.max(0, box.y - padding),
      width: box.width + padding * 2,
      height: box.height + padding * 2
    }
  });

  console.log(`OK: ${outFile} (${Math.round(box.width)}x${Math.round(box.height)})`);
  await browser.close();
})().catch((err) => { console.error(err); process.exit(1); });
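The usage line in the header comment of render_mermaid.js (`node render_mermaid.js <input.mmd> <output.png> [width]`) renders one diagram per invocation. A minimal Python sketch of batching it over a directory could look like the following. The function name `build_render_commands`, the script location, and the "PNG next to its .mmd source" layout are assumptions made for illustration and are not part of this commit. The helper only builds argument lists and leaves running them to the caller.

```python
from pathlib import Path


def build_render_commands(mmd_dir, script="scripts/render_mermaid.js", width=2400):
    """Return one argv list per .mmd file, following the script's documented usage.

    Hypothetical helper: each diagram is assumed to be rendered to a PNG with
    the same stem, written next to its .mmd source.
    """
    commands = []
    for mmd in sorted(Path(mmd_dir).glob("*.mmd")):
        png = mmd.with_suffix(".png")
        # Argument order mirrors: node render_mermaid.js <input.mmd> <output.png> [width]
        commands.append(["node", script, str(mmd), str(png), str(width)])
    return commands
```

Each returned list can be handed to `subprocess.run(cmd, check=True)`. The default width of 2400 mirrors the script's own fallback value.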
53
scripts/upload_qiniu_003.py
Normal file
@ -0,0 +1,53 @@
"""Upload all images for article 003 to Qiniu Cloud object storage."""

import os
import glob
from qiniu import Auth, put_file

ACCESS_KEY = os.environ['QINIU_ACCESS_KEY']  # read from the environment; never commit credentials
SECRET_KEY = os.environ['QINIU_SECRET_KEY']  # variable names are this script's own convention
BUCKET_NAME = 'union-saas'
CDN_DOMAIN = 'https://cdn.union.jxyunge.com'
UPLOAD_PREFIX = 'self-media/003/'


def upload_file(local_path, key):
    """Upload a single file to Qiniu and return its CDN URL, or None on failure."""
    q = Auth(ACCESS_KEY, SECRET_KEY)
    token = q.upload_token(BUCKET_NAME, key, 3600)
    ret, info = put_file(token, key, local_path, version='v2')
    if info.status_code == 200:
        url = f'{CDN_DOMAIN}/{key}'
        print(f'  OK  {os.path.basename(local_path)} -> {url}')
        return url
    else:
        print(f'  FAIL  {os.path.basename(local_path)}: {info}')
        return None


def main():
    img_dir = os.path.join(os.path.dirname(__file__), '..', 'articles', '003')

    # Collect all image files (PNG + SVG)
    files = sorted(
        glob.glob(os.path.join(img_dir, '*.png'))
        + glob.glob(os.path.join(img_dir, '*.svg'))
    )

    results = {}
    for f in files:
        name = os.path.basename(f)
        key = UPLOAD_PREFIX + name
        url = upload_file(f, key)
        if url:
            results[name] = url

    print(f'\n===== Upload complete: {len(results)}/{len(files)} =====')
    for name, url in results.items():
        print(f'{name}: {url}')

    return results


if __name__ == '__main__':
    main()
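`main()` returns a mapping from file name to CDN URL. One plausible next step is swapping the article's relative image links (such as `./003/gpt54-capabilities.png`) for those URLs. The sketch below does that with a regular expression. `rewrite_image_links` is a hypothetical helper, and the `./003/` link prefix is assumed from the article's Markdown rather than defined by the upload script.

```python
import re


def rewrite_image_links(markdown, url_map, prefix="./003/"):
    """Replace Markdown image targets under `prefix` with their CDN URLs.

    Hypothetical helper: `url_map` is a {file name: URL} mapping like the one
    main() returns. Images missing from the mapping are left untouched.
    """
    # Groups: 1 = "![alt](", 2 = file name after the prefix, 3 = ")"
    pattern = re.compile(r"(!\[[^\]]*\]\()" + re.escape(prefix) + r"([^)\s]+)(\))")

    def swap(match):
        name = match.group(2)
        if name not in url_map:
            return match.group(0)
        return match.group(1) + url_map[name] + match.group(3)

    return pattern.sub(swap, markdown)
```

Leaving unmapped images alone means a failed upload shows up as a link that still points at the local file instead of a broken CDN URL.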