Compare commits

No commits in common. "main" and "v1.0.0" have entirely different histories.
main ... v1.0.0

22 changed files with 59 additions and 1528 deletions

1
.gitignore vendored

@ -5,4 +5,3 @@ env.json
env.yml
env.yaml
.log.meta.json
/test_res/


@ -1,50 +1,5 @@
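# CHANGELOG - apigo.cc/go/vision
## v1.0.9 (2026-05-17)
- **New feature**: Built-in all-purpose command-line tool `vision` (`cmd/vision`).
- **Enhancement**: `vision.Load` adds multi-level environment detection (sips, heif-convert, magick, ffmpeg) and supports HEIC, including grid-reconstruction decoding.
- **Enhancement**: `GenerateVideoPreview` is upgraded to a dynamic sampling algorithm (covers the full length of the video, limited to 3-8 frames), improving parsing by VLMs (image-to-text).
- **Enhancement**: `GenerateAudioPreview` tunes its compression strategy (12 kbps Opus) to give STT a very lightweight audio summary.
- **Environment alignment**: Adds automatic detection of HEIC conversion tools, with a warning and guidance when none is found.
## v1.0.8 (2026-05-15)
- **Infrastructure sync**: Updated core dependency versions.
## v1.0.7 (2026-05-14)
- **Dependency alignment**: Synchronized internal component versions.
## v1.0.6 (2026-05-13)
- **Refinement**: Improved palette extraction accuracy.
## v1.0.5 (2026-05-13)
- **Advanced watermark system**:
- Added rotation angle (`angle`) support to `Watermark` and `TextWatermark`.
- Added `TileWatermark` and `TileTextWatermark` for full-image tiled watermarks with custom spacing and angle.
- **GIF watermark support**: Added the full set of watermark methods to the `Animation` struct, so a watermark can be applied to every frame of an animation in one call.
- **Status confirmation**: Confirmed and completed generation and recognition of QR codes (`QR Code`) and barcodes (`Barcode`).
## v1.0.4 (2026-05-13)
- **Watermark system**: Added `Watermark` (image) and `TextWatermark` (text), with nine-grid position definitions and opacity.
- **Video watermark**: Extended the `Video` struct to add a watermark to a video in one step via FFmpeg.
- **Slider captcha**: Added `GenerateJigsaw`, which generates the puzzle path, the background with the slot, and the puzzle piece.
- **Animated captcha**: Added `GenerateGIFCaptcha`, which generates OCR-resistant animated GIF captchas.
- **Completion**: Added the `Canvas.Clone` method.
## v1.0.3 (2026-05-13)
- **Performance**: Optimized `Load` by removing a redundant string conversion and decoding images directly from a `bytes.Reader`.
- **Benchmarks**: Added the `BenchmarkWarpPerspective`, `BenchmarkPHash`, and `BenchmarkExtractPalette` benchmarks.
- **Defensive programming**: Added a non-empty path check to `Load` for robustness.
- **Documentation**: Synchronized the benchmark figures in `TEST.md`.
## v1.0.2 (2026-05-12)
- **Documentation**: Restructured `README.md` and added in-depth examples for perspective transforms, animation composition, captcha generation, and more.
- **Release testing guide**: Added `TEST.md` to define test coverage and the verification process.
- **API completion**: Added the `Invert` filter to `Canvas`.
## v1.0.1 (2026-05-12)
- **Infrastructure alignment**: Removed all direct dependencies on the native `os` and `strconv` packages in favor of the `@go` core infrastructure.
- **Memory optimization**: Uses `go/file` to support in-memory image processing and serialization.
## v1.0.0 (2026-05-12)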
* **Initial Release**: Complete migration and evolution from `@gojs/img`.

21
LICENSE

@ -1,21 +0,0 @@
MIT License
Copyright (c) 2026 ssgo
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

112
README.md

@ -4,7 +4,7 @@
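## 🎯 Design Philosophy
`go/vision` aims to remove the friction from media processing in Go. Through pure-Go core algorithms and standardized orchestration of external tools, it provides a semantically consistent, zero-friction, high-performance, one-stop API.
`go/vision` aims to remove the friction from media processing in Go. Through pure-Go core algorithms and standardized orchestration of external tools, it provides a semantically consistent, zero-friction, high-performance API.
* **Zero friction**: Automatically detects the environment and guides setup (e.g. FFmpeg); one-step recognition and conversion.
* **Industrial grade**: Error-driven architecture (no internal logging) with comprehensive unit test coverage.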
@ -18,14 +18,13 @@
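### 2. Image Processing and Transformation
* **Geometric transforms**: Scaling (Resize/Fit/Fill), rotation, mirroring, **4-point perspective transform (WarpPerspective)**.
* **Advanced filters**: Blur, sharpen, grayscale, brightness/contrast, color inversion, convolution filtering
* **Advanced filters**: Blur, sharpen, grayscale, brightness/contrast, sepia, pixelation
* **Color analysis**: Palette extraction (`ExtractPalette`), average color calculation.
### 3. Intelligent Vision (Intelligence)
* **Code recognition**: Integrated generation and automatic decoding of QR codes and barcodes (Code128, UPC/EAN).
* **Perceptual hash (PHash)**: Fingerprints computed from image features, used to find similar images in large collections.
* **Captcha engine**: Generates OCR-resistant image captchas.
* **Template matching**: `FindTemplate` locates a sub-image within a larger image.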
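The perceptual-hash bullet above relies on comparing two fingerprints by how many bits differ. A minimal sketch of that comparison follows, assuming 64-bit hashes (the CLI in this diff prints the hash with `%016X`); `hammingDistance` and `similar` are hypothetical names for illustration and are not the library's `Distance` API, whose signature is not shown here.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance counts the bit positions in which two 64-bit hashes differ.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

// similar reports whether two hashes differ in at most maxBits positions.
// The threshold is a tuning parameter chosen by the caller (an assumption,
// not a value taken from the library).
func similar(a, b uint64, maxBits int) bool {
	return hammingDistance(a, b) <= maxBits
}

func main() {
	a := uint64(0xF0F0F0F0F0F0F0F0)
	b := uint64(0xF0F0F0F0F0F0F0F1) // differs from a in the lowest bit only
	fmt.Println(hammingDistance(a, b), similar(a, b, 5))
}
```

A small distance suggests visually similar images; the right cutoff depends on the hash algorithm and the data set.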
### 4. Dynamic Media (Animation & Video)
* **GIF engine**: High-quality GIF sequence generation with the built-in `Plan9` palette and `Floyd-Steinberg` dithering.
@ -39,114 +38,37 @@ go get apigo.cc/go/vision
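## 💡 Quick Start
### 1. Code Scanning
### Code Scanning
```go
// Automatically try QR code and barcode recognition
c, _ := vision.Load("code.jpg")
content, err := c.DecodeAll()
// Generate a QR code and save it
qr, _ := vision.GenerateQRCode("https://apigo.cc", 256)
vision.Save(qr, "qr.png")
content, err := c.DecodeAll() // automatically tries QR code and barcode
```
### 2. Perspective Transform (WarpPerspective)
Commonly used to deskew scanned documents.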
```go
c, _ := vision.Load("skewed_doc.jpg")
// Specify the four corner points in the source image (TL, TR, BR, BL)
srcPoints := [4]image.Point{
{150, 20}, {450, 50}, {480, 380}, {100, 350},
}
c.WarpPerspective(srcPoints, 300, 400)
vision.Save(c, "flat_doc.png")
```
### 3. Generate a GIF Animation
```go
anim := vision.NewAnimation()
for i := 0; i < 10; i++ {
	c := vision.New(100, 100, "#FFFFFF")
	c.Circle(50, 50, float64(i*5), &vision.DrawStyle{FillColor: "#FF0000"})
	anim.AddFrame(c, 10) // 100 ms delay
}
anim.SaveGIF("motion.gif", 0) // 0 means loop forever
```
### 4. Video Frame Extraction
### Video Frame Processing
```go
v, _ := vision.NewVideo()
frame, _ := v.ExtractFrame("movie.mp4", 5.0) // extract the frame at the 5-second mark
frame.Blur(2.0)
vision.Save(frame, "preview.jpg")
frame, _ := v.ExtractFrame("video.mp4", 5.0)
frame.Grayscale()
vision.Save(frame, "snapshot.png")
```
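### 5. Multimedia Previews, Transcription, and Transcoding
One-stop optimized previews for the web, list thumbnails, or speech transcription. Supports automatic scaling and cropping to fit the specified size.
### Extract Dominant Colors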
```go
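// Generate an image thumbnail (WebP, automatic fill/crop)
vision.GenerateImagePreview("photo.jpg", "thumb.webp", 200, 200)
// Generate an animated preview (WebP/GIF, samples one frame every 30 s by default, automatic fill/crop)
vision.GenerateVideoPreview("movie.mp4", "preview.webp", 320, 180)
// Generate a single preview image (JPG/PNG, taken from the middle frame of the video)
vision.GenerateVideoPreview("movie.mp4", "preview.jpg", 320, 180)
// Extract multiple preview frames into a folder (outputs 1.webp, 2.webp...)
vision.GenerateVideoPreview("movie.mp4", "frames_dir", 320, 180)
// Extract an audio preview clip (16 kHz Ogg Opus, at most 3 minutes)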
vision.GenerateAudioPreview("input.mp4", "preview.ogg")
```
## Command-Line Tool (vision)
The `vision` package ships with an all-purpose command-line tool, located in the `cmd/vision` directory.
### Installation
Install with `go install`:
```bash
go install apigo.cc/go/vision/cmd/vision@latest
```
### Common Commands
```bash
# 1. Show image information and palette
vision photo.jpg
# 2. Decode a QR code/barcode
vision code.png --decode
# 3. Generate a QR code
vision --data "https://apigo.cc" -o qr.png --size 512
# 4. Batch-process an image (resize, blur, grayscale)
vision in.png -o out.png --resize 800x600 --blur 1.5 --grayscale
# 5. Generate an animated video preview (WebP)
vision video.mp4 --type video -o preview.webp --width 320 --height 180 --step 30
# 6. Generate a captcha
vision --captcha -o captcha.png --len 6
palette := canvas.ExtractPalette(5)
for _, c := range palette {
fmt.Println("发现主色:", c.Hex)
}
```
## 🛠 API Overview
| Module | Main APIs |
| :--- | :--- |
| **Canvas** | `New`, `Load`, `Save`, `Clear`, `Sub`, `Clone`, `Put`, `LoadFonts` |
| **Canvas** | `New`, `Load`, `Save`, `Clear`, `Sub`, `Clone`, `Put` |
| **Draw** | `Rect`, `RoundedRect`, `Circle`, `Line`, `Path`, `RandBG` |
| **Effect** | `Resize`, `Rotate`, `Blur`, `Sharpen`, `AdjustBrightness`, `Grayscale`, `Invert` |
| **Transform** | `WarpPerspective`, `FlipH`, `FlipV` |
| **Recognition** | `DecodeQRCode`, `DecodeBarcode`, `DecodeAll`, `PHash`, `Distance`, `FindTemplate` |
| **Media** | `NewAnimation`, `NewVideo`, `ProcessVideoFrames`, `DiffFrames` |
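## ⚙️ Environment Dependencies
* **FFmpeg**: The video processing module depends on the `ffmpeg` binary.
* `vision.NewVideo()` tries to detect it automatically on the system path.
* If it is not installed, it prints a download location or attempts automatic setup (depending on permissions).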
| **Effect** | `Resize`, `Rotate`, `Blur`, `Sharpen`, `AdjustBrightness`, `Grayscale` |
| **Recognition** | `DecodeQRCode`, `DecodeBarcode`, `DecodeAll`, `PHash`, `Distance` |
| **Media** | `NewAnimation`, `NewVideo`, `ConvertAll`, `Optimize` |
---
This project is developed and maintained with AI and follows the highest standards of code quality and performance.

45
TEST.md

@ -1,45 +0,0 @@
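# Testing @go/vision
`go/vision` has thorough unit test coverage to ensure stability across image processing scenarios.
## Running the Tests
Run the standard Go test command in the `vision` directory:
```bash
go test -v .
```
## Test Coverage
* **Canvas & Drawing**: Verifies basic drawing, color parsing, layer compositing, and related features.
* **Intelligence**:
  * `QRCode`: Verifies that QR code generation and recognition are consistent.
  * `Barcode`: Verifies generation and recognition of barcodes (Code128, UPC).
  * `PHash`: Verifies fingerprint distance calculation for similar images.
* **Captcha**: Verifies image captcha generation.
* **Transform**: Verifies scaling, rotation, and the more complex `WarpPerspective` perspective transform.
* **Animation**: Verifies GIF sequence composition.
## Visual Regression Testing
Some tests generate temporary image files (such as `test.png` and `captcha.png`), which the test scripts clean up automatically. When developing new filters or drawing features, inspect the generated images manually to confirm that they look as expected.
## Performance Benchmarks
Run the benchmarks with:
```bash
go test -bench .
```
Benchmark results on an Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz:
| Benchmark | Time (ns/op) |
| :--- | :--- |
| **WarpPerspective** | 7,079,540 |
| **PHash** | 958,618 |
| **ExtractPalette** | 402,176 |
---
All tests follow the `@go` infrastructure standards and have no external system dependencies (except the FFmpeg video tests, which are skipped automatically or print setup guidance).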


@ -5,8 +5,8 @@ import (
"image/color/palette"
"image/draw"
"image/gif"
"os"
"apigo.cc/go/file"
"github.com/fogleman/gg"
)
@ -51,7 +51,7 @@ func (a *Animation) SaveGIF(path string, loopCount int) error {
out.Delay = append(out.Delay, a.Delays[i])
}
f, err := file.Create(path)
f, err := os.Create(path)
if err != nil {
return err
}
@ -62,7 +62,7 @@ func (a *Animation) SaveGIF(path string, loopCount int) error {
// LoadGIF loads a GIF animation from a file
func LoadGIF(path string) (*Animation, error) {
f, err := file.Open(path)
f, err := os.Open(path)
if err != nil {
return nil, err
}
@ -83,31 +83,3 @@ func LoadGIF(path string) (*Animation, error) {
}
return anim, nil
}
// Watermark adds an image watermark to every frame of the animation
func (a *Animation) Watermark(mark *Canvas, pos Position, opacity float64, padding int, angle ...float64) {
for _, f := range a.Frames {
f.Watermark(mark, pos, opacity, padding, angle...)
}
}
// TileWatermark tiles an image watermark over every frame of the animation
func (a *Animation) TileWatermark(mark *Canvas, opacity float64, spacing int, angle float64) {
for _, f := range a.Frames {
f.TileWatermark(mark, opacity, spacing, angle)
}
}
// TextWatermark adds a text watermark to every frame of the animation
func (a *Animation) TextWatermark(text string, pos Position, style *DrawStyle, opacity float64, padding int, angle ...float64) {
for _, f := range a.Frames {
f.TextWatermark(text, pos, style, opacity, padding, angle...)
}
}
// TileTextWatermark tiles a text watermark over every frame of the animation
func (a *Animation) TileTextWatermark(text string, style *DrawStyle, opacity float64, spacing int, angle float64) {
for _, f := range a.Frames {
f.TileTextWatermark(text, style, opacity, spacing, angle)
}
}


@ -74,17 +74,3 @@ func (c *Canvas) DecodeAll() (string, error) {
// Then try barcodes
return c.DecodeBarcode()
}
// Recognize is an alias for DecodeAll; it automatically recognizes any code in the image (QR, Barcode)
func (c *Canvas) Recognize() (string, error) {
return c.DecodeAll()
}
// Recognize loads the image at the given path and automatically recognizes any code in it (QR, Barcode)
func Recognize(path string) (string, error) {
c, err := Load(path)
if err != nil {
return "", err
}
return c.DecodeAll()
}


@ -1,7 +1,6 @@
package vision
import (
"bytes"
"fmt"
"image"
"image/color"
@ -9,10 +8,9 @@ import (
"image/jpeg"
_ "image/png"
"os"
"os/exec"
"path/filepath"
"strings"
"apigo.cc/go/cast"
"apigo.cc/go/file"
"github.com/fogleman/gg"
"golang.org/x/image/font"
@ -43,25 +41,17 @@ func New(width, height int, backgroundColor ...string) *Canvas {
// Load loads an image from a file and creates a canvas
func Load(path string) (*Canvas, error) {
if path == "" {
return nil, fmt.Errorf("path is empty")
}
if !file.Exists(path) {
return nil, fmt.Errorf("file not found: %s", path)
}
data, err := file.ReadBytes(path)
data, err := file.Read(path)
if err != nil {
return nil, err
}
img, _, err := image.Decode(bytes.NewReader(data))
img, _, err := image.Decode(strings.NewReader(cast.String(data)))
if err != nil {
// Try FFmpeg as a fallback (for formats such as HEIC)
ext := strings.ToLower(filepath.Ext(path))
if ext == ".heic" || ext == ".heif" || ext == ".webp" || ext == ".avif" {
return loadWithFFmpeg(path)
}
return nil, fmt.Errorf("decode image failed: %v", err)
}
@ -70,49 +60,6 @@ func Load(path string) (*Canvas, error) {
}, nil
}
func loadWithFFmpeg(path string) (*Canvas, error) {
tmpFile := filepath.Join(os.TempDir(), fmt.Sprintf("vision_load_%d.png", os.Getpid()))
defer os.Remove(tmpFile)
// For HEIC/HEIF, prefer a dedicated conversion tool
ext := strings.ToLower(filepath.Ext(path))
if ext == ".heic" || ext == ".heif" {
if err := ConvertHEIC(path, tmpFile); err == nil {
return loadPNG(tmpFile)
}
}
// Otherwise, or if that fails, fall back to FFmpeg
ffmpeg, err := EnsureFFmpeg()
if err != nil {
return nil, fmt.Errorf("ffmpeg not found for fallback: %w", err)
}
// Convert the input file to PNG (FFmpeg's support for HEIC grid reconstruction is weak)
cmd := exec.Command(ffmpeg, "-i", path, "-frames:v", "1", "-y", tmpFile)
if err := cmd.Run(); err != nil {
return nil, fmt.Errorf("ffmpeg decode fallback failed: %w", err)
}
return loadPNG(tmpFile)
}
func loadPNG(path string) (*Canvas, error) {
data, err := file.ReadBytes(path)
if err != nil {
return nil, err
}
img, _, err := image.Decode(bytes.NewReader(data))
if err != nil {
return nil, err
}
return &Canvas{
dc: gg.NewContextForImage(img),
}, nil
}
// Save saves the canvas to a file
func Save(c *Canvas, path string, quality ...int) error {
var err error
@ -123,7 +70,7 @@ func Save(c *Canvas, path string, quality ...int) error {
q = quality[0]
}
// gg has no built-in SaveJPG on the context, so we encode manually
f, createErr := file.Create(path)
f, createErr := os.Create(path)
if createErr != nil {
return createErr
}
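The `Load` hunk above switches between decoding from `strings.NewReader(cast.String(data))` and decoding from `bytes.NewReader(data)`. The sketch below shows the byte-based variant using only the standard library; `decodeImage` and `samplePNG` are hypothetical helpers for illustration and are not part of the package.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png" // also registers the PNG decoder with image.Decode
)

// decodeImage decodes an in-memory image directly from its bytes,
// without first converting them to a string.
func decodeImage(data []byte) (image.Image, error) {
	img, _, err := image.Decode(bytes.NewReader(data))
	if err != nil {
		return nil, fmt.Errorf("decode image failed: %w", err)
	}
	return img, nil
}

// samplePNG builds a small solid-color PNG in memory so the example needs no files.
func samplePNG(w, h int) ([]byte, error) {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.Set(x, y, color.RGBA{R: 255, A: 255})
		}
	}
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := samplePNG(4, 3)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	img, err := decodeImage(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(img.Bounds())
}
```

Reading from a `bytes.Reader` avoids the extra copy that a byte-to-string conversion would make.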


@ -45,41 +45,6 @@ func GenerateCaptcha(opt *CaptchaOption) *Canvas {
return c
}
// GenerateGIFCaptcha generates an animated GIF captcha
func GenerateGIFCaptcha(opt *CaptchaOption) *Animation {
if opt == nil {
opt = &CaptchaOption{Length: 4, Width: 120, Height: 40}
}
const chars = "0123456789"
code := ""
for i := 0; i < opt.Length; i++ {
code += string(chars[rand.Int(0, len(chars)-1)])
}
anim := NewAnimation()
// Each frame shows different noise and character offsets
for i := 0; i < 10; i++ {
c := New(opt.Width, opt.Height, "#FFFFFF")
c.RandBG(3)
// Draw the characters; their positions jitter slightly in every frame
for idx, char := range code {
x := float64(idx)*float64(opt.Width/opt.Length) + 5 + rand.Float(0.0, 4.0) - 2.0
y := float64(opt.Height/2) + 10 + rand.Float(0.0, 6.0) - 3.0
c.dc.Push()
c.dc.RotateAbout(rand.Float(0.0, 0.4)-0.2, x, y)
c.dc.SetColor(ParseColor(RandColor()))
c.dc.DrawString(string(char), x, y)
c.dc.Pop()
}
anim.AddFrame(c, 15) // 150 ms per frame
}
return anim
}
// RandText draws randomly distorted text (used for captchas)
func (c *Canvas) RandText(text string) [][4]float64 {
w, h := float64(c.Width()), float64(c.Height())
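`GenerateGIFCaptcha` above returns an `Animation` whose frames are added with a delay of 15 (hundredths of a second). As a rough illustration of how such frames could become a GIF with the standard library alone, the sketch below quantizes each frame to the `Plan9` palette with Floyd-Steinberg dithering and encodes them in memory. `buildGIF`, `demoFrames`, and `frameCount` are hypothetical names; this is not the package's `SaveGIF` implementation.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/color/palette"
	"image/draw"
	"image/gif"
)

// buildGIF quantizes each frame to the Plan9 palette with Floyd-Steinberg
// dithering and encodes the result as an animated GIF.
// delay is in hundredths of a second, so 15 means 150 ms per frame.
func buildGIF(frames []image.Image, delay int) ([]byte, error) {
	out := &gif.GIF{LoopCount: 0} // 0 loops forever
	for _, f := range frames {
		p := image.NewPaletted(f.Bounds(), palette.Plan9)
		draw.FloydSteinberg.Draw(p, f.Bounds(), f, f.Bounds().Min)
		out.Image = append(out.Image, p)
		out.Delay = append(out.Delay, delay)
	}
	var buf bytes.Buffer
	if err := gif.EncodeAll(&buf, out); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// demoFrames returns n solid-color frames of size w x h, each a different color.
func demoFrames(n, w, h int) []image.Image {
	frames := make([]image.Image, 0, n)
	for i := 0; i < n; i++ {
		img := image.NewRGBA(image.Rect(0, 0, w, h))
		c := color.RGBA{R: uint8(80 * i), G: 120, B: 200, A: 255}
		draw.Draw(img, img.Bounds(), image.NewUniform(c), image.Point{}, draw.Src)
		frames = append(frames, img)
	}
	return frames
}

// frameCount decodes GIF data and returns the number of frames it contains.
func frameCount(data []byte) (int, error) {
	g, err := gif.DecodeAll(bytes.NewReader(data))
	if err != nil {
		return 0, err
	}
	return len(g.Image), nil
}

func main() {
	data, err := buildGIF(demoFrames(3, 16, 8), 15)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	n, err := frameCount(data)
	fmt.Println(n, err)
}
```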


@ -1,272 +0,0 @@
package main
import (
"flag"
"fmt"
"os"
"strconv"
"strings"
"apigo.cc/go/vision"
)
var (
// Global flags
outFile = flag.String("o", "", "输出文件路径 (如: out.png, out.webp)")
inspect = flag.Bool("inspect", false, "查看图像详细元数据 (默认行为)")
version = flag.Bool("v", false, "显示版本信息")
// QR code/barcode generation
dataStr = flag.String("data", "", "生成二维码/条码的内容")
size = flag.Int("size", 256, "生成的二维码尺寸 (正方形)")
width = flag.Int("width", 0, "宽度 (针对预览、条码、验证码、缩放)")
height = flag.Int("height", 0, "高度 (针对预览、条码、验证码、缩放)")
// Image processing
resizeStr = flag.String("resize", "", "缩放尺寸 (格式: 800x600)")
blur = flag.Float64("blur", 0, "模糊程度 (sigma)")
grayscale = flag.Bool("grayscale", false, "转为灰度图")
rotate = flag.Float64("rotate", 0, "顺时针旋转角度")
brightness = flag.Float64("brightness", 0, "亮度调整 (-100 到 100)")
contrast = flag.Float64("contrast", 0, "对比度调整 (-100 到 100)")
// Preview generation
previewType = flag.String("type", "", "预览类型: image, video, audio (自动识别后缀)")
_ = flag.String("p", "", "预览类型 (别名, 同 -type)")
// Captcha
captchaLen = flag.Int("len", 4, "验证码长度")
// Video
vtime = flag.Float64("time", 0, "提取视频帧的时间点 (秒)")
vstep = flag.Int("step", 30, "视频预览采样间隔 (秒,默认 30)")
)
const visionVersion = "1.0.0"
func main() {
flag.Usage = func() {
fmt.Fprintf(os.Stderr, "👁️ Vision CLI (vision) - 全能图像与媒体处理工具 v%s\n\n", visionVersion)
fmt.Fprintf(os.Stderr, "用法:\n")
fmt.Fprintf(os.Stderr, " vision [flags] [file] # 处理已有文件\n")
fmt.Fprintf(os.Stderr, " vision --qrcode --data \"...\" # 生成二维码\n")
fmt.Fprintf(os.Stderr, " vision --captcha -o c.png # 生成验证码\n\n")
fmt.Fprintf(os.Stderr, "常见示例:\n")
fmt.Fprintf(os.Stderr, " vision photo.jpg # 查看图片信息及主色调\n")
fmt.Fprintf(os.Stderr, " vision code.png --decode # 识别二维码/条码\n")
fmt.Fprintf(os.Stderr, " vision in.png -o out.png --blur 2 --grayscale # 批量图像处理\n")
fmt.Fprintf(os.Stderr, " vision video.mp4 -p video -o p.webp # 生成视频动态预览\n")
fmt.Fprintf(os.Stderr, " vision video.mp4 --time 10.5 -o frame.jpg # 提取视频指定时间帧\n\n")
fmt.Fprintf(os.Stderr, "参数详解:\n")
flag.PrintDefaults()
}
decode := flag.Bool("decode", false, "识别图像中的二维码或条码")
flag.Parse()
if *version {
fmt.Printf("vision version %s\n", visionVersion)
return
}
// Handle the -p alias
if *previewType == "" {
// Walk the flags that were set before flag.Args and look for -p
flag.Visit(func(f *flag.Flag) {
if f.Name == "p" {
*previewType = f.Value.String()
}
})
}
args := flag.Args()
// 1. Generation logic when no input file is given
if len(args) == 0 {
if *dataStr != "" {
runGenerate()
return
}
if flag.NFlag() > 0 && (*outFile != "" || *width > 0) {
// If an output is specified but no input file, try generating a captcha
runCaptcha()
return
}
flag.Usage()
return
}
// 2. Processing logic when an input file is given
srcFile := args[0]
// Detect preview generation (if this is a preview command)
if *previewType != "" {
runPreview(srcFile)
return
}
// Detect video frame extraction
if strings.HasSuffix(strings.ToLower(srcFile), ".mp4") || strings.HasSuffix(strings.ToLower(srcFile), ".mov") {
if *vtime > 0 || (*outFile != "" && *previewType == "") {
runVideoExtract(srcFile)
return
}
}
// Image processing logic
runImageProcess(srcFile, *decode)
}
func runGenerate() {
if *width > 0 && *height > 0 {
// Generate a barcode
c, err := vision.GenerateBarcode(*dataStr, *width, *height)
if err != nil {
fail("生成条码失败: %v", err)
}
save(c)
} else {
// Generate a QR code
c, err := vision.GenerateQRCode(*dataStr, *size)
if err != nil {
fail("生成二维码失败: %v", err)
}
save(c)
}
}
func runCaptcha() {
opt := &vision.CaptchaOption{
Length: *captchaLen,
Width: *width,
Height: *height,
}
c := vision.GenerateCaptcha(opt)
fmt.Printf("🛡️ 验证码内容: %s\n", opt.Text)
save(c)
}
func runPreview(src string) {
if *outFile == "" {
fail("预览生成必须指定输出路径 (-o)")
}
w, h := *width, *height
if w == 0 { w = 320 }
if h == 0 { h = 180 }
var err error
switch strings.ToLower(*previewType) {
case "image":
err = vision.GenerateImagePreview(src, *outFile, w, h)
case "video":
err = vision.GenerateVideoPreview(src, *outFile, w, h, *vstep)
case "audio":
err = vision.GenerateAudioPreview(src, *outFile)
default:
fail("未知的预览类型: %s (可选: image, video, audio)", *previewType)
}
if err != nil {
fail("生成预览失败: %v", err)
}
fmt.Printf("✅ 预览已生成: %s\n", *outFile)
}
func runVideoExtract(src string) {
v, err := vision.NewVideo()
if err != nil {
fail("初始化视频工具失败: %v", err)
}
frame, err := v.ExtractFrame(src, *vtime)
if err != nil {
fail("提取视频帧失败: %v", err)
}
save(frame)
}
func runImageProcess(src string, doDecode bool) {
c, err := vision.Load(src)
if err != nil {
fail("无法加载图像 '%s': %v", src, err)
}
if doDecode {
res, err := c.DecodeAll()
if err != nil {
fail("解码失败: %v", err)
}
fmt.Printf("📝 解码结果: %s\n", res)
return
}
// Batch processing
modified := false
if *resizeStr != "" {
parts := strings.Split(strings.ToLower(*resizeStr), "x")
if len(parts) == 2 {
w, _ := strconv.Atoi(parts[0])
h, _ := strconv.Atoi(parts[1])
if w > 0 && h > 0 {
c.Resize(w, h)
modified = true
}
}
}
if *blur > 0 {
c.Blur(*blur)
modified = true
}
if *grayscale {
c.Grayscale()
modified = true
}
if *rotate != 0 {
c.Rotate(*rotate)
modified = true
}
if *brightness != 0 {
c.AdjustBrightness(*brightness)
modified = true
}
if *contrast != 0 {
c.AdjustContrast(*contrast)
modified = true
}
if *outFile != "" {
save(c)
} else if modified {
fail("已应用处理,但未指定输出路径 (-o)")
} else {
// Default inspect mode
fmt.Printf("🔍 图像详情: %s\n", src)
fmt.Printf(" 尺寸: %dx%d\n", c.Width(), c.Height())
hash := vision.PHash(c.Image())
fmt.Printf(" 指纹 (PHash): %016X\n", hash)
palette := c.ExtractPalette(5)
fmt.Printf(" 主要颜色 (调色板):\n")
for _, col := range palette {
fmt.Printf(" - %s (%d)\n", col.Hex, col.Count)
}
}
}
func save(c *vision.Canvas) {
path := *outFile
if path == "" {
path = "out.png"
}
if err := vision.Save(c, path); err != nil {
fail("保存失败: %v", err)
}
fmt.Printf("✨ 成功保存至: %s\n", path)
}
func fail(format string, a ...any) {
fmt.Fprintf(os.Stderr, "❌ 错误: "+format+"\n", a...)
os.Exit(1)
}
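The `--resize` handling in `runImageProcess` above splits the value on `x` and ignores the `strconv.Atoi` errors, so malformed input is silently skipped. A stricter parser could look like the sketch below; `parseSize` is a hypothetical helper for illustration, not a function in the CLI.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize parses a "WIDTHxHEIGHT" string such as "800x600".
// It returns an error for malformed or non-positive values instead of
// silently ignoring them.
func parseSize(s string) (int, int, error) {
	ws, hs, ok := strings.Cut(strings.ToLower(strings.TrimSpace(s)), "x")
	if !ok {
		return 0, 0, fmt.Errorf("invalid size %q: want WIDTHxHEIGHT", s)
	}
	w, err := strconv.Atoi(ws)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid width in %q: %w", s, err)
	}
	h, err := strconv.Atoi(hs)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid height in %q: %w", s, err)
	}
	if w <= 0 || h <= 0 {
		return 0, 0, fmt.Errorf("size %q must be positive", s)
	}
	return w, h, nil
}

func main() {
	for _, s := range []string{"800x600", "800", "0x600"} {
		w, h, err := parseSize(s)
		fmt.Println(w, h, err)
	}
}
```

Reporting the error lets the CLI fail early rather than write an unmodified image under the requested output name.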


@ -2,10 +2,9 @@ package vision
import (
"fmt"
"os"
"path/filepath"
"strings"
"apigo.cc/go/file"
)
// Format defines the supported image formats
@ -30,23 +29,23 @@ func Convert(srcPath, dstPath string, quality ...int) error {
// ConvertAll converts all matching images in a directory to the target format
func ConvertAll(srcDir, dstDir string, toExt string, quality ...int) (int, []error) {
files, err := file.ReadDir(srcDir)
files, err := os.ReadDir(srcDir)
if err != nil {
return 0, []error{err}
}
if err := file.Mkdir(dstDir); err != nil {
if err := os.MkdirAll(dstDir, 0755); err != nil {
return 0, []error{err}
}
count := 0
var errors []error
for _, f := range files {
if f.IsDir {
if f.IsDir() {
continue
}
name := f.Name
name := f.Name()
ext := strings.ToLower(filepath.Ext(name))
if ext == ".png" || ext == ".jpg" || ext == ".jpeg" {
srcPath := filepath.Join(srcDir, name)
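The `ConvertAll` hunk above lists a directory, skips subdirectories, and keeps only `.png`, `.jpg`, and `.jpeg` files. The same selection step, written against the standard `os.ReadDir` variant shown in the diff, can be sketched as follows; `isConvertible` and `listConvertible` are hypothetical names for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// isConvertible reports whether name has one of the source extensions
// accepted by the loop above (case-insensitive).
func isConvertible(name string) bool {
	switch strings.ToLower(filepath.Ext(name)) {
	case ".png", ".jpg", ".jpeg":
		return true
	}
	return false
}

// listConvertible returns the paths of convertible files directly inside dir.
// Subdirectories are skipped, not traversed.
func listConvertible(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, e := range entries {
		if e.IsDir() || !isConvertible(e.Name()) {
			continue
		}
		paths = append(paths, filepath.Join(dir, e.Name()))
	}
	return paths, nil
}

func main() {
	paths, err := listConvertible(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```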

16
go.mod

@ -3,10 +3,10 @@ module apigo.cc/go/vision
go 1.25.0
require (
apigo.cc/go/cast v1.5.0
apigo.cc/go/file v1.5.0
apigo.cc/go/jsmod v1.5.0
apigo.cc/go/rand v1.5.0
apigo.cc/go/cast v1.3.0
apigo.cc/go/file v1.3.0
apigo.cc/go/log v1.3.0
apigo.cc/go/rand v1.3.0
github.com/boombuler/barcode v1.1.0
github.com/disintegration/imaging v1.6.2
github.com/flopp/go-findfont v0.1.0
@ -17,10 +17,12 @@ require (
)
require (
apigo.cc/go/encoding v1.5.0 // indirect
apigo.cc/go/safe v1.5.0 // indirect
apigo.cc/go/config v1.3.0 // indirect
apigo.cc/go/encoding v1.3.0 // indirect
apigo.cc/go/id v1.3.0 // indirect
apigo.cc/go/safe v1.3.0 // indirect
apigo.cc/go/shell v1.3.0 // indirect
github.com/golang/freetype v0.0.0-20170609003504-e2365dfdc4a0 // indirect
github.com/kr/text v0.2.0 // indirect
golang.org/x/crypto v0.51.0 // indirect
golang.org/x/sys v0.44.0 // indirect
golang.org/x/text v0.37.0 // indirect

31
go.sum

@ -1,18 +1,23 @@
apigo.cc/go/cast v1.5.0 h1:UBGJtFQ8eJPMQXs37cUgqd7YQo1zI9opuSDBDmn2/pE=
apigo.cc/go/cast v1.5.0/go.mod h1:z2GW5p5WCZGEqVVIJUdhl232vRbLf2Qu4EDlEakX/D8=
apigo.cc/go/encoding v1.5.0 h1:EJNdRVDOMoI2DAvZwQNQTbYuqB/6zsEzvg7lS5pQI+I=
apigo.cc/go/encoding v1.5.0/go.mod h1:8++NfZj3hWig0qh2g7GQRw/4LpSvCYMWUZ+8J+x58cA=
apigo.cc/go/file v1.5.0 h1:Fh1NSDBqaxjuXYJ71yPHPXVJ8BFEv/AGS3l+jkLi5uw=
apigo.cc/go/file v1.5.0/go.mod h1:4YhOGgBINTpmmmgws3H8LAyXQQBGzBp44hYUoCS+kr0=
apigo.cc/go/jsmod v1.5.0 h1:JgQtJNiJWy1NOP9AzE8NX5VXJkpO/x3GqLsCCSny5Ec=
apigo.cc/go/jsmod v1.5.0/go.mod h1:bmyeZtOAP/j5am+YRnaiM89smysK24K7ebk0koFtsSw=
apigo.cc/go/rand v1.5.0 h1:1o8hh8fhdBuk1/h02IvugvamuT3dkWbVJrqEJVQKB2E=
apigo.cc/go/rand v1.5.0/go.mod h1:Lh98S2dm9UY0X+M+kNQQEKyXHG5pcCKSFPyXN0QCGdk=
apigo.cc/go/safe v1.5.0 h1:W1NblmcU8cex1f9Y5z8mNLUJOzZTE1s6fszb3FbhGnk=
apigo.cc/go/safe v1.5.0/go.mod h1:OfQ5d6COePSGEuPvMeOk6KagX2sezw7nvKh7exj9SeM=
apigo.cc/go/cast v1.3.0 h1:ZTcLYijkqZjSWSCSpJUWMfzJYeJKbwKxquKkPrFsROQ=
apigo.cc/go/cast v1.3.0/go.mod h1:lGlwImiOvHxG7buyMWhFzcdvQzmSaoKbmr7bcDfUpHk=
apigo.cc/go/config v1.3.0 h1:TwI3bv3D+BJrAnFx+o62HQo3FarY2Ge3SCGsKchFYGg=
apigo.cc/go/config v1.3.0/go.mod h1:88lqKEBXlIExFKt1geLONVLYyM+QhRVpBe0ok3OEvjI=
apigo.cc/go/encoding v1.3.0 h1:8jqNHoZBR8vOU/BGsLFebfp1Txa1UxDRpd7YwzIFLJs=
apigo.cc/go/encoding v1.3.0/go.mod h1:kT/uUJiuAOkZ4LzUWrUtk/I0iL1D8aatvD+59bDnHBo=
apigo.cc/go/file v1.3.0 h1:xG9FcY3Rv6Br83r9pq9QsIXFrplx4g8ITOkHSzfzXRg=
apigo.cc/go/file v1.3.0/go.mod h1:pYHBlB/XwsrnWpEh7GIFpbiqobrExfiB+rEN8V2d2kY=
apigo.cc/go/id v1.3.0 h1:Tr2Yj0Rl19lfwW5wBTJ407o/zgo2oVRLE20WWEgJzdE=
apigo.cc/go/id v1.3.0/go.mod h1:AFH3kMFwENfXNyijnAFWEhSF1o3y++UBPem1IUlrcxA=
apigo.cc/go/log v1.3.0 h1:61Z80WGN6SnhgxgoR8xuVYIieMdjlJKmf8JX1HXzp0Y=
apigo.cc/go/log v1.3.0/go.mod h1:dz4bSz9BnOgutkUJJZfX3uDDwsMpUxt7WF50mLK9hgE=
apigo.cc/go/rand v1.3.0 h1:k+UFAhMySwXf+dq8Om9TniZV6fm6gAE0evbrqMEdwQU=
apigo.cc/go/rand v1.3.0/go.mod h1:mZ/4Soa3bk+XvDaqPWJuUe1bfEi4eThBj1XmEAuYxsk=
apigo.cc/go/safe v1.3.0 h1:uctdAUsphT9p60Tk4oS5xPCe0NoIdOHfsYv4PNS0Rok=
apigo.cc/go/safe v1.3.0/go.mod h1:tC9X14V+qh0BqIrVg4UkXbl+2pEN+lj2ZNI8IjDB6Fs=
apigo.cc/go/shell v1.3.0 h1:hdxuYPN/7T2BuM/Ja8AjVUhbRqU/wpi8OjcJVziJ0nw=
apigo.cc/go/shell v1.3.0/go.mod h1:aNJiRWibxlA485yX3t+07IVAbrALKmxzv4oGEUC+hK4=
github.com/boombuler/barcode v1.1.0 h1:ChaYjBR63fr4LFyGn8E8nt7dBSt3MiU3zMOZqFvVkHo=
github.com/boombuler/barcode v1.1.0/go.mod h1:paBWMcWSl3LHKBqUq+rly7CNSldXjb2rDl3JlRe0mD8=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/disintegration/imaging v1.6.2 h1:w1LecBlG2Lnp8B3jk5zSuNqd7b4DXhcjwek1ei82L+c=
github.com/disintegration/imaging v1.6.2/go.mod h1:44/5580QXChDfwIclfc/PCwrr44amcmDAg8hxG0Ewe4=
github.com/flopp/go-findfont v0.1.0 h1:lPn0BymDUtJo+ZkV01VS3661HL6F4qFlkhcJN55u6mU=

100
heic.go

@ -1,100 +0,0 @@
package vision
import (
"fmt"
"os"
"os/exec"
"runtime"
)
// heicConverter holds the path of the tool used for HEIC/HEIF conversion
var heicConverter string
// DetectHEICConverter detects an available HEIC conversion tool on the system.
// Priority: sips (macOS) > heif-convert (libheif) > magick (ImageMagick)
func DetectHEICConverter() string {
if heicConverter != "" {
return heicConverter
}
// 1. Native macOS-only tool
if runtime.GOOS == "darwin" {
if p, err := exec.LookPath("sips"); err == nil {
heicConverter = p
return p
}
}
// 2. Cross-platform open-source tool heif-convert (libheif)
if p, err := exec.LookPath("heif-convert"); err == nil {
heicConverter = p
return p
}
// 3. Cross-platform general-purpose tool ImageMagick
if p, err := exec.LookPath("magick"); err == nil {
heicConverter = p
return p
}
// Print a warning that guides the user through installation
printHEICWarning()
return ""
}
func printHEICWarning() {
fmt.Fprintln(os.Stderr, "⚠️ Warning: No HEIC converter found in PATH.")
switch runtime.GOOS {
case "darwin":
fmt.Fprintln(os.Stderr, " Hint: macOS should have 'sips' pre-installed.")
case "linux":
fmt.Fprintln(os.Stderr, " Hint: Install libheif: 'sudo apt install libheif-examples'")
case "windows":
fmt.Fprintln(os.Stderr, " Hint: Install ImageMagick or libheif for Windows.")
}
}
// ConvertHEIC converts HEIC to a temporary PNG file using the detected tool
func ConvertHEIC(src, dst string) error {
cmdPath := DetectHEICConverter()
if cmdPath == "" {
return fmt.Errorf("no HEIC converter available")
}
var cmd *exec.Cmd
base := ""
if runtime.GOOS == "windows" {
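// Simple handling of paths on Windows
base = cmdPath
} else {
// Only the file name is needed to determine the tool type
// Kept simple here: use the path returned by DetectHEICConverter directly
base = cmdPath
}
// Build the command for the detected tool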
if contains(base, "sips") {
// sips -s format png input --out output
cmd = exec.Command(cmdPath, "-s", "format", "png", src, "--out", dst)
} else if contains(base, "heif-convert") {
// heif-convert input output
cmd = exec.Command(cmdPath, src, dst)
} else if contains(base, "magick") {
// magick input output.png
cmd = exec.Command(cmdPath, src, dst)
} else {
return fmt.Errorf("unsupported converter: %s", cmdPath)
}
return cmd.Run()
}
func contains(s, substr string) bool {
// Simple substring check
for i := 0; i <= len(s)-len(substr); i++ {
if s[i:i+len(substr)] == substr {
return true
}
}
return false
}
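`ConvertHEIC` above picks the command line by checking whether the full converter path contains `sips`, `heif-convert`, or `magick`, so a directory name containing one of those words could select the wrong branch. One possible alternative, sketched below, matches on the base name only; `converterKind` and `convertArgs` are hypothetical helpers, and the argument lists simply mirror the ones in the diff.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// converterKind identifies a HEIC converter from the base name of its path.
// It returns "" when the tool is not one of the three handled above.
func converterKind(path string) string {
	base := strings.ToLower(filepath.Base(path))
	base = strings.TrimSuffix(base, ".exe")
	switch base {
	case "sips", "heif-convert", "magick":
		return base
	}
	return ""
}

// convertArgs builds the argument list for the given tool, mirroring the
// command lines used in ConvertHEIC.
func convertArgs(kind, src, dst string) ([]string, error) {
	switch kind {
	case "sips":
		return []string{"-s", "format", "png", src, "--out", dst}, nil
	case "heif-convert", "magick":
		return []string{src, dst}, nil
	}
	return nil, fmt.Errorf("unsupported converter: %q", kind)
}

func main() {
	kind := converterKind("/usr/bin/sips")
	args, err := convertArgs(kind, "in.heic", "out.png")
	fmt.Println(kind, args, err)
}
```

Note that `filepath.Base` splits on the separator of the platform the program runs on, so Windows-style paths are only handled when running on Windows.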


@ -1,79 +0,0 @@
package vision
import (
"image"
"image/color"
"image/draw"
"github.com/fogleman/gg"
)
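// JigsawResult is the result of generating a slider puzzle
type JigsawResult struct {
Background *Canvas // background with the slot cut out
Piece *Canvas // puzzle piece
X, Y int // slot position
}
// GenerateJigsaw generates the assets for a slider puzzle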
func (c *Canvas) GenerateJigsaw(x, y, size int) *JigsawResult {
sw, sh := c.Width(), c.Height()
if x < 0 { x = 0 }
if y < 0 { y = 0 }
if x+size > sw { x = sw - size }
if y+size > sh { y = sh - size }
// 1. Create a copy of the background and draw the slot
bg := c.Clone()
piece := New(size, size)
// Define the puzzle path (a piece shape with four rounded bumps/notches)
drawPuzzlePath := func(dc *gg.Context, px, py, s float64) {
r := s / 4.0
dc.MoveTo(px, py)
dc.LineTo(px+s/2-r, py)
dc.QuadraticTo(px+s/2, py-r*1.5, px+s/2+r, py) // top bump
dc.LineTo(px+s, py)
dc.LineTo(px+s, py+s/2-r)
dc.QuadraticTo(px+s+r*1.5, py+s/2, px+s, py+s/2+r) // right bump
dc.LineTo(px+s, py+s)
dc.LineTo(px+s/2+r, py+s)
dc.QuadraticTo(px+s/2, py+s-r*1.5, px+s/2-r, py+s) // bottom notch (could be reversed)
dc.LineTo(px, py+s)
dc.LineTo(px, py+s/2+r)
dc.QuadraticTo(px-r*1.5, py+s/2, px, py+s/2-r) // left notch
dc.ClosePath()
}
// 2. Extract the puzzle piece content
// We need a mask to clip with
mask := gg.NewContext(sw, sh)
drawPuzzlePath(mask, float64(x), float64(y), float64(size))
mask.SetFillRule(gg.FillRuleWinding)
mask.Fill()
// Clip the piece
draw.DrawMask(piece.dc.Image().(draw.Image), image.Rect(0, 0, size, size), c.dc.Image(), image.Pt(x, y), mask.Image(), image.Pt(x, y), draw.Src)
// 3. Draw a semi-transparent slot (overlay) on the background
bg.dc.Push()
drawPuzzlePath(bg.dc, float64(x), float64(y), float64(size))
bg.dc.SetColor(color.RGBA{0, 0, 0, 160})
bg.dc.Fill()
bg.dc.Pop()
// 4. Add a light outline or shadow to the piece to make it easier to recognize
piece.dc.Push()
drawPuzzlePath(piece.dc, 0, 0, float64(size))
piece.dc.SetColor(color.RGBA{255, 255, 255, 128})
piece.dc.SetLineWidth(2)
piece.dc.Stroke()
piece.dc.Pop()
return &JigsawResult{
Background: bg,
Piece: piece,
X: x,
Y: y,
}
}
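`GenerateJigsaw` above clamps the slot position so the piece stays inside the image, but when `size` exceeds the image width or height the clamped coordinate becomes negative. The sketch below isolates the clamping step and reports that case explicitly; `clampSlot` and `clamp` are hypothetical helpers for illustration only.

```go
package main

import "fmt"

// clamp limits v to the inclusive range [lo, hi].
func clamp(v, lo, hi int) int {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// clampSlot keeps a size x size slot inside a w x h image.
// ok is false when the slot cannot fit at all.
func clampSlot(x, y, size, w, h int) (cx, cy int, ok bool) {
	if size <= 0 || size > w || size > h {
		return 0, 0, false
	}
	return clamp(x, 0, w-size), clamp(y, 0, h-size), true
}

func main() {
	fmt.Println(clampSlot(-5, 10, 40, 200, 100))
	fmt.Println(clampSlot(190, 90, 40, 200, 100))
	fmt.Println(clampSlot(0, 0, 300, 200, 100))
}
```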


@ -1,152 +0,0 @@
package vision
import (
"context"
"apigo.cc/go/file"
"apigo.cc/go/jsmod"
)
func init() {
jsmod.Register("vision", map[string]any{
// Basic operations
"load": func(ctx context.Context, path string) (*jsCanvas, error) {
p, err := file.VerifyPathForSafeMode(ctx, path)
if err != nil {
return nil, err
}
c, err := Load(p)
if err != nil {
return nil, err
}
return &jsCanvas{ctx: ctx, c: c}, nil
},
"new": func(ctx context.Context, w, h int, bg ...string) *jsCanvas {
return &jsCanvas{ctx: ctx, c: New(w, h, bg...)}
},
"save": func(ctx context.Context, j *jsCanvas, path string, quality ...int) error {
p, err := file.VerifyPathForSafeMode(ctx, path)
if err != nil {
return err
}
return Save(j.c, p, quality...)
},
// Generators
"generateQRCode": func(ctx context.Context, content string, size int) (*jsCanvas, error) {
c, err := GenerateQRCode(content, size)
if err != nil {
return nil, err
}
return &jsCanvas{ctx: ctx, c: c}, nil
},
"generateBarcode": func(ctx context.Context, content string, w, h int) (*jsCanvas, error) {
c, err := GenerateBarcode(content, w, h)
if err != nil {
return nil, err
}
return &jsCanvas{ctx: ctx, c: c}, nil
},
"generateCaptcha": func(ctx context.Context, opt *CaptchaOption) *jsCanvas {
return &jsCanvas{ctx: ctx, c: GenerateCaptcha(opt)}
},
// Previews (write files directly)
"generateImagePreview": func(ctx context.Context, src, out string, w, h int) error {
pSrc, err := file.VerifyPathForSafeMode(ctx, src)
if err != nil {
return err
}
pOut, err := file.VerifyPathForSafeMode(ctx, out)
if err != nil {
return err
}
return GenerateImagePreview(pSrc, pOut, w, h)
},
"generateVideoPreview": func(ctx context.Context, src, out string, w, h int, interval ...int) error {
pSrc, err := file.VerifyPathForSafeMode(ctx, src)
if err != nil {
return err
}
pOut, err := file.VerifyPathForSafeMode(ctx, out)
if err != nil {
return err
}
return GenerateVideoPreview(pSrc, pOut, w, h, interval...)
},
// Conversion
"convert": func(ctx context.Context, src, dst string, quality ...int) error {
pSrc, err := file.VerifyPathForSafeMode(ctx, src)
if err != nil {
return err
}
pDst, err := file.VerifyPathForSafeMode(ctx, dst)
if err != nil {
return err
}
return Convert(pSrc, pDst, quality...)
},
"optimize": func(ctx context.Context, path string, maxWidth int, quality int) error {
p, err := file.VerifyPathForSafeMode(ctx, path)
if err != nil {
return err
}
return Optimize(p, maxWidth, quality)
},
// Helper utilities
"loadFonts": LoadFonts,
"recognize": func(ctx context.Context, path string) (string, error) {
p, err := file.VerifyPathForSafeMode(ctx, path)
if err != nil {
return "", err
}
return Recognize(p)
},
})
}
// jsCanvas is a wrapper that supports method chaining
type jsCanvas struct {
ctx context.Context
c *Canvas
}
func (j *jsCanvas) Width() int { return j.c.Width() }
func (j *jsCanvas) Height() int { return j.c.Height() }
func (j *jsCanvas) Save(path string, quality ...int) error {
p, err := file.VerifyPathForSafeMode(j.ctx, path)
if err != nil {
return err
}
return Save(j.c, p, quality...)
}
// Effects and drawing (return the receiver to support chaining)
func (j *jsCanvas) Resize(w, h int) *jsCanvas { j.c.Resize(w, h); return j }
func (j *jsCanvas) Blur(sigma float64) *jsCanvas { j.c.Blur(sigma); return j }
func (j *jsCanvas) Grayscale() *jsCanvas { j.c.Grayscale(); return j }
func (j *jsCanvas) Invert() *jsCanvas { j.c.Invert(); return j }
func (j *jsCanvas) Rotate(angle float64) *jsCanvas { j.c.Rotate(angle); return j }
func (j *jsCanvas) FlipH() *jsCanvas { j.c.FlipH(); return j }
func (j *jsCanvas) FlipV() *jsCanvas { j.c.FlipV(); return j }
func (j *jsCanvas) Sharpen(sigma float64) *jsCanvas { j.c.Sharpen(sigma); return j }
func (j *jsCanvas) AdjustBrightness(p float64) *jsCanvas { j.c.AdjustBrightness(p); return j }
func (j *jsCanvas) Rect(x, y, w, h float64, opt *DrawStyle) *jsCanvas { j.c.Rect(x, y, w, h, opt); return j }
func (j *jsCanvas) Circle(x, y, r float64, opt *DrawStyle) *jsCanvas { j.c.Circle(x, y, r, opt); return j }
func (j *jsCanvas) Line(x1, y1, x2, y2 float64, opt *DrawStyle) *jsCanvas {
j.c.Line(x1, y1, x2, y2, opt)
return j
}
func (j *jsCanvas) DrawText(x, y float64, text string, opt *TextOption) *jsCanvas {
j.c.DrawText(x, y, text, opt)
return j
}
// Recognition
func (j *jsCanvas) DecodeAll() (string, error) { return j.c.DecodeAll() }
func (j *jsCanvas) Recognize() (string, error) { return j.c.Recognize() }
func (j *jsCanvas) DecodeQRCode() (string, error) { return j.c.DecodeQRCode() }

View File

@ -1,163 +0,0 @@
package vision
import (
"fmt"
"os"
"os/exec"
"path/filepath"
"strings"
)
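// GenerateImagePreview generates an image preview.
// It scales and crops the image to fill the requested size (Fill mode).
// The output format is chosen from the outPath extension (.webp, .jpg, .png, etc.).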
func GenerateImagePreview(srcPath, outPath string, width, height int) error {
// Load through the unified Load(), which already handles HEIC/sips/FFmpeg compatibility for complex formats
c, err := Load(srcPath)
if err != nil {
return err
}
c.Fill(width, height)
ext := strings.ToLower(filepath.Ext(outPath))
if ext == ".webp" {
// Use FFmpeg to convert the rendered canvas to high-quality WebP (better compression than the standard library or third-party libraries)
tmpFile := filepath.Join(os.TempDir(), fmt.Sprintf("preview_%d.png", os.Getpid()))
defer os.Remove(tmpFile)
if err := Save(c, tmpFile); err != nil {
return err
}
v, err := NewVideo()
if err == nil {
cmd := exec.Command(v.FFmpegPath, "-i", tmpFile, "-c:v", "libwebp", "-quality", "80", "-y", outPath)
if err := cmd.Run(); err == nil {
return nil
}
}
}
return Save(c, outPath)
}
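The temporary file in GenerateImagePreview is named from the process ID alone, so two concurrent calls inside one process would appear to share the same path. A minimal sketch of one alternative, assuming the standard library's os.CreateTemp is acceptable here; newTempPNG is a hypothetical helper, not part of the package:

```go
package main

import (
	"fmt"
	"os"
)

// newTempPNG (hypothetical helper) reserves a uniquely named .png file in the
// default temp directory and returns its path plus a cleanup function.
func newTempPNG(prefix string) (path string, cleanup func(), err error) {
	// The "*" in the pattern is replaced with a random string by os.CreateTemp.
	f, err := os.CreateTemp("", prefix+"_*.png")
	if err != nil {
		return "", nil, err
	}
	name := f.Name()
	if err := f.Close(); err != nil {
		os.Remove(name)
		return "", nil, err
	}
	return name, func() { os.Remove(name) }, nil
}

func main() {
	a, cleanA, err := newTempPNG("preview")
	if err != nil {
		panic(err)
	}
	defer cleanA()
	b, cleanB, err := newTempPNG("preview")
	if err != nil {
		panic(err)
	}
	defer cleanB()
	fmt.Println(a != b) // the two reserved paths differ
}
```

The returned path can then be handed to Save and to FFmpeg's -i argument in place of the PID-based name.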
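// GenerateVideoPreview generates a video preview.
// The output format is chosen from the outPath extension:
// - .webp | .gif: an animation (one frame sampled every 30 seconds by default, adjustable via frameInterval)
// - .jpg | .jpeg | .png: a single preview image (the frame at the middle of the video)
// - anything else: outPath is treated as a directory that receives several still .jpg images
// frameInterval: sampling interval in seconds, default 30.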
func GenerateVideoPreview(videoPath, outPath string, width, height int, frameInterval ...int) error {
v, err := NewVideo()
if err != nil {
return err
}
duration, err := getVideoDuration(videoPath)
if err != nil {
return err
}
ext := strings.ToLower(filepath.Ext(outPath))
vf := fmt.Sprintf("scale=%d:%d:force_original_aspect_ratio=increase,crop=%d:%d", width, height, width, height)
// 1. Single-image mode
if ext == ".jpg" || ext == ".jpeg" || ext == ".png" {
t := duration * 0.5
cmd := exec.Command(v.FFmpegPath, "-ss", fmt.Sprintf("%f", t), "-i", videoPath, "-frames:v", "1", "-vf", vf, "-y", outPath)
return cmd.Run()
}
// 2. Animation and multi-image modes need several frames
interval := 30
if len(frameInterval) > 0 && frameInterval[0] > 0 {
interval = frameInterval[0]
}
// Compute the frame count dynamically so that excess frames do not waste tokens: one frame per interval seconds, at least 3 and at most 8 frames
frameCount := int(duration / float64(interval))
if frameCount < 3 {
frameCount = 3
} else if frameCount > 8 {
frameCount = 8
}
times := make([]float64, frameCount)
for i := 0; i < frameCount; i++ {
times[i] = duration * (0.10 + 0.80*(float64(i)/float64(frameCount-1)))
}
// 2a. Animation mode (.webp, .gif)
if ext == ".webp" || ext == ".gif" {
tmpDir, _ := os.MkdirTemp("", "frames")
defer os.RemoveAll(tmpDir)
for i, t := range times {
framePath := filepath.Join(tmpDir, fmt.Sprintf("frame_%02d.png", i))
cmd := exec.Command(v.FFmpegPath, "-ss", fmt.Sprintf("%f", t), "-i", videoPath, "-frames:v", "1", "-vf", vf, "-y", framePath)
if err := cmd.Run(); err != nil {
return err
}
}
var cmd *exec.Cmd
if ext == ".webp" {
cmd = exec.Command(v.FFmpegPath, "-framerate", "1", "-i", filepath.Join(tmpDir, "frame_%02d.png"),
"-c:v", "libwebp", "-lossless", "0", "-quality", "70", "-loop", "0", "-y", outPath)
} else {
cmd = exec.Command(v.FFmpegPath, "-framerate", "1", "-i", filepath.Join(tmpDir, "frame_%02d.png"), "-y", outPath)
}
return cmd.Run()
}
// 2b. Directory multi-image mode
if err := os.MkdirAll(outPath, 0755); err != nil {
return err
}
for i, t := range times {
framePath := filepath.Join(outPath, fmt.Sprintf("%d.jpg", i+1))
cmd := exec.Command(v.FFmpegPath, "-ss", fmt.Sprintf("%f", t), "-i", videoPath, "-frames:v", "1", "-vf", vf, "-q:v", "2", "-y", framePath)
if err := cmd.Run(); err != nil {
return err
}
}
return nil
}
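The frame sampling in GenerateVideoPreview can be isolated as a pure function, which makes the clamping easier to check. The sketch below mirrors the logic shown above; the name sampleTimes is an assumption and does not exist in the package:

```go
package main

import "fmt"

// sampleTimes (hypothetical name) mirrors the sampling in GenerateVideoPreview:
// one frame per interval seconds, clamped to between 3 and 8 frames, spread
// evenly between 10% and 90% of the duration.
func sampleTimes(duration float64, interval int) []float64 {
	if interval <= 0 {
		interval = 30 // same default as the original
	}
	n := int(duration / float64(interval))
	if n < 3 {
		n = 3
	} else if n > 8 {
		n = 8
	}
	times := make([]float64, n)
	for i := range times {
		// n is always >= 3 here, so n-1 is never zero.
		times[i] = duration * (0.10 + 0.80*float64(i)/float64(n-1))
	}
	return times
}

func main() {
	for _, t := range sampleTimes(100, 30) {
		fmt.Printf("%.1f\n", t)
	}
}
```

For a 100-second clip with the default interval this yields three timestamps, at roughly 10%, 50% and 90% of the clip. Skipping the first and last 10% presumably avoids black intro and outro frames.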
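// GenerateAudioPreview extracts audio for preview or speech-to-text transcription.
// The output format is chosen from the outPath extension:
// - .ogg: libopus (16 kHz, mono, 12 kbps); heavily compressed while preserving voice characteristics, suited to transcription
// - .wav: standard PCM (16 kHz, mono); lossless but larger, required by some transcription engines
// - anything else: encoded with libopus as Ogg by default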
func GenerateAudioPreview(mediaPath, outPath string) error {
v, err := NewVideo()
if err != nil {
return err
}
ext := strings.ToLower(filepath.Ext(outPath))
// Common arguments: no video, 16 kHz sample rate (the STT standard), mono
args := []string{"-i", mediaPath, "-vn", "-ar", "16000", "-ac", "1"}
if ext == ".wav" {
// WAV format: keep PCM, capped at 180 seconds to avoid LLM OOM
args = append(args, "-t", "180", "-y", outPath)
} else {
// Default or .ogg: heavy libopus compression, capped at 180 seconds
args = append(args, "-c:a", "libopus", "-b:a", "12k", "-t", "180", "-y", outPath)
}
cmd := exec.Command(v.FFmpegPath, args...)
return cmd.Run()
}
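The argument construction in GenerateAudioPreview can be pulled out so it is testable without running FFmpeg. This is a sketch under that assumption; audioArgs is a hypothetical name, and the flags are exactly the ones used above:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// audioArgs (hypothetical helper) builds the FFmpeg argument list used by
// GenerateAudioPreview: video disabled, 16 kHz, mono, capped at 180 seconds.
// Anything other than .wav is encoded with libopus at 12 kbps.
func audioArgs(mediaPath, outPath string) []string {
	args := []string{"-i", mediaPath, "-vn", "-ar", "16000", "-ac", "1"}
	if strings.ToLower(filepath.Ext(outPath)) != ".wav" {
		args = append(args, "-c:a", "libopus", "-b:a", "12k")
	}
	return append(args, "-t", "180", "-y", outPath)
}

func main() {
	fmt.Println(audioArgs("in.mp4", "out.ogg"))
	fmt.Println(audioArgs("in.mp4", "out.WAV"))
}
```

Keeping the shared "-t 180 -y outPath" tail in one place avoids repeating the duration cap in both branches.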
func getVideoDuration(videoPath string) (float64, error) {
out, err := exec.Command("ffprobe", "-v", "error", "-show_entries", "format=duration", "-of", "default=noprint_wrappers=1:nokey=1", videoPath).Output()
if err != nil {
return 0, err
}
var duration float64
_, err = fmt.Sscanf(strings.TrimSpace(string(out)), "%f", &duration)
return duration, err
}
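fmt.Sscanf with %f stops at the first character that cannot belong to a number, so trailing text after the number would go unnoticed. A stricter parse of the ffprobe output might look like the sketch below; parseDuration is a hypothetical helper, and the rejection of non-positive or non-finite values is an added assumption rather than behavior of the original:

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseDuration (hypothetical helper) converts ffprobe's duration output into
// seconds, rejecting anything that is not a finite positive number.
func parseDuration(out string) (float64, error) {
	s := strings.TrimSpace(out)
	d, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, fmt.Errorf("unexpected ffprobe duration %q: %w", s, err)
	}
	// !(d > 0) also catches NaN, which compares false against everything.
	if !(d > 0) || math.IsInf(d, 0) {
		return 0, fmt.Errorf("unusable duration %q", s)
	}
	return d, nil
}

func main() {
	d, err := parseDuration("12.500000\n")
	fmt.Println(d, err)
	_, err = parseDuration("N/A")
	fmt.Println(err != nil)
}
```

A zero or missing duration matters here because GenerateVideoPreview multiplies the duration into every seek timestamp.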

View File

@ -1,123 +0,0 @@
package vision
import (
"fmt"
"image/color"
"os"
"os/exec"
"path/filepath"
"testing"
"github.com/fogleman/gg"
)
func TestPreviewer(t *testing.T) {
// 1. Create the test environment
tmpDir, err := os.MkdirTemp("", "vision_test")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(tmpDir)
videoPath := filepath.Join(tmpDir, "test.mp4")
webPPath := filepath.Join(tmpDir, "preview.webp")
oggPath := filepath.Join(tmpDir, "preview.ogg")
// 2. Generate mock assets (5 solid-color frame images used to compose a video)
imgPattern := filepath.Join(tmpDir, "frame_%d.png")
for i := 0; i < 5; i++ {
dc := gg.NewContext(320, 240)
dc.SetColor(color.RGBA{uint8(i * 50), 100, 200, 255})
dc.Clear()
dc.SavePNG(fmt.Sprintf(imgPattern, i))
}
// Generate the video with the existing vision.Video logic
v, err := NewVideo()
if err != nil {
t.Fatal(err)
}
// Here ffmpeg stitches the images together and a silent audio track is added to produce the video
err = v.CreateVideoFromImages(filepath.Join(tmpDir, "frame_%d.png"), 1, videoPath)
if err != nil {
t.Skip("FFmpeg video generation failed, skipping integration test")
}
// Add a silent audio track to the test video; otherwise audio extraction would fail
audioPath := filepath.Join(tmpDir, "silent.aac")
err = exec.Command("ffmpeg", "-f", "lavfi", "-i", "anullsrc=r=44100:cl=mono", "-t", "5", "-c:a", "aac", audioPath).Run()
if err == nil {
finalVideo := filepath.Join(tmpDir, "final.mp4")
exec.Command("ffmpeg", "-i", videoPath, "-i", audioPath, "-c", "copy", finalVideo).Run()
videoPath = finalVideo
}
// 3. Test the preview functions
t.Run("GenerateImagePreview", func(t *testing.T) {
imgPath := filepath.Join(tmpDir, "frame_0.png")
previewImgPath := filepath.Join(tmpDir, "img_preview.webp")
err := GenerateImagePreview(imgPath, previewImgPath, 100, 100)
if err != nil {
t.Errorf("GenerateImagePreview failed: %v", err)
}
if _, err := os.Stat(previewImgPath); os.IsNotExist(err) {
t.Error("Image preview output not created")
}
})
t.Run("GenerateVideoPreview", func(t *testing.T) {
err := GenerateVideoPreview(videoPath, webPPath, 160, 120)
if err != nil {
t.Errorf("GenerateVideoPreview failed: %v", err)
}
if _, err := os.Stat(webPPath); os.IsNotExist(err) {
t.Error("WebP output not created")
}
})
t.Run("GenerateVideoPreview_SingleImage", func(t *testing.T) {
jpgPath := filepath.Join(tmpDir, "preview.jpg")
err := GenerateVideoPreview(videoPath, jpgPath, 160, 120)
if err != nil {
t.Errorf("GenerateVideoPreview (jpg) failed: %v", err)
}
if _, err := os.Stat(jpgPath); os.IsNotExist(err) {
t.Error("JPG output not created")
}
})
t.Run("GenerateVideoPreview_Directory", func(t *testing.T) {
dirPath := filepath.Join(tmpDir, "frames_dir")
err := GenerateVideoPreview(videoPath, dirPath, 160, 120)
if err != nil {
t.Errorf("GenerateVideoPreview (dir) failed: %v", err)
}
files, err := os.ReadDir(dirPath)
if err != nil {
t.Fatalf("ReadDir failed: %v", err)
}
if len(files) == 0 {
t.Error("No frames generated in directory")
}
})
t.Run("GenerateAudioPreview", func(t *testing.T) {
err := GenerateAudioPreview(videoPath, oggPath)
if err != nil {
t.Errorf("GenerateAudioPreview (ogg) failed: %v", err)
}
if _, err := os.Stat(oggPath); os.IsNotExist(err) {
t.Error("Ogg output not created")
}
wavPath := filepath.Join(tmpDir, "preview.wav")
err = GenerateAudioPreview(videoPath, wavPath)
if err != nil {
t.Errorf("GenerateAudioPreview (wav) failed: %v", err)
}
if _, err := os.Stat(wavPath); os.IsNotExist(err) {
t.Error("Wav output not created")
}
})
}

View File

@ -3,9 +3,9 @@ package vision
import (
"fmt"
"image/color"
"strconv"
"strings"
"apigo.cc/go/cast"
"apigo.cc/go/rand"
)
@ -38,7 +38,8 @@ func ParseColor(hex string) color.Color {
}
func parseHex(s string) uint8 {
return cast.To[uint8]("0x" + s)
val, _ := strconv.ParseUint(s, 16, 8)
return uint8(val)
}
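Both variants of parseHex in this hunk discard the parse error, so an invalid component silently becomes 0. A sketch that surfaces the error instead, assuming callers are able to handle it; parseHexByte is a hypothetical name:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseHexByte (hypothetical helper) parses a hex color component such as
// "ff" into a byte and reports malformed or out-of-range input.
func parseHexByte(s string) (uint8, error) {
	// bitSize 8 makes ParseUint reject values above 0xff.
	v, err := strconv.ParseUint(s, 16, 8)
	if err != nil {
		return 0, fmt.Errorf("invalid hex component %q: %w", s, err)
	}
	return uint8(v), nil
}

func main() {
	v, err := parseHexByte("ff")
	fmt.Println(v, err)
	_, err = parseHexByte("zz")
	fmt.Println(err != nil)
}
```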
// RandColor generates a random color as a hex string

View File

@ -24,20 +24,6 @@ func NewVideo() (*Video, error) {
return &Video{FFmpegPath: p}, nil
}
// Capture is a convenience wrapper that extracts the frame at a given time from a video.
// If offsetSeconds is not provided, it extracts the frame at second 0.
func Capture(videoPath string, offsetSeconds ...float64) (*Canvas, error) {
v, err := NewVideo()
if err != nil {
return nil, err
}
offset := 0.0
if len(offsetSeconds) > 0 {
offset = offsetSeconds[0]
}
return v.ExtractFrame(videoPath, offset)
}
// ExtractFrame extracts the frame at the given time from a video
func (v *Video) ExtractFrame(videoPath string, offsetSeconds float64) (*Canvas, error) {
tmpFile := filepath.Join(os.TempDir(), fmt.Sprintf("frame_%d.png", os.Getpid()))
@ -51,21 +37,6 @@ func (v *Video) ExtractFrame(videoPath string, offsetSeconds float64) (*Canvas,
return Load(tmpFile)
}
// WatermarkVideo adds a watermark to a video
// videoPath: input video
// markPath: path to the watermark image
// outPath: output video path
// pos: watermark position (FFmpeg syntax, e.g. '10:10', 'main_w-overlay_w-10:10')
func (v *Video) WatermarkVideo(videoPath, markPath, outPath, pos string) error {
if pos == "" {
pos = "main_w-overlay_w-10:main_h-overlay_h-10" // default: bottom-right corner
}
filter := fmt.Sprintf("overlay=%s", pos)
cmd := exec.Command(v.FFmpegPath, "-i", videoPath, "-i", markPath, "-filter_complex", filter, "-codec:a", "copy", outPath)
return cmd.Run()
}
// CreateVideoFromImages creates a video from a sequence of images
func (v *Video) CreateVideoFromImages(imagePattern string, frameRate int, outPath string) error {
cmd := exec.Command(v.FFmpegPath, "-framerate", fmt.Sprintf("%d", frameRate), "-i", imagePattern, "-c:v", "libx264", "-pix_fmt", "yuv420p", outPath)

View File

@ -1,8 +1,6 @@
package vision
import (
"image"
"image/color"
"os"
"testing"
)
@ -65,36 +63,6 @@ func TestPHash(t *testing.T) {
t.Logf("pHash distance: %d", dist)
}
func TestTileWatermark(t *testing.T) {
c := New(400, 300, "#FFFFFF")
c.Circle(200, 150, 50, &DrawStyle{FillColor: "#0000FF"})
c.TileTextWatermark("CONFIDENTIAL", &DrawStyle{FillColor: "#FF0000"}, 0.2, 50, -0.785) // -0.785 rad, about -45 degrees
err := Save(c, "test_tile_watermark.png")
if err != nil {
t.Fatalf("save tile watermark test failed: %v", err)
}
defer os.Remove("test_tile_watermark.png")
}
func TestAnimationWatermark(t *testing.T) {
anim := NewAnimation()
for i := 0; i < 5; i++ {
c := New(100, 100, "#FFFFFF")
c.Circle(float64(i*20), 50, 10, &DrawStyle{FillColor: "#00FF00"})
anim.AddFrame(c, 10)
}
anim.TextWatermark("GO", Center, &DrawStyle{FillColor: "#000000"}, 0.5, 0)
err := anim.SaveGIF("test_anim_watermark.gif", 0)
if err != nil {
t.Fatalf("save anim watermark failed: %v", err)
}
defer os.Remove("test_anim_watermark.gif")
}
func TestQRCode(t *testing.T) {
content := "https://apigo.cc"
c, err := GenerateQRCode(content, 200)
@ -111,42 +79,6 @@ func TestQRCode(t *testing.T) {
}
}
func TestWatermark(t *testing.T) {
c := New(400, 300, "#EEEEEE")
c.Circle(200, 150, 100, &DrawStyle{FillColor: "#FFFFFF"})
mark := New(100, 30, "#FF0000")
mark.dc.SetColor(color.White)
mark.dc.DrawString("WATERMARK", 5, 20)
c.Watermark(mark, BottomRight, 0.5, 10)
err := Save(c, "test_watermark.png")
if err != nil {
t.Fatalf("save watermark test failed: %v", err)
}
defer os.Remove("test_watermark.png")
}
func TestJigsaw(t *testing.T) {
c := New(400, 300, "#FFFFFF")
c.RandBG(5)
res := c.GenerateJigsaw(150, 100, 60)
err := Save(res.Background, "test_jigsaw_bg.png")
if err != nil {
t.Errorf("save jigsaw bg failed: %v", err)
}
err = Save(res.Piece, "test_jigsaw_piece.png")
if err != nil {
t.Errorf("save jigsaw piece failed: %v", err)
}
defer os.Remove("test_jigsaw_bg.png")
defer os.Remove("test_jigsaw_piece.png")
}
func TestBarcode(t *testing.T) {
content := "12345678"
c, err := GenerateBarcode(content, 200, 50)
@ -162,34 +94,3 @@ func TestBarcode(t *testing.T) {
t.Errorf("expected %s, got %s", content, decoded)
}
}
func BenchmarkWarpPerspective(b *testing.B) {
c := New(1000, 1000, "#FFFFFF")
c.Circle(500, 500, 300, &DrawStyle{FillColor: "#FF0000"})
srcPoints := [4]image.Point{
{100, 100}, {900, 150}, {850, 850}, {150, 800},
}
b.ResetTimer()
for i := 0; i < b.N; i++ {
c.WarpPerspective(srcPoints, 500, 500)
}
}
func BenchmarkPHash(b *testing.B) {
c := New(500, 500, "#FFFFFF")
c.Circle(250, 250, 100, &DrawStyle{FillColor: "#000000"})
img := c.Image()
b.ResetTimer()
for i := 0; i < b.N; i++ {
PHash(img)
}
}
func BenchmarkExtractPalette(b *testing.B) {
c := New(500, 500, "#FFFFFF")
c.RandBG(5)
b.ResetTimer()
for i := 0; i < b.N; i++ {
c.ExtractPalette(10)
}
}

View File

@ -1,139 +0,0 @@
package vision
import (
"image"
"image/color"
"image/draw"
)
// Position is the watermark position
type Position int
const (
TopLeft Position = iota
TopCenter
TopRight
LeftCenter
Center
RightCenter
BottomLeft
BottomCenter
BottomRight
)
// Watermark adds an image watermark to the canvas
// mark: watermark canvas
// pos: position
// opacity: opacity (0.0 - 1.0)
// padding: padding from the edges
// angle: rotation angle (radians)
func (c *Canvas) Watermark(mark *Canvas, pos Position, opacity float64, padding int, angle ...float64) {
if mark == nil || opacity < 0.01 {
return
}
w, h := c.Width(), c.Height()
mw, mh := mark.Width(), mark.Height()
var x, y int
switch pos {
case TopLeft:
x, y = padding, padding
case TopCenter:
x, y = (w-mw)/2, padding
case TopRight:
x, y = w-mw-padding, padding
case LeftCenter:
x, y = padding, (h-mh)/2
case Center:
x, y = (w-mw)/2, (h-mh)/2
case RightCenter:
x, y = w-mw-padding, (h-mh)/2
case BottomLeft:
x, y = padding, h-mh-padding
case BottomCenter:
x, y = (w-mw)/2, h-mh-padding
case BottomRight:
x, y = w-mw-padding, h-mh-padding
}
c.dc.Push()
if len(angle) > 0 && angle[0] != 0 {
c.dc.RotateAbout(angle[0], float64(x+mw/2), float64(y+mh/2))
}
// Handle opacity
if opacity >= 0.99 {
c.dc.DrawImage(mark.dc.Image(), x, y)
} else {
mask := image.NewUniform(color.Alpha{uint8(255 * opacity)})
draw.DrawMask(c.dc.Image().(draw.Image), mark.dc.Image().Bounds().Add(image.Pt(x, y)), mark.dc.Image(), image.Point{}, mask, image.Point{}, draw.Over)
}
c.dc.Pop()
}
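Because the nine Position constants are declared in row-major order with iota, the placement switch in Watermark can also be expressed through a column and a row index. The sketch below is an alternative formulation, not the package's code; anchor is a hypothetical name and the constants are redeclared only to keep the example self-contained:

```go
package main

import "fmt"

// Position repeats the nine-grid constants from the watermark file.
type Position int

const (
	TopLeft Position = iota
	TopCenter
	TopRight
	LeftCenter
	Center
	RightCenter
	BottomLeft
	BottomCenter
	BottomRight
)

// place returns the offset along one axis: 0 = start, 1 = middle, 2 = end.
func place(idx, size, markSize, padding int) int {
	switch idx {
	case 0:
		return padding
	case 1:
		return (size - markSize) / 2
	default:
		return size - markSize - padding
	}
}

// anchor (hypothetical helper) computes the top-left corner of a mw x mh mark
// on a w x h canvas, relying on the row-major order of the constants.
func anchor(pos Position, w, h, mw, mh, padding int) (x, y int) {
	col, row := int(pos)%3, int(pos)/3
	return place(col, w, mw, padding), place(row, h, mh, padding)
}

func main() {
	x, y := anchor(BottomRight, 400, 300, 100, 30, 10)
	fmt.Println(x, y)
}
```

This form depends on the declaration order staying fixed, which the explicit switch in the original does not.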
// TileWatermark tiles an image watermark across the canvas
// spacing: gap between tiles
// angle: rotation angle (radians)
func (c *Canvas) TileWatermark(mark *Canvas, opacity float64, spacing int, angle float64) {
if mark == nil || opacity < 0.01 {
return
}
mw, mh := mark.Width(), mark.Height()
w, h := c.Width(), c.Height()
stepX := mw + spacing
stepY := mh + spacing
for y := -mh; y < h+mh; y += stepY {
for x := -mw; x < w+mw; x += stepX {
c.dc.Push()
c.dc.RotateAbout(angle, float64(x+mw/2), float64(y+mh/2))
if opacity >= 0.99 {
c.dc.DrawImage(mark.dc.Image(), x, y)
} else {
mask := image.NewUniform(color.Alpha{uint8(255 * opacity)})
draw.DrawMask(c.dc.Image().(draw.Image), mark.dc.Image().Bounds().Add(image.Pt(x, y)), mark.dc.Image(), image.Point{}, mask, image.Point{}, draw.Over)
}
c.dc.Pop()
}
}
}
// TextWatermark adds a text watermark to the canvas
func (c *Canvas) TextWatermark(text string, pos Position, style *DrawStyle, opacity float64, padding int, angle ...float64) {
// Render the text on a temporary canvas, then call Watermark
c.dc.Push()
tw, th := c.dc.MeasureString(text)
temp := New(int(tw)+4, int(th)+4)
if style != nil && style.FillColor != "" {
temp.dc.SetColor(ParseColor(style.FillColor))
} else {
temp.dc.SetColor(color.Black)
}
temp.dc.DrawString(text, 2, th+2)
ang := 0.0
if len(angle) > 0 {
ang = angle[0]
}
c.Watermark(temp, pos, opacity, padding, ang)
c.dc.Pop()
}
// TileTextWatermark tiles a text watermark across the canvas
func (c *Canvas) TileTextWatermark(text string, style *DrawStyle, opacity float64, spacing int, angle float64) {
c.dc.Push()
tw, th := c.dc.MeasureString(text)
temp := New(int(tw)+4, int(th)+4)
if style != nil && style.FillColor != "" {
temp.dc.SetColor(ParseColor(style.FillColor))
} else {
temp.dc.SetColor(color.Black)
}
temp.dc.DrawString(text, 2, th+2)
c.TileWatermark(temp, opacity, spacing, angle)
c.dc.Pop()
}