simd tag archive | 土法炼钢兴趣小组 algorithm knowledge backup

[Vector Search Engines] Knowhere: Vector Index Execution Engine and Plugin Contract

2026-07-12 | database · storage | #knowhere #milvus #faiss #hnsw #ivf #bitset #simd #vector-index #vector-engine

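Following the official Knowhere documentation, this post covers Knowhere's place in Milvus, its extensions over Faiss (bitset, SIMD selection, binary metrics), the VecIndex class hierarchy, and index types such as IDMAP, IVF, and HNSW. Three diagrams (plugin registration, CPU/GPU dispatch, and the bitset entering the query path) pin down the engineering contract. Algorithmic details are left to db-frontier/08.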

[Columnar Engine Internals] Vectorized Execution Engine

2026-06-18 | database · storage | #clickhouse #vectorized-execution #block #processor #pipeline #simd #volcano-model #24-lts

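ClickHouse Block column-vector batches, the IProcessor pipeline, and vectorized implementations of filter/project/aggregate, contrasted with ExecProcNode in PostgreSQL's Volcano model. Source entry points: src/Processors and src/Columns. Covers 24.x LTS.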

[Distributed OLAP Query Engines] Vectorized Batch Processing and Morsel-Driven Parallelism

2026-07-07 | database · distributed | #vectorized-execution #morsel-driven #duckdb #trino #page #block #simd #selection-vector #batch

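Breaks down column-vector batches, SelectionVector, and flat/dictionary encodings. Building on the ClickHouse Block intuition from columnar-engine/04, it explains how DuckDB's morsel-driven model and Trino's Page streams play out on MPP, with measurements from a local DuckDB 1.5.4 install.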

Algorithm Engineering Index

2026-04-22 | algorithms | #algorithms #sorting #hashing #simd #compiler #data-structures

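A roundup of this site's algorithm engineering articles, covering sorting, hashing, trees, strings, approximate data structures, SIMD, randomization, and compiler-related algorithms.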

String Matching Algorithm Selection Index

2026-06-12 | algorithms | #string-matching #kmp #boyer-moore #rabin-karp #aho-corasick #simd #pattern-matching #index

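An engineering guide to choosing a string matching algorithm: KMP, Boyer-Moore (BM), Rabin-Karp, the Aho-Corasick automaton, suffix arrays, SIMD, and fuzzy matching. Pick by scenario, with links to this site's in-depth articles.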

[Storage Engineering] Columnar Storage Fundamentals: Why Analytical Queries Run 10x Faster

2025-09-13 | storage | #columnar-storage #row-store #simd #vectorized #compression #pax

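A typical analytical query touches only three or four of a table's hundreds of columns, yet a row store moves entire rows from disk into memory, and most of those bytes are discarded right after being read. Columnar storage keeps the values of one column contiguous, so a query reads only the columns it references, and I/O volume can drop by one to two orders of magnitude. But reduced I/O is only half the story: the columnar layout also serves compression, vectorized execution (Vectoriz…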

Integer Compression: varint → PForDelta → SIMD-BP128

2026-05-12 | algorithms | #integer-compression #varint #pfordelta #simd #inverted-index #search

Inverted-index compression in search engines is the biggest battleground for integer compression.

Vectorized Hashing: SIMD Implementations of xxHash3 and wyhash

2025-07-15 | algorithms | #simd #xxhash #wyhash #avx2 #hash-function #vectorization

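When data pours in at GB/s, the hash function often becomes the bottleneck. xxHash3 uses AVX2 to pack its 8 accumulators into 256-bit vectors and process them together, while wyhash gets nearly the same throughput from a single multiply with a 128-bit result. This article breaks down the SIMD design of these two top-tier non-cryptographic hashes.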

A Systematic Guide to SIMD-Accelerated String Search (strchr / strstr)

2025-11-13 | algorithms | #simd #sse2 #avx2 #avx-512 #string-algorithms #performance-optimization #vectorization #intrinsics #strchr #strstr #parallel-computing

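A complete, practice-oriented guide to optimizing string search with SIMD: how parallel comparison works on SSE2/AVX2/AVX-512, bitmask tricks, safe handling of cross-block and page boundaries, and high-performance strchr/strstr implementations, with complete code examples and an analysis of performance pitfalls.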

Parallel Sorting: From Merge Networks to GPU Bitonic Sort

2025-07-15 | algorithms | #sorting #parallel #gpu #bitonic-sort #simd

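Once single-core performance hits its ceiling, how can sorting exploit the parallelism of multi-core CPUs and GPUs? From the theoretical elegance of sorting networks to the engineering compromises of industrial-grade parallel sorts.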

Swiss Table: Google's SIMD-Accelerated Hash Table

2026-04-07 | algorithms | #swiss-table #simd #hash-table #abseil #flat-hash-map

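Where is std::unordered_map slow? Every lookup chases pointers from node to node and thrashes the cache. Google's Swiss Table uses a single SSE2 instruction to compare 16 slots in parallel, turning hash-table probing from one-by-one comparison into batch filtering. This article goes from the bit-level design of the control bytes to a complete C implementation, dissecting the design that replaced all of Google's C++ hash tables.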
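
To make the "batch filtering" idea concrete, here is a minimal scalar sketch in Python of what the 16-way control-byte comparison produces. It is not the article's C implementation and uses no real SIMD: the loop stands in for the parallel compare plus mask extraction. The names `match_group`, `candidates`, and `h2`, the 7-bit tag, and the `EMPTY = 0x80` marker are assumptions chosen for illustration.

```python
GROUP_WIDTH = 16   # slots examined per probe step (one 128-bit register of bytes)
EMPTY = 0x80       # assumed control-byte marker for an empty slot


def h2(hash_value: int) -> int:
    """Assumed tag: the low 7 bits of the hash, kept in a full slot's control byte."""
    return hash_value & 0x7F


def match_group(ctrl: bytes, tag: int) -> int:
    """Return a 16-bit mask with bit i set when ctrl[i] == tag.

    Scalar stand-in for a byte-wise vector compare followed by a
    move-mask step; real SIMD code does this without a loop.
    """
    assert len(ctrl) == GROUP_WIDTH
    mask = 0
    for i, c in enumerate(ctrl):
        mask |= (c == tag) << i
    return mask


def candidates(mask: int):
    """Yield the positions of set bits, lowest first."""
    while mask:
        low = mask & -mask           # isolate the lowest set bit
        yield low.bit_length() - 1   # its index within the group
        mask ^= low                  # clear it and continue
```

Only the slots yielded by `candidates` need a full key comparison; every other slot in the group is rejected by the control-byte check alone.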

Branchless Programming: When if Becomes a Performance Killer

2025-07-15 | algorithms | #branchless #performance #cpu-pipeline #simd #optimization

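Modern CPU branch predictors are already very accurate, but a misprediction is expensive. Branchless programming uses arithmetic and bitwise operations to eliminate conditional jumps, yielding severalfold speedups in specific scenarios.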
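
As a rough illustration of replacing a conditional with arithmetic and bit operations, the sketch below shows two classic mask-based idioms. It demonstrates the arithmetic only: in Python it brings no speedup, and the function names are hypothetical. The same expressions in C typically compile without a conditional jump.

```python
def branchless_min(x: int, y: int) -> int:
    """Select the smaller value using a mask instead of an if/else."""
    mask = -(x < y)                 # -1 (all one bits) when x < y, else 0
    return y ^ ((x ^ y) & mask)     # y ^ (x ^ y) == x when the mask is all ones


def branchless_abs32(x: int) -> int:
    """Absolute value for inputs in the signed 32-bit range."""
    mask = x >> 31                  # arithmetic shift: -1 if x is negative, else 0
    return (x ^ mask) - mask        # flips the bits and adds 1 only when negative


def count_below(values, threshold: int) -> int:
    """Count elements below the threshold by adding comparison results."""
    total = 0
    for v in values:
        total += (v < threshold)    # adds 1 or 0, no data-dependent branch
    return total
```

The counting pattern is the one that tends to matter in practice: when the data is unpredictable, accumulating the comparison result avoids a branch whose outcome the predictor cannot learn.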

SIMD Algorithm Design Patterns

2025-07-15 | algorithms | #simd #vectorization #avx #performance #patterns

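SIMD is more than just 'turning scalar operations into vector operations'. From SoA layout to pshufb table lookups, mastering these design patterns is what truly unlocks the power of vectorization.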

Advanced SIMD String Processing

2026-05-26 | algorithms | #simd #sse #avx #string-processing #vectorization #simdjson

Rewrite string operations with vector instructions, and a 10x performance gain is no pipe dream.