2026 Q1 OCR Small Model Explosion: A Practical Evaluation of 8 Small OCR Models on Table Parsing Capabilities

April 4, 2026

A year ago, while building a specialized RAG pipeline, OCR became my biggest roadblock. The technical tables in industry standards (like GB or DB) were an absolute nightmare: complex headers and merged cells were impossible to parse accurately, which made downstream automation such as compliance checking a non-starter. The available multimodal and vision models underperformed, strict data privacy requirements ruled out high-performance closed-source models like Gemini, and the token costs of processing massive document files were prohibitive, so I eventually put the project on ice.

Now, in Q1 2026 (especially February and March), we’re seeing a massive surge of high-performance small language models built specifically for OCR. Witnessing this leap in tech, coupled with LlamaIndex’s recent post, “OCR for Tables: How to Extract Structured Data from Documents” (https://x.com/llama_index/status/2037561042440687708), I felt compelled to dive back in. Despite a hectic schedule, I’ve been conducting preliminary tests, focusing specifically on the ultimate challenge: complex table structure recognition.


DATE	Model	Size	Team	Github	Huggingface	Technical Report	License
2026/1/29	PaddleOCR-VL-1.5	0.9B	PaddlePaddle(Baidu)	https://github.com/PaddlePaddle/PaddleOCR	https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.5	https://arxiv.org/abs/2601.21957	Apache-2.0
2026/2/3	GLM-OCR	0.9B	https://z.ai/	https://github.com/zai-org/GLM-OCR	https://huggingface.co/zai-org/GLM-OCR	https://arxiv.org/abs/2603.10910	Apache-2.0
2026/2/28	FireRed-OCR	2B	FireRedTeam(xiaohongshu)	https://github.com/FireRedTeam/FireRed-OCR	https://huggingface.co/FireRedTeam/FireRed-OCR	https://arxiv.org/abs/2603.01840	Apache-2.0
2026/3/9	Penguin-VL-2B	2B/8B	Tencent AI Lab	https://github.com/tencent-ailab/Penguin-VL	https://huggingface.co/tencent/Penguin-VL-2B	https://arxiv.org/abs/2603.06569	Apache-2.0
2026/3/18	Qianfan-OCR	4B	Baidu Qianfan Team	https://github.com/baidubce/Qianfan-VL	https://huggingface.co/baidu/Qianfan-OCR	https://arxiv.org/abs/2603.13398	MIT license
2026/3/19	Chandra OCR 2	5B	Datalab		https://huggingface.co/spaces/victor/chandra-ocr-2		Apache 2.0
2026/3/19	dots.mocr（dots.ocr-1.5）	3B	rednote-hilab(xiaohongshu)	https://github.com/rednote-hilab/dots.ocr	https://huggingface.co/rednote-hilab/dots.mocr	https://arxiv.org/abs/2603.13032v1	MIT license
2026/3/24	MinerU-Diffusion-V1	2.5B	OpenDataLab	https://github.com/opendatalab/MinerU-Diffusion	https://huggingface.co/opendatalab/MinerU-Diffusion-V1-0320-2.5B	https://arxiv.org/pdf/2603.22458	MIT license
2026/3/31	Sarashina2.2-OCR	3B	SB Intuitions		https://huggingface.co/sbintuitions/sarashina2.2-ocr		MIT license
	LlamaParse			https://cloud.llamaindex.ai/

Test Methodology

Test Content

Three technical specification tables sourced from the Management Specification of Municipal Urban Bridge Engineering.

Table 1: A structured table consisting of a caption, multi-level hierarchical headers with complex cell merging, and localized nesting, followed by supplementary footnotes.

Table 2: This table features a hierarchical structure with significant row-spanning (rowspan) in the primary and secondary category columns. It exhibits localized logical nesting where main material types are subdivided into specific equipment conditions, followed by detailed sequential process flows containing mathematical notations and parenthetical annotations.

Table 3: A structured table with a caption, hierarchical headers, and non-uniform row-splitting, characterized by embedded radical expressions, subscripts, and technical symbols.

Testing Models & Platforms

Each model undergoes three independent trials for the same content. For conversational OCR models, four distinct prompts are utilized to constrain the output, with each trial conducted in a fresh session to eliminate contextual bias.
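The trial protocol above can be sketched as a small loop. This is only an illustration: `run_model` is a hypothetical callable standing in for whichever platform serves the model (a web demo, Ollama, a cloud console), and it is assumed to start a fresh, history-free session on every call. The four prompts are the ones used in this evaluation.

```python
from typing import Callable, Dict, List

# The four prompts used for the conversational OCR models in this evaluation.
PROMPTS = [
    "Parse table from image to Markdown",
    "Please recognize all text in the table from the image.",
    "Parse table from image to JSON",
    "Parse this image and return the table data as a JSON array of objects.",
]


def run_trials(
    run_model: Callable[[str, str], str],  # hypothetical: (image_path, prompt) -> raw output
    image_path: str,
    trials: int = 3,
) -> Dict[str, List[str]]:
    """Collect `trials` independent outputs per prompt for one table image."""
    results: Dict[str, List[str]] = {}
    for prompt in PROMPTS:
        # Each call is assumed to be stateless, so no earlier answer can leak
        # into a later trial through the conversation context.
        results[prompt] = [run_model(image_path, prompt) for _ in range(trials)]
    return results
```

Keeping the model call behind a plain function makes it easy to swap platforms without touching the trial logic.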

Model	Platform
Qianfan-OCR 4B	https://console.bce.baidu.com
GLM-OCR 0.9B	M1+Ollama+AnythingLLM
Penguin-VL 2B	https://huggingface.co/spaces/tencent/Penguin-VL
PaddleOCR-VL-1.5 0.9B	https://aistudio.baidu.com/paddleocr
FireRed-OCR 2B	https://www.modelscope.cn/studios/FireRedTeam/FireRed-OCR
Chandra OCR 2 5B	https://huggingface.co/spaces/victor/chandra-ocr-2
MinerU-Diffusion-V1-0320 2.5B	https://huggingface.co/spaces/opendatalab/MinerU-Diffusion-V1-0320-2.5B
dots.mocr 3B	https://dotsocr.xiaohongshu.com/
LlamaParse	https://cloud.llamaindex.ai/

Visualization & Validation:

Markdown: Rendered via Markdown Live Preview to check table reconstruction.
JSON: Verified via JSON Grid for structural integrity.

Evaluation Criteria

The evaluation is conducted across three core dimensions: Table Structure Recognition, Content Recognition, and Output Stability.

Table Structure Recognition: Accurately parses complex headers and merged cells, and faithfully reconstructs the hierarchical and logical data structure of the document.
Content Recognition: Recognizes text, symbols, and characters with high precision, extracting all content with zero information loss.
Output Stability: Produces consistent output for the same input across trials, without stochastic variations or discrepancies.

A strict binary scoring system is applied to ensure the highest level of precision. The performance is rated as follows:

Perfect (✅): The output is entirely accurate, with 100% fidelity in structure, content, and formatting.
Failed (⚠️): Any error, discrepancy, or formatting glitch—no matter how minor—is flagged as a failure.
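As a rough sketch of how such an all-or-nothing rule behaves, the function below marks a set of trial outputs as perfect only when every trial matches a hand-checked reference, and as stable only when all trials are identical. The scoring in this post was done by inspecting rendered output, so the function name, the single `reference` string, and the merging of structure and content into one exact-match check are simplifying assumptions.

```python
from typing import Dict, List

PASS, FAIL = "✅", "⚠️"


def binary_score(outputs: List[str], reference: str) -> Dict[str, str]:
    """Apply a strict pass/fail rule to the outputs of repeated trials."""
    # Stability: every trial produced exactly the same text.
    stable = len(set(outputs)) == 1
    # Accuracy: every trial matches the reference; one deviation fails the cell.
    exact = all(out == reference for out in outputs)
    return {
        "accuracy": PASS if exact else FAIL,
        "output_stability": PASS if stable else FAIL,
    }
```

Note that a model can be stable and still fail: three identical wrong answers score ✅ for stability and ⚠️ for accuracy.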

Conclusion

GLM-OCR 0.9B delivers flawless structure and stability in Markdown format, though prompt tuning is needed to control which content it outputs.


MODEL	PROMPT	TABLE 1			TABLE 2			TABLE 3
		Table Structure Recognition	OCR Performance	Output Stability	Table Structure Recognition	OCR Performance	Output Stability	Table Structure Recognition	OCR Performance	Output Stability
Qianfan-OCR 4B	Parse table from image to Markdown	✅	⚠️	✅	✅	✅	✅	⚠️	✅	✅
	Please recognize all text in the table from the image.	⚠️	✅	✅	⚠️	⚠️	✅	⚠️	✅	✅
	Parse table from image to JSON	⚠️	⚠️	✅	⚠️	✅	✅	✅	✅	✅
	Parse this image and return the table data as a JSON array of objects.	⚠️	⚠️	⚠️	⚠️	✅	⚠️	✅	⚠️	✅
GLM-OCR 0.9B	Parse table from image to Markdown	✅	⚠️	✅	✅	✅	✅	✅	⚠️	✅
	Please recognize all text in the table from the image.	✅	⚠️	✅	✅	✅	✅	✅	⚠️	✅
	Parse table from image to JSON	⚠️	⚠️	✅	⚠️	⚠️	✅	⚠️	⚠️	⚠️
	Parse this image and return the table data as a JSON array of objects.	⚠️	⚠️	✅	⚠️	⚠️	✅	⚠️	✅	⚠️
Penguin-VL 2B	Parse table from image to Markdown	⚠️	⚠️	✅	⚠️	⚠️	✅	⚠️	⚠️	✅
	Please recognize all text in the table from the image.	⚠️	⚠️	✅	⚠️	⚠️	⚠️	⚠️	⚠️	✅
	Parse table from image to JSON	⚠️	✅	⚠️	⚠️	⚠️	⚠️	⚠️	✅	✅
	Parse this image and return the table data as a JSON array of objects.	⚠️	⚠️	⚠️	⚠️	⚠️	⚠️	⚠️	⚠️	✅
PaddleOCR-VL-1.5 0.9B	/	⚠️	⚠️	✅	✅	✅	✅	✅	⚠️	✅
FireRed-OCR 2B	/	✅	⚠️	✅	✅	⚠️	⚠️	⚠️	⚠️	✅
Chandra OCR 2 5B	/	✅	✅	✅	⚠️	⚠️	✅	✅	⚠️	✅
MinerU-Diffusion-V1-0320 2.5B	/	⚠️	✅	⚠️	⚠️	⚠️	✅	⚠️	✅	⚠️
dots.mocr 3B	/	✅	✅	✅	⚠️	✅	✅	⚠️	✅	✅
LlamaParse	/	⚠️	✅	✅	⚠️	⚠️	✅	⚠️	⚠️	✅

The details

Scoring ignores whitespace and the distinction between full-width and half-width characters; the Chinese character “一” used as a placeholder in place of “—” is also not counted as an error.
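These comparison rules can be written down as a small normalization step applied to each cell before comparing it with the reference. This is a minimal sketch under my own assumptions: the function names are hypothetical, and the “一” placeholder is only rewritten when it is the entire cell, so that words such as “一级小三角” are left alone.

```python
def to_half_width(text: str) -> str:
    """Map full-width ASCII variants (U+FF01..U+FF5E) and the ideographic space to half-width."""
    out = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:            # ideographic space
            out.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:  # full-width ! through ~
            out.append(chr(code - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


def normalize_cell(text: str) -> str:
    """Normalize one table cell for comparison against a reference cell."""
    # Unify widths first, then drop every whitespace character.
    text = "".join(to_half_width(text).split())
    # A lone "一" is treated as the dash placeholder (U+2014).
    return "\u2014" if text == "一" else text
```

An explicit code-point mapping is used instead of `unicodedata.normalize("NFKC", ...)` because NFKC would also flatten subscripts such as the one in DS₁, which is information worth keeping here.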

Table 1

Qianfan-OCR 4B

PROMPT: Parse table from image to Markdown

The symbol for seconds (″) was incorrectly rendered as minutes (′).

PROMPT: Please recognize all text in the table from the image.

Markdown

表4.2.2-1 三角测量的主要技术要求
|等级|平均边长（km）|测角中误差（″）|起始边边长相对中误差|最弱边长相对中误差|测回数|三角形最大闭合差（″）| | | |
|---|---|---|---|---|---|---|---|---|---|
| | | | | |DJ1|DJ2|DJ6| | |
|三等|首级|4.5|1.8|≤1/150000|≤1/70000|6|9|—|7|
| |加密| | |≤1/120000| | | | | |
|四等|首级|2|2.5|≤1/100000|≤1/40000|4|6|—|9|
| |加密| | |≤1/70000| | | | | |
|一级小三角|1|5|≤1/40000|≤1/20000|—|2|4|15| |
|二级小三角|0.5|10|≤1/20000|≤1/10000|—|1|2|30| |
注：①本规范表格、公式及条文叙述中的中误差、闭合差、限差及较差均为正负值；

Failed to reconstruct the complex table hierarchy, resulting in vertical data shifting and corrupted logical attribution.

PROMPT: Parse table from image to JSON

The symbol for seconds (″) was incorrectly rendered as minutes (′).
Incorrect table parsing leading to structural collapse and data misalignment.

PROMPT: Parse this image and return the table data as a JSON array of objects.

The symbol for seconds (″) was incorrectly rendered as minutes (′).
Incorrect table parsing leading to structural collapse and data misalignment.
JSON code throws an error.

GLM-OCR 0.9B

PROMPT: Parse table from image to Markdown

Only the table body was extracted; text outside the table was omitted.

PROMPT: Please recognize all text in the table from the image.

The note at the bottom was overlooked by the OCR.

PROMPT: Parse table from image to JSON

JSON Syntax and Structural Errors

Quote Nesting: Line 7 contains an unescaped double quote within the key "测角中误差("). Suggest replacing with the proper symbol ″ or using escape characters.
Incomplete Arrays: The columns arrays in lines 17, 28, 37, and 46 are missing closing brackets ].
Invalid Termination: The JSON structure is malformed at the end; ]] should be corrected to }] to properly close the object and array hierarchy.

PROMPT: Parse this image and return the table data as a JSON array of objects.

The generated JSON fails validation due to malformed syntax (missing commas and duplicate keys). More critically, there is a severe data dimension mismatch where the number of values in data rows does not align with the 10-column header definition, leading to structural collapse and logical data corruption.
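Once a JSON array of objects does parse, the kind of dimension mismatch described above can be surfaced mechanically by comparing each row's keys with those of the first row. The sketch below is an illustration, not the check used for this evaluation; `schema_drift` is a hypothetical helper name and the sample keys are abbreviated.

```python
from typing import Dict, List, Tuple


def schema_drift(records: List[Dict[str, str]]) -> List[Tuple[int, List[str], List[str]]]:
    """Report rows whose key set differs from the first row.

    Returns a list of (row_index, missing_keys, extra_keys).
    """
    if not records:
        return []
    expected = set(records[0])
    report = []
    for i, rec in enumerate(records[1:], start=1):
        keys = set(rec)
        if keys != expected:
            report.append((i, sorted(expected - keys), sorted(keys - expected)))
    return report


rows = [
    {"等级": "三等", "平均边长(km)": "4.5"},
    {"等级": "四等", "level": "加密"},  # drifts away from the header definition
]
print(schema_drift(rows))
```

An empty report only means the rows are shaped consistently; it says nothing about whether the values landed under the right headers.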

Penguin-VL 2B

PROMPT: Parse table from image to Markdown

Markdown

| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差差 | 最弱边长相对中误差差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
| 加密 |  |  | ≤1/120000 |  |  |  |  |  |
| 四等 | 2 | 2.5 | ≤1/100000 | ≤1/40000 | 4 | 6 | - | 9 |
| 加密 |  |  | ≤1/70000 |  |  |  |  |  |
| 一级小三角 | 1 | 5 | ≤1/40000 | ≤1/20000 | - | 2 | 4 | 15 |
| 二级小三角 | 0.5 | 10 | ≤1/20000 | ≤1/10000 | - | 1 | 2 | 30 |

Incorrect table parsing leading to structural collapse and data misalignment.
Standard Markdown doesn’t support colspans, so the 7-column header cannot represent data rows that carry 8 or 9 cells.
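A quick way to spot this class of failure without rendering the output is to count the cells in every row of the pipe table. The helper names below are hypothetical, and the splitter is deliberately naive: it assumes no escaped `\|` inside cells.

```python
from typing import List


def pipe_row_cells(line: str) -> List[str]:
    """Split one Markdown pipe-table row into stripped cell strings."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def markdown_column_counts(table: str) -> List[int]:
    """Return the number of cells in each non-empty line of a pipe table."""
    return [len(pipe_row_cells(l)) for l in table.splitlines() if l.strip()]


sample = "\n".join([
    '| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差差 | 最弱边长相对中误差差 | 测回数 | 三角形最大闭合差 (") |',
    "| --- | --- | --- | --- | --- | --- | --- |",
    "|  |  |  |  |  | DJ1 | DJ2 | DJ6 |",
    "| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |",
])
print(markdown_column_counts(sample))
```

Any table whose counts are not all equal will render with shifted or dropped cells, which is what the output above shows.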

PROMPT: Please recognize all text in the table from the image.

Markdown

| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差 | 最弱边长相对中误差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
| 加密 |  |  | ≤1/120000 |  |  |  |  |  |
| 四等 | 2 | 2.5 | ≤1/100000 | ≤1/40000 | 4 | 6 | - | 9 |
| 加密 |  |  | ≤1/70000 |  |  |  |  |  |
| 一级小三角 | 1 | 5 | ≤1/40000 | ≤1/20000 | - | 2 | 4 | 15 |
| 二级小三角 | 0.5 | 10 | ≤1/20000 | ≤1/10000 | - | 1 | 2 | 30 |

注：①本规范表格、公式及条文叙述中的中误差、闭合差、限差及较差均为正负值；

Incorrect table parsing leading to structural collapse and data misalignment.
Standard Markdown doesn’t support colspans

PROMPT: Parse table from image to JSON

While technically valid, the JSON is poorly structured for engineering use. The level field disrupts data consistency, causing the data schema to drift away from the header definitions.

PROMPT: Parse this image and return the table data as a JSON array of objects.

While technically valid, the JSON is poorly structured for engineering use. The level field disrupts data consistency, causing the data schema to drift away from the header definitions.
Incomplete table recognition.

PaddleOCR-VL-1.5 0.9B

Failed to reconstruct the complex table hierarchy, resulting in vertical data shifting and corrupted logical attribution.

FireRed-OCR 2B

The symbol for seconds (″) was incorrectly rendered as a degree sign (°).

Chandra OCR 2 5B ✅

MinerU-Diffusion-V1-0320 2.5B

Markdown

表4.2.2-1 三角测量的主要技术要求<lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><nl>
等级<lcel><fcel>平均边长(km)<fcel>测角中误差("）<fcel>起始边边长相对中误差<fcel>最弱边长相对中误差<fcel>测回数<lcel><lcel><lcel><fcel>三角形最大闭合差("”)<nl>
<ucel><xcel><ucel><ucel><ucel><ucel><ucel><fcel>DJ1<fcel>DJ2<fcel>DJ6<ucel><nl>
<fcel>三等<fcel>首级<fcel>4.5<fcel>1.8<fcel>≤1/150000<fcel>≤1/70000<fcel>6<fcel>9<fcel>-<fcel>7<nl>
<ucel><fcel>加密<ucel><ucel><fcel>≤1/120000<ucel><ucel><ucel><ucel><ucel><nl>
<fcel>四等<fcel>首级<fcel>2<fcel>2.5<fcel>≤1/100000<fcel>≤1/40000<fcel>4<fcel>6<fcel>-<fcel>9<nl>
<ucel><fcel>加密<ucel><ucel><fcel>≤1/70000<ucel><ucel><ucel><ucel><ucel><nl>
<fcel>一级小三角<lcel><fcel>1<fcel>5<fcel>≤1/40000<fcel>≤1/20000<fcel>-<fcel>2<fcel>4<fcel>15<nl>
<fcel>二级小三角<lcel><fcel>0.5<fcel>10<fcel>≤1/20000<fcel>≤1/10000<fcel>-<fcel>1<fcel>2<fcel>30<nl>

Failed to reconstruct the complex table hierarchy
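The tags in this output look like OTSL-style table tokens, where `<fcel>` starts a cell with content, `<lcel>` merges with the cell to the left, `<ucel>` merges with the cell above, `<xcel>` merges in both directions, and `<nl>` ends a row. That reading is my assumption rather than something stated by the model card. Under it, a short parser can count the cells per row and show that the grid is not rectangular. The function names are hypothetical.

```python
import re
from typing import List, Tuple

OTSL_TOKEN = re.compile(r"<(fcel|ecel|lcel|ucel|xcel|nl)>")


def otsl_rows(text: str) -> List[List[Tuple[str, str]]]:
    """Split OTSL-style output into rows of (tag, content) cells."""
    rows, row = [], []
    tag, pos = None, 0
    for m in OTSL_TOKEN.finditer(text):
        content = text[pos:m.start()].strip()
        if tag is not None:
            row.append((tag, content))
        elif content:
            # Text at the start of a row with no explicit tag is treated as a content cell.
            row.append(("fcel", content))
        pos = m.end()
        if m.group(1) == "nl":
            rows.append(row)
            row, tag = [], None
        else:
            tag = m.group(1)
    # Handle output that does not end with <nl>.
    tail = text[pos:].strip()
    if tag is not None:
        row.append((tag, tail))
    elif tail:
        row.append(("fcel", tail))
    if row:
        rows.append(row)
    return rows


def otsl_row_widths(text: str) -> List[int]:
    """Number of grid cells in each row; a valid table has equal widths."""
    return [len(r) for r in otsl_rows(text)]
```

Running `otsl_row_widths` on the Table 1 output gives rows of different widths, which is one concrete symptom of the failed hierarchy reconstruction.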

dots.mocr 3B ✅

LlamaParse

Failed to reconstruct the complex table hierarchy

Table 2

Qianfan-OCR 4B

PROMPT: Parse table from image to Markdown ✅

PROMPT: Please recognize all text in the table from the image.

Incomplete table recognition
Merged cells were not handled.

PROMPT: Parse table from image to JSON

Incorrect content structuring

PROMPT: Parse this image and return the table data as a JSON array of objects.

Accurate character recognition, but with an overly complex schema/structure.

GLM-OCR 0.9B

PROMPT: Parse table from image to Markdown ✅

PROMPT: Please recognize all text in the table from the image. ✅

PROMPT: Parse table from image to JSON

There are errors in the JSON code.

PROMPT: Parse this image and return the table data as a JSON array of objects.

There are errors in the JSON code.

Penguin-VL 2B

PROMPT: Parse table from image to Markdown

Standard Markdown doesn’t support colspans. The table is defined with a 2-column header, whereas the actual data rows contain 3 columns for hierarchical mapping.
Incomplete table recognition

PROMPT: Please recognize all text in the table from the image.

Standard Markdown doesn’t support colspans. The table is defined with a 2-column header, whereas the actual data rows contain 3 columns for hierarchical mapping.
Incomplete table recognition
Text identification errors
Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σ_con

PROMPT: Parse table from image to JSON

Structural Induction Error
Failed to Handle Merged Cells
Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σ_con

PROMPT: Parse this image and return the table data as a JSON array of objects.

Markdown

[
  {
    "预应力筋种类": "钢绞线束",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→1.03σ_con（锚固）",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→σ_con（持荷2min锚固）",
    "张拉程序": "其他锚具\n0→初应力→1.05σ_con（持荷2min）→σ_con（锚固）"
  },
  {
    "预应力筋种类": "钢丝束",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→1.03σ_con（锚固）",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→σ_con（持荷2min锚固）",
    "张拉程序": "其他锚具\n0→初应力→1.05σ_con（持荷2min）→0→σ_con（锚固）"
  },
  {
    "预应力筋种类": "精扎螺纹钢",
    "张拉程序": "直线配筋时\n0→初应力→σ_con（持荷2min锚固）",
    "张拉程序": "曲线配筋时\n0→σ_con（持荷2min）→0（上述程序可反复几次）→初应力→σ_con（持荷2min锚固）"
  }
]

Structural Induction Error
Failed to Handle Merged Cells
Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σ_con
JSON Syntax and Structural Errors

PaddleOCR-VL-1.5 0.9B ✅

FireRed-OCR 2B

Incorrect recognition of special symbols like $\sigma$, con, and $\rightarrow$.

Chandra OCR 2 5B

Markdown

表 8.7.3-1 后张法预应力筋张拉程序表

<table border="1"><thead><tr><th colspan="2">预应力筋种类</th><th>张拉程序</th></tr></thead><tbody><tr><td rowspan="3">钢绞线束</td><td>对夹片式等具有自锚性能的锚具</td><td>普通松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.03 \sigma_{\text{con}}</math> (锚固)<br/>低松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td>其他锚具</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.05 \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow \sigma_{\text{con}}</math> (锚固)</td></tr><tr><td>对夹片式等具有自锚性能的锚具</td><td>普通松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.03 \sigma_{\text{con}}</math> (锚固)<br/>低松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td rowspan="2">钢丝束</td><td>其他锚具</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.05 \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow 0 \rightarrow \sigma_{\text{con}}</math> (锚固)</td></tr><tr><td>直线配筋时</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td rowspan="2">精轧螺纹钢筋</td><td>曲线配筋时</td><td><math>0 \rightarrow \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow 0</math> (上述程序可反复几次) <math>\rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr></tbody></table>

Significant Content Omission
Structural misalignment, with line-break-induced logic fragmentation

MinerU-Diffusion-V1-0320 2.5B

Markdown

预应力筋种类<lcel><fcel>张拉程序<nl>
钢绞线束<fcel>对夹片式等具有自锚性能的锚具<fcel>普通松弛预应力筋0→初应力→1.03 σcon（锚固）低松弛预应力筋0→初应力→σcon（持荷2min锚固）<nl>
<fcel>其他锚具<fcel>0→初应力→1.05 σcon（持荷2min）→σcon（锚固）<nl>
钢丝束<fcel>对夹片式等具有自锚性能的锚具<fcel>普通松弛预应力筋0→初应力→1.03 σcon（锚固）低松弛预应力筋0→初应力→σcon（持荷2min锚固）<nl>
<fcel>其他锚具<fcel>0→初应力→1.05 σcon（持荷2min）→0→σcon（锚固）<nl>
精轧螺纹钢筋<fcel>直线配筋时<fcel>0→初应力→σcon（持荷2min锚固）<nl>
<ucel><fcel>曲线配筋时<fcel>0→σcon（持荷2min）→0（上述程序可反复几次）→初应力→σcon（持荷2min锚固）<nl>

Failed to reconstruct the complex table hierarchy
Unit Symbols: Version 1 uses ("") for units, while Version 2 uses (" ) (including an extra space).
Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σcon

dots.mocr 3B

Failed to reconstruct the complex table hierarchy

LlamaParse

Markdown

# 表 8.7.3-1 后张法预应力筋张拉程序表

<table>
  <thead>
    <tr>
<th colspan="2">预应力筋种类</th>
<th>张拉程序</th>
    </tr>
  </thead>
  <tbody>
    <tr>
<td rowspan="3">钢绞线束</td>
<td>对夹片式等具有自锚性能的锚具</td>
<td>普通松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$（锚固）<br/>低松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$（持荷 2min 锚固）</td>
    </tr>
    <tr>
<td>其他锚具</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow 1.05 \sigma_{con}$（持荷 2min）$\rightarrow \sigma_{con}$（锚固）</td>
    </tr>
    <tr>
<td rowspan="3">钢丝束</td>
<td>对夹片式等具有自锚性能的锚具</td>
<td>普通松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$（锚固）<br/>低松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$（持荷 2min 锚固）</td>
    </tr>
    <tr>
<td>其他锚具</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow 1.05 \sigma_{con}$（持荷 2min）$\rightarrow 0 \rightarrow \sigma_{con}$（锚固）</td>
    </tr>
    <tr>
<td rowspan="2">精轧螺纹钢筋</td>
<td>直线配筋时</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$（持荷 2min 锚固）</td>
    </tr>
    <tr>
<td>曲线配筋时</td>
<td>$0 \rightarrow \sigma_{con}$（持荷 2min）$\rightarrow 0$（上述程序可反复几次）$\rightarrow$ 初应力 $\rightarrow \sigma_{con}$（持荷 2min 锚固）</td>
    </tr>
  </tbody>
</table>

Inconsistent LaTeX usage, risking parsing failure in downstream systems: → was parsed as \rightarrow.
Malformed rowspan logic, causing structural collapse due to invalid nesting.
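The rowspan problem can be checked without a browser by expanding the spans into a grid and measuring each row. In the output above, a first-column cell declares more rows than its group actually has, so it collides with the next group's first-column cell. The sketch below uses only Python's standard `html.parser`; the class and function names are hypothetical and the layout rule is a simplified version of what browsers do.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class TableSpans(HTMLParser):
    """Collect (rowspan, colspan) for every cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[Tuple[int, int]]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            self.rows[-1].append((int(a.get("rowspan", 1)), int(a.get("colspan", 1))))


def row_widths(html: str) -> List[int]:
    """Effective width of each row after expanding rowspan and colspan."""
    parser = TableSpans()
    parser.feed(html)
    carried: Dict[int, int] = {}  # column -> rows below still covered by a rowspan
    widths = []
    for cells in parser.rows:
        occupied = set(carried)
        new_carry: Dict[int, int] = {}
        col = 0
        for rowspan, colspan in cells:
            while col in occupied:  # skip columns filled from rows above
                col += 1
            for c in range(col, col + colspan):
                occupied.add(c)
                if rowspan > 1:
                    new_carry[c] = rowspan - 1
            col += colspan
        widths.append(max(occupied) + 1 if occupied else 0)
        carried = {c: n - 1 for c, n in carried.items() if n > 1}
        carried.update(new_carry)
    return widths
```

A well-formed table yields the same width for every row; a row that comes out wider than the header points at an oversized rowspan somewhere above it.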

Table 3

Qianfan-OCR 4B

PROMPT: Parse table from image to Markdown

Failure to preserve merged cell structures.

PROMPT: Please recognize all text in the table from the image.

Failure to preserve merged cell structures

PROMPT: Parse table from image to JSON ✅

PROMPT: Parse this image and return the table data as a JSON array of objects.

Only the table body was extracted; text outside the table was omitted.

GLM-OCR 0.9B

PROMPT: Parse table from image to Markdown

Only the table body was extracted; text outside the table was omitted.

PROMPT: Please recognize all text in the table from the image.

Only the table body was extracted; text outside the table was omitted.

PROMPT: Parse table from image to JSON

Markdown

{
  "table": [
    {
      "row": 1,
      "columns": [
        "等级",
        "每千米高差全中误差(mm)",
        "路线长度(km)",
        "水准仪的型号",
        "水准尺",
        "与已知点联测",
        "附合或环线",
        "平地(mm)",
        "山地(mm)"
      ]
    },
    {
      "row": 2,
      "columns": [
        "二等",
        "2",
        "一",
        "DS₁",
        "因瓦",
        "往返各一次",
        "往返各一次",
        "4√L",
        "一"
      ],
      "data": [
        {
          "等级": "二等",
          "每千米高差全中误差(mm):": "2",
          "路线长度(km):": "一",
          "水准仪的型号": "DS₁",
          "水准尺": "因瓦",
          "与已知点联测": "往返各一次",
          "附合或环线": "往返各一次",
          "平地(mm):": "4√L",
          "山地(mm):": "一"
        },
        {
          "等级": "三等",
          "每千米高差全中误差(mm):": "6",
          "路线长度(km):": "≤50",
          "水准仪的型号": "DS₁",
          "水准尺": "因瓦",
          "往返各一次": "往一次",
          "往返各一次": "12√L",
          "平地(mm):": "4√n",
          "山地(mm):": "4√n"
        },
        {
          "等级": "四等",
          "每千米高差全中误差(mm):": "10",
          "路线长度(km):": "≤16",
          "水准仪的型号": "DS₃",
          "水准尺": "双面",
          "往返各一次": "往一次",
          "往返各一次": "20√L",
          "平地(mm):": "6√n",
          "山地(mm):": "6√n"
        }
      ]
    ]
  ]
}

Error at line 48: duplicate key “往返各一次”.
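Python's `json` module silently keeps the last value when a key repeats, so duplicate keys like the one reported above go unnoticed unless they are checked for explicitly. A minimal sketch using the standard `object_pairs_hook` parameter of `json.loads` follows; `find_duplicate_keys` is a hypothetical helper name. Malformed JSON, such as mismatched brackets, still raises `json.JSONDecodeError` before any duplicate check runs.

```python
import json
from typing import List


def find_duplicate_keys(text: str) -> List[str]:
    """Return every key that appears more than once inside the same JSON object."""
    duplicates: List[str] = []

    def hook(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    # Raises json.JSONDecodeError if the text is not valid JSON at all.
    json.loads(text, object_pairs_hook=hook)
    return duplicates


snippet = '{"水准尺": "因瓦", "往返各一次": "往一次", "往返各一次": "12√L"}'
print(find_duplicate_keys(snippet))
```

Because the hook runs once per object, the same key may legitimately appear in different rows without being flagged.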

PROMPT: Parse this image and return the table data as a JSON array of objects.

Failed to reconstruct the complex table hierarchy

Penguin-VL 2B

PROMPT: Parse table from image to Markdown

Structural restoration error.
Only the table body was extracted; text outside the table was omitted.
Standard Markdown doesn’t support colspans.

PROMPT: Please recognize all text in the table from the image.

Structural restoration error.
Only the table body was extracted; text outside the table was omitted.
Standard Markdown doesn’t support colspans.
Content is incorrectly identified and incomplete.

PROMPT: Parse table from image to JSON

Structural Induction Error
Failed to Handle Merged Cells

PROMPT: Parse this image and return the table data as a JSON array of objects.

Structural Induction Error
Failed to Handle Merged Cells
Content is incorrectly identified and incomplete.

PaddleOCR-VL-1.5 0.9B

Text recognition error

FireRed-OCR 2B

Text recognition error: “已” was misrecognized as “己”.
Structural restoration error.

Chandra OCR 2 5B

Content is incomplete.

MinerU-Diffusion-V1-0320 2.5B

Markdown

表4.2.5-1水准测量的主要技术要求<lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><nl>
等级<fcel>每千米高差中共中央误差(mm)<fcel>路线长度(km)<fcel>水准仪的型号<fcel>水准尺<fcel>观测次数<lcel><fcel>往返较差、附合或环闭合差<lcel><nl>
<ucel><ucel><ucel><ucel><ucel><fcel>与已知点联测<fcel>附合或环线<fcel>平地 (mm)<fcel>山地 (mm)<nl>
<fcel>二等<fcel>2<fcel>-<fcel>DS1<fcel>因瓦<fcel>往返各一次<fcel>往返各一次<fcel>4√L<fcel>-<nl>
<fcel>三等<fcel>6<fcel>≤50<fcel>DS1<fcel>因瓦<fcel>往返各一次<fcel>往一次<fcel>12√L<fcel>4√n<nl>
<ucel><ucel><ucel><fcel>DS3<fcel>双面<fcel>往返各一次<fcel>往返各一次<fcel>20√L<fcel>6√n<nl>
<fcel>四等<fcel>10<fcel>≤16<fcel>DS3<fcel>双面<ucel><fcel>往一次<ucel><ucel><nl>

Failed to reconstruct the complex table hierarchy

dots.mocr 3B

Failed to reconstruct the complex table hierarchy

LlamaParse

Structural Induction Error
Content is incorrectly identified and incomplete.

Beyond Layout: 3 Deep Challenges in Table OCR

The Symbology Dilemma (LaTeX vs. Plain Text): Engineering standards are packed with symbols like DS₁ or $\sqrt{L}$. Should the model output pixel-perfect LaTeX code for mathematical precision, or simplified plain text (e.g., “sqrt(L)”) for better readability and system compatibility?
The Punctuation Trap (Full-width vs. Half-width): Chinese documents often mix full-width (CJK) and half-width (ASCII) symbols, such as （ vs (. If an OCR model can’t unify these encodings, it creates a “logic break” in downstream data cleaning and automated validation.
Layout Extremes (When Robustness Breaks): In tight table cells, text gets squeezed, wrapped, or even vertically stacked. This “typesetting chaos” confuses the reading order, leading to truncated content or logical misalignment within the table schema.
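For the first challenge, one pragmatic option is to accept either notation from the model and fold simple LaTeX into plain Unicode before comparing or indexing. The sketch below covers only the handful of commands that appear in these test tables (`\sigma`, `\rightarrow`, `\sqrt`, `\text`, subscripts); the mapping table and function name are my own assumptions, and anything more complex would need a real LaTeX parser.

```python
import re

# Only the commands seen in the test tables; extend as needed.
LATEX_SYMBOLS = {
    r"\rightarrow": "→",
    r"\sigma": "σ",
    r"\leq": "≤",
}


def latex_to_plain(s: str) -> str:
    """Fold simple inline LaTeX into a compact plain-text form."""
    s = re.sub(r"\\text\{([^{}]*)\}", r"\1", s)   # \text{con} -> con
    for command, char in LATEX_SYMBOLS.items():
        s = s.replace(command, char)
    s = re.sub(r"\\sqrt\{([^{}]*)\}", r"√\1", s)  # \sqrt{L} -> √L
    s = re.sub(r"_\{([^{}]*)\}", r"_\1", s)       # _{con} -> _con
    s = s.replace("$", "")                        # drop math delimiters
    return re.sub(r"\s+", "", s)                  # whitespace is ignored when scoring


print(latex_to_plain(r"$0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$"))
```

With this in place, `$\sigma_{con}$`, `\sigma_{\text{con}}` and `σ_con` all compare equal, which separates genuine recognition errors from differences in notation.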
