2026 Q1 OCR Small Model Explosion: A Practical Evaluation of 8 Small OCR Models on Table Parsing Capabilities

A year ago, while I was building a specialized RAG pipeline, OCR became my biggest roadblock. The technical tables in industry standards (such as GB or DB) were an absolute nightmare: their complex headers and merged cells were impossible to parse accurately, which made downstream automation like compliance checking a non-starter. The available multimodal and vision models underperformed, strict data privacy requirements ruled out high-performance closed-source models such as Gemini, and the token costs of processing massive document files were prohibitive, so I eventually put the project on ice.

Now, in Q1 2026, and especially in February and March, we’re seeing a surge of high-performance small language models built specifically for OCR. Seeing this leap, together with LlamaIndex’s recent post “OCR for Tables: How to Extract Structured Data from Documents” (https://x.com/llama_index/status/2037561042440687708), compelled me to dive back in. Despite a hectic schedule, I’ve been running preliminary tests focused on the hardest challenge: complex table structure recognition.


Test Methodology

Test Content

Three technical specification tables sourced from the Management Specification of Municipal Urban Bridge Engineering.

Table 1: A structured table consisting of a caption, multi-level hierarchical headers with complex cell merging, and localized nesting, followed by supplementary footnotes.

Table 2: This table features a hierarchical structure with significant row-spanning (rowspan) in the primary and secondary category columns. It exhibits localized logical nesting where main material types are subdivided into specific equipment conditions, followed by detailed sequential process flows containing mathematical notations and parenthetical annotations.

Table 3: A structured table with a caption, hierarchical headers, and non-uniform row-splitting, characterized by embedded radical expressions, subscripts, and technical symbols.

Testing Models & Platforms

Each model undergoes three independent trials for the same content. For conversational OCR models, four distinct prompts are utilized to constrain the output, with each trial conducted in a fresh session to eliminate contextual bias.

| Model | Platform |
| --- | --- |
| Qianfan-OCR 4B | https://console.bce.baidu.com |
| GLM-OCR 0.9B | M1 + Ollama + AnythingLLM |
| Penguin-VL 2B | https://huggingface.co/spaces/tencent/Penguin-VL |
| PaddleOCR-VL-1.5 0.9B | https://aistudio.baidu.com/paddleocr |
| FireRed-OCR 2B | https://www.modelscope.cn/studios/FireRedTeam/FireRed-OCR |
| Chandra OCR 2 5B | https://huggingface.co/spaces/victor/chandra-ocr-2 |
| MinerU-Diffusion-V1-0320 2.5B | https://huggingface.co/spaces/opendatalab/MinerU-Diffusion-V1-0320-2.5B |
| dots.mocr 3B | https://dotsocr.xiaohongshu.com/ |
| LlamaParse | https://cloud.llamaindex.ai/ |

Visualization & Validation:

Evaluation Criteria

The evaluation is conducted across three core dimensions: Table Structure Recognition, Content Recognition, and Output Stability.

  • Table Structure Recognition: Accurately parses complex headers and merged cells, and faithfully reconstructs the hierarchical and logical data structure of the document.
  • Content Recognition: Recognizes text, symbols, and characters with high precision, extracting all content with zero information loss.
  • Output Stability: Produces consistent output for the same input, without stochastic variation or discrepancies between runs.

A strict binary scoring system is applied to ensure the highest level of precision. The performance is rated as follows:

  • Perfect (✅): The output is entirely accurate, with 100% fidelity in structure, content, and formatting.
  • Failed (⚠️): Any error, discrepancy, or formatting glitch—no matter how minor—is flagged as a failure.
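
As a rough sketch of how this all-or-nothing rule could be expressed in code: the `Trial` record and the `rate` function below are hypothetical names introduced for illustration, and the three dimensions are reduced to simple booleans, which is an assumption about how the checks would be recorded.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trial:
    """One trial of one model on one table (hypothetical record type)."""
    structure_ok: bool  # headers, merged cells, and hierarchy all correct
    content_ok: bool    # every character and symbol correct


def rate(trials: List[Trial], outputs_identical: bool) -> str:
    """Return the Perfect mark only if nothing failed in any trial."""
    perfect = (
        bool(trials)               # no trials means nothing was verified
        and outputs_identical      # output stability across the trials
        and all(t.structure_ok and t.content_ok for t in trials)
    )
    return "✅" if perfect else "⚠️"


# Three clean trials pass; a single content error in one trial fails the model.
print(rate([Trial(True, True)] * 3, outputs_identical=True))
print(rate([Trial(True, True), Trial(True, False), Trial(True, True)], True))
```

A single flawed trial is enough to flip the rating, which matches the strict binary intent described above.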

Conclusion

GLM-OCR 0.9B delivers flawless structure and stability in Markdown format, though it needs prompt tuning to control which content is included in the output.


The details

Ignore spaces and the distinction between full-width and half-width characters; also, use the Chinese character “一” as a placeholder instead of “—”.
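
One way to apply these comparison rules automatically is sketched below. It assumes that Unicode NFKC folding is an acceptable stand-in for "ignore full-width vs. half-width", and that a cell containing only a dash-like glyph counts as the placeholder; `normalize_cell` and the `PLACEHOLDERS` set are hypothetical, not part of the original test procedure.

```python
import unicodedata

# Assumption: these glyphs, when they are the entire cell, all mean "no value".
PLACEHOLDERS = {"一", "-", "\u2013", "\u2014"}


def normalize_cell(text: str) -> str:
    """Normalize one table cell before comparing it with the reference."""
    # NFKC folds full-width forms such as "（" or "４" into half-width ones.
    folded = unicodedata.normalize("NFKC", text)
    # Drop all whitespace, including the ideographic space U+3000.
    compact = "".join(ch for ch in folded if not ch.isspace())
    # Only a cell that is *exactly* a placeholder glyph is rewritten, so
    # ordinary text like "一级小三角" is left alone.
    return "\u2014" if compact in PLACEHOLDERS else compact


print(normalize_cell("平均边长（km）"))   # full-width parentheses folded
print(normalize_cell("一级小三角"))       # unchanged
```

Working at the cell level matters here: replacing every “一” globally would corrupt legitimate text such as “一级小三角”.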

Table 1

Qianfan-OCR 4B

PROMPT: Parse table from image to Markdown

The symbol for seconds (″) was incorrectly rendered as minutes (′).

PROMPT: Please recognize all text in the table from the image.

Markdown
表4.2.2-1 三角测量的主要技术要求
|等级|平均边长(km)|测角中误差(″)|起始边边长相对中误差|最弱边长相对中误差|测回数|三角形最大闭合差(″)| | | |
|---|---|---|---|---|---|---|---|---|---|
| | | | | |DJ1|DJ2|DJ6| | |
|三等|首级|4.5|1.8|≤1/150000|≤1/70000|6|9|—|7|
| |加密| | |≤1/120000| | | | | |
|四等|首级|2|2.5|≤1/100000|≤1/40000|4|6|—|9|
| |加密| | |≤1/70000| | | | | |
|一级小三角|1|5|≤1/40000|≤1/20000|—|2|4|15| |
|二级小三角|0.5|10|≤1/20000|≤1/10000|—|1|2|30| |
注:①本规范表格、公式及条文叙述中的中误差、闭合差、限差及较差均为正负值;

Failed to reconstruct the complex table hierarchy, resulting in vertical data shifting and corrupted logical attribution.

PROMPT: Parse table from image to JSON

  • The symbol for seconds (″) was incorrectly rendered as minutes (′).
  • Incorrect table parsing leading to structural collapse and data misalignment.

PROMPT: Parse this image and return the table data as a JSON array of objects.

  • The symbol for seconds (″) was incorrectly rendered as minutes (′).
  • Incorrect table parsing leading to structural collapse and data misalignment.
  • The generated JSON fails to parse.

GLM-OCR 0.9B

PROMPT: Parse table from image to Markdown

Only the table content was extracted.

PROMPT: Please recognize all text in the table from the image.

The note at the bottom was overlooked by the OCR.

PROMPT: Parse table from image to JSON

JSON Syntax and Structural Errors

  1. Quote Nesting: Line 7 contains an unescaped double quote within the key "测角中误差("). Suggest replacing with the proper symbol or using escape characters.
  2. Incomplete Arrays: The columns arrays in lines 17, 28, 37, and 46 are missing closing brackets ].
  3. Invalid Termination: The JSON structure is malformed at the end; ]] should be corrected to }] to properly close the object and array hierarchy.
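
Syntax problems like these can be caught mechanically before any manual review. The sketch below uses Python's standard `json` module; the helper name `json_error` and the sample strings are illustrative, not the actual model output.

```python
import json
from typing import Optional


def json_error(text: str) -> Optional[str]:
    """Return a short description of the first syntax error, or None if valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno/colno point at where the parser gave up, which is usually
        # at or just after the real mistake.
        return f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return None


# An unescaped double quote inside a key, similar to the first error above.
print(json_error('{"测角中误差("): "1.8"}'))
# A bracket closing an object, similar to the third error above.
print(json_error('[{"a": 1]]'))
# Well-formed input yields None.
print(json_error('[{"a": 1}]'))
```

The parser stops at the first error, so a file with several problems has to be fixed and re-checked iteratively.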

PROMPT: Parse this image and return the table data as a JSON array of objects.

The generated JSON fails validation due to malformed syntax (missing commas and duplicate keys). More critically, there is a severe data dimension mismatch where the number of values in data rows does not align with the 10-column header definition, leading to structural collapse and logical data corruption.

Penguin-VL 2B

PROMPT: Parse table from image to Markdown

Markdown
| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差差 | 最弱边长相对中误差差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
| 加密 |  |  | ≤1/120000 |  |  |  |  |  |
| 四等 | 2 | 2.5 | ≤1/100000 | ≤1/40000 | 4 | 6 | - | 9 |
| 加密 |  |  | ≤1/70000 |  |  |  |  |  |
| 一级小三角 | 1 | 5 | ≤1/40000 | ≤1/20000 | - | 2 | 4 | 15 |
| 二级小三角 | 0.5 | 10 | ≤1/20000 | ≤1/10000 | - | 1 | 2 | 30 |
  • Incorrect table parsing leading to structural collapse and data misalignment.
  • Standard Markdown doesn’t support colspans
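
The misalignment is easy to confirm by counting cells per row. The following is a minimal sketch with a hypothetical helper, `cell_counts`; it assumes every table row starts with a pipe and does not handle escaped pipes inside cells.

```python
from typing import List

# First four lines of the output above: header, separator, sub-header, first data row.
SAMPLE = '''
| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差差 | 最弱边长相对中误差差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
'''


def cell_counts(markdown_table: str) -> List[int]:
    """Count the cells in each pipe-delimited row of a Markdown table."""
    counts = []
    for line in markdown_table.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip captions, notes, and blank lines
        # Remove the outer pipes, then split on the inner ones.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        counts.append(len(inner.split("|")))
    return counts


counts = cell_counts(SAMPLE)
print(counts)                  # [7, 7, 8, 9]
print(len(set(counts)) == 1)   # False: the rows do not share one width
```

A header declared with 7 columns followed by rows of 8 and 9 cells is exactly the kind of drift that breaks downstream parsing.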

PROMPT: Please recognize all text in the table from the image

Markdown
| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差 | 最弱边长相对中误差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  |  | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
| 加密 |  |  | ≤1/120000 |  |  |  |  |  |
| 四等 | 2 | 2.5 | ≤1/100000 | ≤1/40000 | 4 | 6 | - | 9 |
| 加密 |  |  | ≤1/70000 |  |  |  |  |  |
| 一级小三角 | 1 | 5 | ≤1/40000 | ≤1/20000 | - | 2 | 4 | 15 |
| 二级小三角 | 0.5 | 10 | ≤1/20000 | ≤1/10000 | - | 1 | 2 | 30 |

注:①本规范表格、公式及条文叙述中的中误差、闭合差、限差及较差均为正负值;
  • Incorrect table parsing leading to structural collapse and data misalignment.
  • Standard Markdown doesn’t support colspans

PROMPT: Parse table from image to JSON

While technically valid, the JSON is poorly structured for engineering use. The level field disrupts data consistency, causing the data schema to drift away from the header definitions.

PROMPT: Parse this image and return the table data as a JSON array of objects.

  • While technically valid, the JSON is poorly structured for engineering use. The level field disrupts data consistency, causing the data schema to drift away from the header definitions.
  • Incomplete table recognition.

PaddleOCR-VL-1.5 0.9B

Failed to reconstruct the complex table hierarchy, resulting in vertical data shifting and corrupted logical attribution.

FireRed-OCR 2B

The symbol for seconds (″) was incorrectly rendered as a degree sign (°).

Chandra OCR 2 5B ✅

MinerU-Diffusion-V1-0320 2.5B

Markdown
表4.2.2-1 三角测量的主要技术要求<lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><nl>
等级<lcel><fcel>平均边长(km)<fcel>测角中误差(")<fcel>起始边边长相对中误差<fcel>最弱边长相对中误差<fcel>测回数<lcel><lcel><lcel><fcel>三角形最大闭合差("”)<nl>
<ucel><xcel><ucel><ucel><ucel><ucel><ucel><fcel>DJ1<fcel>DJ2<fcel>DJ6<ucel><nl>
<fcel>三等<fcel>首级<fcel>4.5<fcel>1.8<fcel>≤1/150000<fcel>≤1/70000<fcel>6<fcel>9<fcel>-<fcel>7<nl>
<ucel><fcel>加密<ucel><ucel><fcel>≤1/120000<ucel><ucel><ucel><ucel><ucel><nl>
<fcel>四等<fcel>首级<fcel>2<fcel>2.5<fcel>≤1/100000<fcel>≤1/40000<fcel>4<fcel>6<fcel>-<fcel>9<nl>
<ucel><fcel>加密<ucel><ucel><fcel>≤1/70000<ucel><ucel><ucel><ucel><ucel><nl>
<fcel>一级小三角<lcel><fcel>1<fcel>5<fcel>≤1/40000<fcel>≤1/20000<fcel>-<fcel>2<fcel>4<fcel>15<nl>
<fcel>二级小三角<lcel><fcel>0.5<fcel>10<fcel>≤1/20000<fcel>≤1/10000<fcel>-<fcel>1<fcel>2<fcel>30<nl>

Failed to reconstruct the complex table hierarchy
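
The tags in this output (`<fcel>`, `<lcel>`, `<ucel>`, `<xcel>`, `<nl>`) look like OTSL-style structure tokens, with each `<nl>` ending a row. That reading is an assumption, since the format is not documented in this post. Under it, a quick sanity check is to count the cell tokens in each row, because a rectangular grid needs the same count everywhere. `tokens_per_row` is a hypothetical helper, shown here on a made-up fragment rather than the real output.

```python
import re
from typing import List

# Assumed token set: full, empty, left-merged, up-merged, and cross-merged cells.
CELL_TOKEN = re.compile(r"<(?:fcel|ecel|lcel|ucel|xcel)>")


def tokens_per_row(raw: str) -> List[int]:
    """Count cell tokens in every <nl>-terminated row of the raw output."""
    rows = [row for row in raw.split("<nl>") if row.strip()]
    return [len(CELL_TOKEN.findall(row)) for row in rows]


fragment = "<fcel>a<fcel>b<nl><ucel><fcel>c<fcel>d<nl>"
print(tokens_per_row(fragment))  # [2, 3]
```

Rows with differing token counts, or text that appears before any token, suggest the structure stream cannot be mapped onto a clean grid.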

dots.mocr 3B ✅

LlamaParse

Failed to reconstruct the complex table hierarchy

Table 2

Qianfan-OCR 4B

PROMPT: Parse table from image to Markdown

PROMPT: Please recognize all text in the table from the image.

  • Incomplete table recognition
  • Merged cells were not handled.

PROMPT: Parse table from image to JSON

Incorrect content structuring

PROMPT: Parse this image and return the table data as a JSON array of objects.

Accurate character recognition, but with an overly complex schema/structure.

GLM-OCR 0.9B

PROMPT: Parse table from image to Markdown

PROMPT: Please recognize all text in the table from the image.

PROMPT: Parse table from image to JSON

There are errors in the JSON code.

PROMPT: Parse this image and return the table data as a JSON array of objects.

There are errors in the JSON code.

Penguin-VL 2B

PROMPT: Parse table from image to Markdown

  • Standard Markdown doesn’t support colspans. The table is defined with a 2-column header, whereas the actual data rows contain 3 columns for hierarchical mapping.
  • Incomplete table recognition

PROMPT: Please recognize all text in the table from the image.

  • Standard Markdown doesn’t support colspans. The table is defined with a 2-column header, whereas the actual data rows contain 3 columns for hierarchical mapping.
  • Incomplete table recognition
  • Text identification errors
  • Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σ_con

PROMPT: Parse table from image to JSON

  • Structural Induction Error
  • Failed to Handle Merged Cells
  • Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σ_con

PROMPT: Parse this image and return the table data as a JSON array of objects.

Markdown
[
  {
    "预应力筋种类": "钢绞线束",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→1.03σ_con(锚固)",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→σ_con(持荷2min锚固)",
    "张拉程序": "其他锚具\n0→初应力→1.05σ_con(持荷2min)→σ_con(锚固)"
  },
  {
    "预应力筋种类": "钢丝束",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→1.03σ_con(锚固)",
    "张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→σ_con(持荷2min锚固)",
    "张拉程序": "其他锚具\n0→初应力→1.05σ_con(持荷2min)→0→σ_con(锚固)"
  },
  {
    "预应力筋种类": "精扎螺纹钢",
    "张拉程序": "直线配筋时\n0→初应力→σ_con(持荷2min锚固)",
    "张拉程序": "曲线配筋时\n0→σ_con(持荷2min)→0(上述程序可反复几次)→初应力→σ_con(持荷2min锚固)"
  }
]
  • Structural Induction Error
  • Failed to Handle Merged Cells
  • Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σ_con
  • JSON Syntax and Structural Errors
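
Repeated keys such as “张拉程序” are easy to miss because Python's `json.loads` accepts them and silently keeps only the last value. A small sketch of how they can be surfaced with the standard `object_pairs_hook` parameter follows; `find_duplicate_keys` is a hypothetical helper and the sample is shortened from the output above.

```python
import json
from collections import Counter
from typing import List


def find_duplicate_keys(text: str) -> List[str]:
    """Return every key that occurs more than once within a single JSON object."""
    duplicates: List[str] = []

    def record(pairs):
        # `pairs` holds all (key, value) pairs of one object, duplicates included.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=record)
    return duplicates


sample = '[{"预应力筋种类": "钢绞线束", "张拉程序": "A", "张拉程序": "B"}]'
print(find_duplicate_keys(sample))  # ['张拉程序']
print(json.loads(sample))           # only the last "张拉程序" value survives
```

This is why such output can be "valid" for one parser and still lose two thirds of the table's content.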

PaddleOCR-VL-1.5 0.9B ✅

FireRed-OCR 2B

Incorrect recognition of special symbols like $\sigma$, con, and $\rightarrow$.

Chandra OCR 2 5B

Markdown
表 8.7.3-1 后张法预应力筋张拉程序表

<table border="1"><thead><tr><th colspan="2">预应力筋种类</th><th>张拉程序</th></tr></thead><tbody><tr><td rowspan="3">钢绞线束</td><td>对夹片式等具有自锚性能的锚具</td><td>普通松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.03 \sigma_{\text{con}}</math> (锚固)<br/>低松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td>其他锚具</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.05 \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow \sigma_{\text{con}}</math> (锚固)</td></tr><tr><td>对夹片式等具有自锚性能的锚具</td><td>普通松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.03 \sigma_{\text{con}}</math> (锚固)<br/>低松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td rowspan="2">钢丝束</td><td>其他锚具</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.05 \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow 0 \rightarrow \sigma_{\text{con}}</math> (锚固)</td></tr><tr><td>直线配筋时</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td rowspan="2">精轧螺纹钢筋</td><td>曲线配筋时</td><td><math>0 \rightarrow \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow 0</math> (上述程序可反复几次) <math>\rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr></tbody></table>
  • Significant Content Omission
  • Structural misalignment and line-break-induced logic fragmentation

MinerU-Diffusion-V1-0320 2.5B

Markdown
预应力筋种类<lcel><fcel>张拉程序<nl>
钢绞线束<fcel>对夹片式等具有自锚性能的锚具<fcel>普通松弛预应力筋0→初应力→1.03 σcon(锚固)低松弛预应力筋0→初应力→σcon(持荷2min锚固)<nl>
<fcel>其他锚具<fcel>0→初应力→1.05 σcon(持荷2min)→σcon(锚固)<nl>
钢丝束<fcel>对夹片式等具有自锚性能的锚具<fcel>普通松弛预应力筋0→初应力→1.03 σcon(锚固)低松弛预应力筋0→初应力→σcon(持荷2min锚固)<nl>
<fcel>其他锚具<fcel>0→初应力→1.05 σcon(持荷2min)→0→σcon(锚固)<nl>
精轧螺纹钢筋<fcel>直线配筋时<fcel>0→初应力→σcon(持荷2min锚固)<nl>
<ucel><fcel>曲线配筋时<fcel>0→σcon(持荷2min)→0(上述程序可反复几次)→初应力→σcon(持荷2min锚固)<nl>
  • Failed to reconstruct the complex table hierarchy
  • Unit Symbols: Version 1 uses ("") for units, while Version 2 uses (" ) (including an extra space).
  • Incorrect recognition of special symbols: $\sigma_{con}$ was parsed as σcon

dots.mocr 3B

Failed to reconstruct the complex table hierarchy

LlamaParse

Markdown
# 表 8.7.3-1 后张法预应力筋张拉程序表

<table>
  <thead>
    <tr>
<th colspan="2">预应力筋种类</th>
<th>张拉程序</th>
    </tr>
  </thead>
  <tbody>
    <tr>
<td rowspan="3">钢绞线束</td>
<td>对夹片式等具有自锚性能的锚具</td>
<td>普通松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$(锚固)<br/>低松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
    </tr>
    <tr>
<td>其他锚具</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow 1.05 \sigma_{con}$(持荷 2min)$\rightarrow \sigma_{con}$(锚固)</td>
    </tr>
    <tr>
<td rowspan="3">钢丝束</td>
<td>对夹片式等具有自锚性能的锚具</td>
<td>普通松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$(锚固)<br/>低松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
    </tr>
    <tr>
<td>其他锚具</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow 1.05 \sigma_{con}$(持荷 2min)$\rightarrow 0 \rightarrow \sigma_{con}$(锚固)</td>
    </tr>
    <tr>
<td rowspan="2">精轧螺纹钢筋</td>
<td>直线配筋时</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
    </tr>
    <tr>
<td>曲线配筋时</td>
<td>$0 \rightarrow \sigma_{con}$(持荷 2min)$\rightarrow 0$(上述程序可反复几次)$\rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
    </tr>
  </tbody>
</table>
  • Inconsistent LaTeX usage, risking parsing failure in downstream systems: → was parsed as \rightarrow
  • Malformed rowspan logic, causing structural collapse due to invalid nesting.
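
Rowspan problems of this kind can be detected by expanding the spans onto a grid and checking that every row ends up with the same width. The sketch below uses only the standard `html.parser` module; `SpanCollector` and `check_spans` are hypothetical names, and for simplicity all `<tr>` elements are treated as one sequence, ignoring `<thead>`/`<tbody>` boundaries.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple


class SpanCollector(HTMLParser):
    """Collect (rowspan, colspan) for every cell, grouped by <tr>."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[Tuple[int, int]]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            a = dict(attrs)
            self.rows[-1].append((int(a.get("rowspan") or 1), int(a.get("colspan") or 1)))


def check_spans(html: str) -> Tuple[List[int], bool]:
    """Return each row's occupied width and whether a rowspan overruns the table."""
    parser = SpanCollector()
    parser.feed(html)
    pending: Dict[int, int] = {}  # column -> further rows still covered from above
    widths = []
    for cells in parser.rows:
        occupied = set(pending)
        still_pending = {c: n - 1 for c, n in pending.items() if n > 1}
        col = 0
        for rowspan, colspan in cells:
            while col in occupied:  # skip columns filled by an earlier rowspan
                col += 1
            for c in range(col, col + colspan):
                occupied.add(c)
                if rowspan > 1:
                    still_pending[c] = rowspan - 1
            col += colspan
        widths.append(max(occupied) + 1 if occupied else 0)
        pending = still_pending
    return widths, bool(pending)


bad = ('<table><tr><td rowspan="3">x</td><td>y</td></tr><tr><td>y2</td></tr>'
       '<tr><td rowspan="2">p</td><td>q</td></tr></table>')
print(check_spans(bad))  # ([2, 2, 3], True)
```

Uneven widths or a rowspan that is still pending after the last row both indicate that the declared spans do not match the number of rows actually emitted.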

Table 3

Qianfan-OCR 4B

PROMPT: Parse table from image to Markdown

Failure to preserve merged cell structures.

PROMPT: Please recognize all text in the table from the image.

Failure to preserve merged cell structures

PROMPT: Parse table from image to JSON ✅

PROMPT: Parse this image and return the table data as a JSON array of objects.

Only the table content was extracted.

GLM-OCR 0.9B

PROMPT: Parse table from image to Markdown

Only the table content was extracted.

PROMPT: Please recognize all text in the table from the image.

Only the table content was extracted.

PROMPT: Parse table from image to JSON

Markdown
{
  "table": [
    {
      "row": 1,
      "columns": [
        "等级",
        "每千米高差全中误差(mm)",
        "路线长度(km)",
        "水准仪的型号",
        "水准尺",
        "与已知点联测",
        "附合或环线",
        "平地(mm)",
        "山地(mm)"
      ]
    },
    {
      "row": 2,
      "columns": [
        "二等",
        "2",
        "一",
        "DS₁",
        "因瓦",
        "往返各一次",
        "往返各一次",
        "4√L",
        "一"
      ],
      "data": [
        {
          "等级": "二等",
          "每千米高差全中误差(mm):": "2",
          "路线长度(km):": "一",
          "水准仪的型号": "DS₁",
          "水准尺": "因瓦",
          "与已知点联测": "往返各一次",
          "附合或环线": "往返各一次",
          "平地(mm):": "4√L",
          "山地(mm):": "一"
        },
        {
          "等级": "三等",
          "每千米高差全中误差(mm):": "6",
          "路线长度(km):": "≤50",
          "水准仪的型号": "DS₁",
          "水准尺": "因瓦",
          "往返各一次": "往一次",
          "往返各一次": "12√L",
          "平地(mm):": "4√n",
          "山地(mm):": "4√n"
        },
        {
          "等级": "四等",
          "每千米高差全中误差(mm):": "10",
          "路线长度(km):": "≤16",
          "水准仪的型号": "DS₃",
          "水准尺": "双面",
          "往返各一次": "往一次",
          "往返各一次": "20√L",
          "平地(mm):": "6√n",
          "山地(mm):": "6√n"
        }
      ]
    ]
  ]
}

Validation error at line 48: duplicate key “往返各一次”.

PROMPT: Parse this image and return the table data as a JSON array of objects.

Failed to reconstruct the complex table hierarchy

Penguin-VL 2B

PROMPT: Parse table from image to Markdown

  • Structural restoration error.
  • Only the table content was extracted.
  • Standard Markdown doesn’t support colspans.

PROMPT: Please recognize all text in the table from the image.

  • Structural restoration error.
  • Only the table content was extracted.
  • Standard Markdown doesn’t support colspans.
  • Content is incorrectly identified and incomplete.

PROMPT: Parse table from image to JSON

  • Structural Induction Error
  • Failed to Handle Merged Cells

PROMPT: Parse this image and return the table data as a JSON array of objects.

  • Structural Induction Error
  • Failed to Handle Merged Cells
  • Content is incorrectly identified and incomplete.

PaddleOCR-VL-1.5 0.9B

Text recognition error

FireRed-OCR 2B

  • Text recognition error: “已” was misrecognized as “己”.
  • Structural restoration error.

Chandra OCR 2 5B

Content is incomplete.

MinerU-Diffusion-V1-0320 2.5B

Markdown
表4.2.5-1水准测量的主要技术要求<lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><nl>
等级<fcel>每千米高差中共中央误差(mm)<fcel>路线长度(km)<fcel>水准仪的型号<fcel>水准尺<fcel>观测次数<lcel><fcel>往返较差、附合或环闭合差<lcel><nl>
<ucel><ucel><ucel><ucel><ucel><fcel>与已知点联测<fcel>附合或环线<fcel>平地 (mm)<fcel>山地 (mm)<nl>
<fcel>二等<fcel>2<fcel>-<fcel>DS1<fcel>因瓦<fcel>往返各一次<fcel>往返各一次<fcel>4√L<fcel>-<nl>
<fcel>三等<fcel>6<fcel>≤50<fcel>DS1<fcel>因瓦<fcel>往返各一次<fcel>往一次<fcel>12√L<fcel>4√n<nl>
<ucel><ucel><ucel><fcel>DS3<fcel>双面<fcel>往返各一次<fcel>往返各一次<fcel>20√L<fcel>6√n<nl>
<fcel>四等<fcel>10<fcel>≤16<fcel>DS3<fcel>双面<ucel><fcel>往一次<ucel><ucel><nl>

Failed to reconstruct the complex table hierarchy

dots.mocr 3B

Failed to reconstruct the complex table hierarchy

LlamaParse

  • Structural Induction Error
  • Content is incorrectly identified and incomplete.

Beyond Layout: 3 Deep Challenges in Table OCR

  1. The Symbology Dilemma: LaTeX vs. Plain Text: Engineering standards are packed with symbols like $DS_1$ or $\sqrt{L}$. Should the model output pixel-perfect LaTeX code for mathematical precision, or simplified plain text (e.g., “sqrt(L)”) for better readability and system compatibility?
  2. The Punctuation Trap: Full-width vs. Half-width: Chinese documents often mix full-width (CJK) and half-width (ASCII) symbols, like （ vs. (. If an OCR model can’t unify these encodings, it creates a “logic break” in downstream data cleaning and automated validation.
  3. Layout Extremes: When Robustness Breaks: In tight table cells, text gets squeezed, wrapped, or even vertically stacked. This “typesetting chaos” confuses the reading order, leading to truncated content or logical misalignment within the table schema.
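
For the first two challenges, one pragmatic workaround on the evaluation side is to fold every variant into a single canonical form before comparing. The sketch below picks plain Unicode text as that form purely for comparison purposes; this is an assumption, not a recommendation about what a model should emit. `canonical_symbols` is a hypothetical helper that covers only the handful of LaTeX commands seen in the outputs above.

```python
import re
import unicodedata


def canonical_symbols(text: str) -> str:
    """Fold LaTeX, subscript, and full-width variants into one plain-text form."""
    text = unicodedata.normalize("NFKC", text)           # "（" -> "(", "DS₁" -> "DS1"
    text = text.replace("$", "")                          # drop inline-math delimiters
    text = re.sub(r"\\text\{([^{}]*)\}", r"\1", text)     # \text{con} -> con
    text = text.replace(r"\rightarrow", "→").replace(r"\sigma", "σ")
    text = re.sub(r"\\sqrt\{([^{}]*)\}", r"√\1", text)    # \sqrt{L} -> √L
    text = re.sub(r"_\{([^{}]*)\}", r"\1", text)          # _{con} -> con
    text = text.replace("_", "")                          # lossy: σ_con -> σcon
    return "".join(ch for ch in text if not ch.isspace())


variants = [r"$\sigma_{con}$", "σ_con", "σcon", r"$\sigma_{\text{con}}$"]
print({canonical_symbols(v) for v in variants})  # {'σcon'}
print(canonical_symbols(r"4\sqrt{L}"))           # 4√L
```

The folding is deliberately lossy (all underscores and whitespace are dropped), so it is suited to equality checks between outputs rather than to producing text for readers.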