
A year ago, while I was building a specialized RAG pipeline, OCR became my biggest roadblock. The technical tables in industry standards (such as GB or DB) were an absolute nightmare: their complex headers and merged cells were impossible to parse accurately, which made downstream automation like compliance checking a non-starter. Between the underwhelming performance of the available multimodal and vision models and the strict data privacy requirements that ruled out high-performance closed-source models such as Gemini, I eventually had to put the project on ice, especially given the prohibitive token costs of processing massive document files.
Now, in Q1 2026, and especially in February and March, we are seeing a surge of high-performance small language models built specifically for OCR. Witnessing this leap, together with LlamaIndex’s recent post, “OCR for Tables: How to Extract Structured Data from Documents” (https://x.com/llama_index/status/2037561042440687708), I felt compelled to dive back in. Despite a hectic schedule, I’ve been running preliminary tests focused on the hardest part: complex table structure recognition.
| Date | Model | Size | Team | GitHub | Hugging Face | Technical Report | License |
|---|---|---|---|---|---|---|---|
| 2026/1/29 | PaddleOCR-VL-1.5 | 0.9B | PaddlePaddle(Baidu) | https://github.com/PaddlePaddle/PaddleOCR | https://huggingface.co/PaddlePaddle/PaddleOCR-VL-1.5 | https://arxiv.org/abs/2601.21957 | Apache-2.0 |
| 2026/2/3 | GLM-OCR | 0.9B | https://z.ai/ | https://github.com/zai-org/GLM-OCR | https://huggingface.co/zai-org/GLM-OCR | https://arxiv.org/abs/2603.10910 | Apache-2.0 |
| 2026/2/28 | FireRed-OCR | 2B | FireRedTeam(xiaohongshu) | https://github.com/FireRedTeam/FireRed-OCR | https://huggingface.co/FireRedTeam/FireRed-OCR | https://arxiv.org/abs/2603.01840 | Apache-2.0 |
| 2026/3/9 | Penguin-VL-2B | 2B/8B | Tencent AI Lab | https://github.com/tencent-ailab/Penguin-VL | https://huggingface.co/tencent/Penguin-VL-2B | https://arxiv.org/abs/2603.06569 | Apache-2.0 |
| 2026/3/18 | Qianfan-OCR | 4B | Baidu Qianfan Team | https://github.com/baidubce/Qianfan-VL | https://huggingface.co/baidu/Qianfan-OCR | https://arxiv.org/abs/2603.13398 | MIT license |
| 2026/3/19 | Chandra OCR 2 | 5B | Datalab | | https://huggingface.co/spaces/victor/chandra-ocr-2 | | Apache 2.0 |
| 2026/3/19 | dots.mocr(dots.ocr-1.5) | 3B | rednote-hilab(xiaohongshu) | https://github.com/rednote-hilab/dots.ocr | https://huggingface.co/rednote-hilab/dots.mocr | https://arxiv.org/abs/2603.13032v1 | MIT license |
| 2026/3/24 | MinerU-Diffusion-V1 | 2.5B | OpenDataLab | https://github.com/opendatalab/MinerU-Diffusion | https://huggingface.co/opendatalab/MinerU-Diffusion-V1-0320-2.5B | https://arxiv.org/pdf/2603.22458 | MIT license |
| 2026/3/31 | Sarashina2.2-OCR | 3B | SB Intuitions | | https://huggingface.co/sbintuitions/sarashina2.2-ocr | | MIT license |

LlamaParse: https://cloud.llamaindex.ai/
The test set consists of three technical specification tables taken from the Management Specification of Municipal Urban Bridge Engineering.
Table 1: A structured table consisting of a caption, multi-level hierarchical headers with complex cell merging, and localized nesting, followed by supplementary footnotes.

Table 2: This table features a hierarchical structure with significant row-spanning (rowspan) in the primary and secondary category columns. It exhibits localized logical nesting where main material types are subdivided into specific equipment conditions, followed by detailed sequential process flows containing mathematical notations and parenthetical annotations.

Table 3: A structured table with a caption, hierarchical headers, and non-uniform row-splitting, characterized by embedded radical expressions, subscripts, and technical symbols.

Each model undergoes three independent trials for the same content. For conversational OCR models, four distinct prompts are utilized to constrain the output, with each trial conducted in a fresh session to eliminate contextual bias.
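The protocol above can be sketched as a small harness. This is only an illustration: `ocr_fn` is a hypothetical callable that I introduce here (it stands for whatever client opens a fresh session and sends one image plus one prompt), and treating "stable" as "all trials return identical text" is my assumption, not a definition taken from the tests.

```python
def run_trials(ocr_fn, image_path, prompts, n_trials=3):
    """Run every prompt n_trials times and collect the raw outputs.

    ocr_fn(image_path, prompt) is hypothetical: it must start a fresh
    session on every call so that earlier answers cannot leak in.
    """
    results = {}
    for prompt in prompts:
        results[prompt] = [ocr_fn(image_path, prompt) for _ in range(n_trials)]
    return results


def is_stable(outputs):
    """Assumed criterion: stable only if all trials produced the same text."""
    return len(set(outputs)) == 1
```

For models that take no prompt, the same harness works with `prompts=[None]` and an `ocr_fn` that ignores its second argument.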
| Model | Platform |
|---|---|
| Qianfan-OCR 4B | https://console.bce.baidu.com |
| GLM-OCR 0.9B | M1+Ollama+AnythingLLM |
| Penguin-VL 2B | https://huggingface.co/spaces/tencent/Penguin-VL |
| PaddleOCR-VL-1.5 0.9B | https://aistudio.baidu.com/paddleocr |
| FireRed-OCR 2B | https://www.modelscope.cn/studios/FireRedTeam/FireRed-OCR |
| Chandra OCR 2 5B | https://huggingface.co/spaces/victor/chandra-ocr-2 |
| MinerU-Diffusion-V1-0320 2.5B | https://huggingface.co/spaces/opendatalab/MinerU-Diffusion-V1-0320-2.5B |
| dots.mocr 3B | https://dotsocr.xiaohongshu.com/ |
| LlamaParse | https://cloud.llamaindex.ai/ |
Visualization & Validation:
The evaluation is conducted across three core dimensions: Table Structure Recognition, Content Recognition, and Output Stability.
A strict binary scoring system is applied to keep the bar high: a result passes (✅) only if it contains no errors, and is otherwise flagged (⚠️). The ratings for each model and prompt are listed in the table below.

GLM-OCR 0.9B delivers flawless structure and stability in Markdown format, though it needs prompt tuning to control exactly which content is output.
| Model | Prompt | Table 1 Structure | Table 1 OCR | Table 1 Stability | Table 2 Structure | Table 2 OCR | Table 2 Stability | Table 3 Structure | Table 3 OCR | Table 3 Stability |
|---|---|---|---|---|---|---|---|---|---|---|
| Qianfan-OCR 4B | Parse table from image to Markdown | ✅ | ⚠️ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ | ✅ |
| | Please recognize all text in the table from the image. | ⚠️ | ✅ | ✅ | ⚠️ | ⚠️ | ✅ | ⚠️ | ✅ | ✅ |
| | Parse table from image to JSON | ⚠️ | ⚠️ | ✅ | ⚠️ | ✅ | ✅ | ✅ | ✅ | ✅ |
| | Parse this image and return the table data as a JSON array of objects. | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ✅ | ⚠️ | ✅ | ⚠️ | ✅ |
| GLM-OCR 0.9B | Parse table from image to Markdown | ✅ | ⚠️ | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ |
| | Please recognize all text in the table from the image. | ✅ | ⚠️ | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ |
| | Parse table from image to JSON | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ⚠️ |
| | Parse this image and return the table data as a JSON array of objects. | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ✅ | ⚠️ | ✅ | ⚠️ |
| Penguin-VL 2B | Parse table from image to Markdown | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ✅ |
| | Please recognize all text in the table from the image. | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ✅ |
| | Parse table from image to JSON | ⚠️ | ✅ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ✅ | ✅ |
| | Parse this image and return the table data as a JSON array of objects. | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ✅ |
| PaddleOCR-VL-1.5 0.9B | / | ⚠️ | ⚠️ | ✅ | ✅ | ✅ | ✅ | ✅ | ⚠️ | ✅ |
| FireRed-OCR 2B | / | ✅ | ⚠️ | ✅ | ✅ | ⚠️ | ⚠️ | ⚠️ | ⚠️ | ✅ |
| Chandra OCR 2 5B | / | ✅ | ✅ | ✅ | ⚠️ | ⚠️ | ✅ | ✅ | ⚠️ | ✅ |
| MinerU-Diffusion-V1-0320 2.5B | / | ⚠️ | ✅ | ⚠️ | ⚠️ | ⚠️ | ✅ | ⚠️ | ✅ | ⚠️ |
| dots.mocr 3B | / | ✅ | ✅ | ✅ | ⚠️ | ✅ | ✅ | ⚠️ | ✅ | ✅ |
| LlamaParse | / | ⚠️ | ✅ | ✅ | ⚠️ | ⚠️ | ✅ | ⚠️ | ⚠️ | ✅ |
When comparing outputs, spaces and the distinction between full-width and half-width characters are ignored, and the Chinese character “一” is accepted as a placeholder in place of “—”.
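Reading the note above as a comparison rule, a cell-level normalizer could look like the sketch below. The function name `normalize_cell` and the exact set of dash-like characters are my assumptions; the standard-library call `unicodedata.normalize("NFKC", ...)` does the full-width to half-width folding.

```python
import unicodedata

# Dash-like characters treated as the same "no value" placeholder (assumed set).
PLACEHOLDERS = {"—", "–", "-", "一"}


def normalize_cell(text):
    """Normalize one table cell before comparing it with the reference."""
    # NFKC folds full-width forms such as "(" and "1" into "(" and "1".
    folded = unicodedata.normalize("NFKC", text)
    # Drop every whitespace character, including those inside the cell.
    compact = "".join(folded.split())
    # Only a cell that is nothing but a dash-like character becomes "-";
    # a "一" inside a longer string (e.g. "一级小三角") is left untouched.
    return "-" if compact in PLACEHOLDERS else compact
```

NFKC also folds subscript digits (so `DS₁` becomes `DS1`), which means the same function should be applied to both the model output and the reference before comparing them.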
PROMPT: Parse table from image to Markdown

The arcsecond symbol (″) was incorrectly rendered as the arcminute symbol (′).
PROMPT: Please recognize all text in the table from the image.

表4.2.2-1 三角测量的主要技术要求
|等级|平均边长(km)|测角中误差(″)|起始边边长相对中误差|最弱边长相对中误差|测回数|三角形最大闭合差(″)| | | |
|---|---|---|---|---|---|---|---|---|---|
| | | | | |DJ1|DJ2|DJ6| | |
|三等|首级|4.5|1.8|≤1/150000|≤1/70000|6|9|—|7|
| |加密| | |≤1/120000| | | | | |
|四等|首级|2|2.5|≤1/100000|≤1/40000|4|6|—|9|
| |加密| | |≤1/70000| | | | | |
|一级小三角|1|5|≤1/40000|≤1/20000|—|2|4|15| |
|二级小三角|0.5|10|≤1/20000|≤1/10000|—|1|2|30| |
注:①本规范表格、公式及条文叙述中的中误差、闭合差、限差及较差均为正负值;

Failed to reconstruct the complex table hierarchy, resulting in vertical data shifting and corrupted logical attribution.
PROMPT: Parse table from image to JSON

PROMPT: Parse this image and return the table data as a JSON array of objects.

PROMPT: Parse table from image to Markdown

Extract only the table content.
PROMPT: Please recognize all text in the table from the image.

The note at the bottom was overlooked by the OCR.
PROMPT: Parse table from image to JSON

JSON syntax and structural errors:

- An unescaped quotation mark appears in "测角中误差("). Suggest replacing it with the proper symbol ″ or using escape characters.
- The columns arrays in lines 17, 28, 37, and 46 are missing closing brackets ].
- ]] should be corrected to }] to properly close the object and array hierarchy.

PROMPT: Parse this image and return the table data as a JSON array of objects.

The generated JSON fails validation due to malformed syntax (missing commas and duplicate keys). More critically, there is a severe data dimension mismatch where the number of values in data rows does not align with the 10-column header definition, leading to structural collapse and logical data corruption.
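The dimension mismatch described above is easy to flag automatically once the output has been loaded into a header list and a list of rows. A minimal sketch follows; `find_width_mismatches` and the sample data are illustrative names and values I made up, not output from any of the models.

```python
def find_width_mismatches(header, rows):
    """Return (row_index, cell_count) for every row whose length differs from the header."""
    expected = len(header)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


# Illustrative data only: a 3-column header and one row that lost a cell.
header = ["等级", "平均边长(km)", "测角中误差(″)"]
rows = [["三等", "4.5", "1.8"], ["加密", "1.8"]]
print(find_width_mismatches(header, rows))  # [(1, 2)]
```

A clean result here only proves that the row widths agree. Values that shifted sideways while keeping the cell count, as in some of the Markdown outputs, still need a content comparison.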
PROMPT: Parse table from image to Markdown

| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差差 | 最弱边长相对中误差差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
| | | | | | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
| 加密 | | | ≤1/120000 | | | | | |
| 四等 | 2 | 2.5 | ≤1/100000 | ≤1/40000 | 4 | 6 | - | 9 |
| 加密 | | | ≤1/70000 | | | | | |
| 一级小三角 | 1 | 5 | ≤1/40000 | ≤1/20000 | - | 2 | 4 | 15 |
| 二级小三角 | 0.5 | 10 | ≤1/20000 | ≤1/10000 | - | 1 | 2 | 30 |

PROMPT: Please recognize all text in the table from the image.

| 等级 | 平均边长 (km) | 测角中误差 (") | 起始边边长相对中误差 | 最弱边长相对中误差 | 测回数 | 三角形最大闭合差 (") |
| --- | --- | --- | --- | --- | --- | --- |
| | | | | | DJ1 | DJ2 | DJ6 |
| 三等 | 4.5 | 1.8 | ≤1/150000 | ≤1/70000 | 6 | 9 | - | 7 |
| 加密 | | | ≤1/120000 | | | | | |
| 四等 | 2 | 2.5 | ≤1/100000 | ≤1/40000 | 4 | 6 | - | 9 |
| 加密 | | | ≤1/70000 | | | | | |
| 一级小三角 | 1 | 5 | ≤1/40000 | ≤1/20000 | - | 2 | 4 | 15 |
| 二级小三角 | 0.5 | 10 | ≤1/20000 | ≤1/10000 | - | 1 | 2 | 30 |
注:①本规范表格、公式及条文叙述中的中误差、闭合差、限差及较差均为正负值;

PROMPT: Parse table from image to JSON

While technically valid, the JSON is poorly structured for engineering use. The level field disrupts data consistency, causing the data schema to drift away from the header definitions.
PROMPT: Parse this image and return the table data as a JSON array of objects.


Failed to reconstruct the complex table hierarchy, resulting in vertical data shifting and corrupted logical attribution.

The arcsecond symbol (″) was incorrectly rendered as °.


表4.2.2-1 三角测量的主要技术要求<lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><nl>
等级<lcel><fcel>平均边长(km)<fcel>测角中误差(")<fcel>起始边边长相对中误差<fcel>最弱边长相对中误差<fcel>测回数<lcel><lcel><lcel><fcel>三角形最大闭合差("”)<nl>
<ucel><xcel><ucel><ucel><ucel><ucel><ucel><fcel>DJ1<fcel>DJ2<fcel>DJ6<ucel><nl>
<fcel>三等<fcel>首级<fcel>4.5<fcel>1.8<fcel>≤1/150000<fcel>≤1/70000<fcel>6<fcel>9<fcel>-<fcel>7<nl>
<ucel><fcel>加密<ucel><ucel><fcel>≤1/120000<ucel><ucel><ucel><ucel><ucel><nl>
<fcel>四等<fcel>首级<fcel>2<fcel>2.5<fcel>≤1/100000<fcel>≤1/40000<fcel>4<fcel>6<fcel>-<fcel>9<nl>
<ucel><fcel>加密<ucel><ucel><fcel>≤1/70000<ucel><ucel><ucel><ucel><ucel><nl>
<fcel>一级小三角<lcel><fcel>1<fcel>5<fcel>≤1/40000<fcel>≤1/20000<fcel>-<fcel>2<fcel>4<fcel>15<nl>
<fcel>二级小三角<lcel><fcel>0.5<fcel>10<fcel>≤1/20000<fcel>≤1/10000<fcel>-<fcel>1<fcel>2<fcel>30<nl>

Failed to reconstruct the complex table hierarchy


Failed to reconstruct the complex table hierarchy
PROMPT: Parse table from image to Markdown ✅

PROMPT: Please recognize all text in the table from the image.

PROMPT: Parse table from image to JSON

Incorrect content structuring
PROMPT: Parse this image and return the table data as a JSON array of objects.

Accurate character recognition, but with an overly complex schema and structure.
PROMPT: Parse table from image to Markdown ✅

PROMPT: Please recognize all text in the table from the image. ✅

PROMPT: Parse table from image to JSON

There are errors in the JSON code.
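A verdict like "errors in the JSON" can be made concrete by letting a parser report the first failure. The sketch below uses only Python's standard `json` module; the helper name `describe_json_error` is introduced here for illustration.

```python
import json


def describe_json_error(text):
    """Return None if text parses as JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno are 1-based positions of the first character that failed.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None
```

The parser stops at the first error, so a model output with several problems has to be fixed and re-checked one error at a time.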
PROMPT: Parse this image and return the table data as a JSON array of objects.

There are errors in the JSON code.
PROMPT: Parse table from image to Markdown

PROMPT: Please recognize all text in the table from the image.

σ_con

PROMPT: Parse table from image to JSON

σ_con

PROMPT: Parse this image and return the table data as a JSON array of objects.

[
{
"预应力筋种类": "钢绞线束",
"张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→1.03σ_con(锚固)",
"张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→σ_con(持荷2min锚固)",
"张拉程序": "其他锚具\n0→初应力→1.05σ_con(持荷2min)→σ_con(锚固)"
},
{
"预应力筋种类": "钢丝束",
"张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→1.03σ_con(锚固)",
"张拉程序": "对夹片式等具有自锚性能力的锚具\n0→初应力→σ_con(持荷2min锚固)",
"张拉程序": "其他锚具\n0→初应力→1.05σ_con(持荷2min)→0→σ_con(锚固)"
},
{
"预应力筋种类": "精扎螺纹钢",
"张拉程序": "直线配筋时\n0→初应力→σ_con(持荷2min锚固)",
"张拉程序": "曲线配筋时\n0→σ_con(持荷2min)→0(上述程序可反复几次)→初应力→σ_con(持荷2min锚固)"
}
]

σ_con

Incorrect recognition of special symbols like $\sigma$, con, and $\rightarrow$.

表 8.7.3-1 后张法预应力筋张拉程序表
<table border="1"><thead><tr><th colspan="2">预应力筋种类</th><th>张拉程序</th></tr></thead><tbody><tr><td rowspan="3">钢绞线束</td><td>对夹片式等具有自锚性能的锚具</td><td>普通松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.03 \sigma_{\text{con}}</math> (锚固)<br/>低松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td>其他锚具</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.05 \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow \sigma_{\text{con}}</math> (锚固)</td></tr><tr><td>对夹片式等具有自锚性能的锚具</td><td>普通松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.03 \sigma_{\text{con}}</math> (锚固)<br/>低松弛预应力筋 <math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td rowspan="2">钢丝束</td><td>其他锚具</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow 1.05 \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow 0 \rightarrow \sigma_{\text{con}}</math> (锚固)</td></tr><tr><td>直线配筋时</td><td><math>0 \rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr><tr><td rowspan="2">精轧螺纹钢筋</td><td>曲线配筋时</td><td><math>0 \rightarrow \sigma_{\text{con}}</math> (持荷 2min) <math>\rightarrow 0</math> (上述程序可反复几次) <math>\rightarrow</math> 初应力 <math>\rightarrow \sigma_{\text{con}}</math> (持荷 2min 锚固)</td></tr></tbody></table>
预应力筋种类<lcel><fcel>张拉程序<nl>
钢绞线束<fcel>对夹片式等具有自锚性能的锚具<fcel>普通松弛预应力筋0→初应力→1.03 σcon(锚固)低松弛预应力筋0→初应力→σcon(持荷2min锚固)<nl>
<fcel>其他锚具<fcel>0→初应力→1.05 σcon(持荷2min)→σcon(锚固)<nl>
钢丝束<fcel>对夹片式等具有自锚性能的锚具<fcel>普通松弛预应力筋0→初应力→1.03 σcon(锚固)低松弛预应力筋0→初应力→σcon(持荷2min锚固)<nl>
<fcel>其他锚具<fcel>0→初应力→1.05 σcon(持荷2min)→0→σcon(锚固)<nl>
精轧螺纹钢筋<fcel>直线配筋时<fcel>0→初应力→σcon(持荷2min锚固)<nl>
<ucel><fcel>曲线配筋时<fcel>0→σcon(持荷2min)→0(上述程序可反复几次)→初应力→σcon(持荷2min锚固)<nl>

("") for units, while Version 2 uses (" ) (including an extra space).

σcon
Failed to reconstruct the complex table hierarchy
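When a model emits HTML, hierarchy failures of this kind can often be caught mechanically by expanding every `rowspan` and `colspan` into a grid and checking that the grid is rectangular. The sketch below uses Python's standard `html.parser`. The names `SpanCollector` and `check_table_grid` are mine, and it assumes a single, non-nested table and ignores the `thead`/`tbody` boundary.

```python
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Record (rowspan, colspan) of every <td>/<th>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            attr = dict(attrs)
            self.rows[-1].append(
                (int(attr.get("rowspan") or 1), int(attr.get("colspan") or 1))
            )


def check_table_grid(html):
    """Expand spans into a grid and return a list of problems (empty if none)."""
    collector = SpanCollector()
    collector.feed(html)
    carry = {}   # column -> number of later rows still covered by a rowspan
    widths = []
    for cells in collector.rows:
        occupied = set(carry)  # columns already filled from the rows above
        carry = {c: n - 1 for c, n in carry.items() if n > 1}
        col = 0
        for rowspan, colspan in cells:
            while col in occupied:  # skip columns covered by an earlier rowspan
                col += 1
            for c in range(col, col + colspan):
                occupied.add(c)
                if rowspan > 1:
                    carry[c] = rowspan - 1
            col += colspan
        widths.append(max(occupied) + 1 if occupied else 0)
    problems = [
        f"row {i}: width {w}, expected {widths[0]}"
        for i, w in enumerate(widths)
        if w != widths[0]
    ]
    if carry:
        problems.append("a rowspan extends past the last row")
    return problems
```

A table whose `rowspan` values overlap or overrun, which is what happens when a model miscounts merged rows, shows up either as a width mismatch or as an overrun. A rectangular grid is necessary but not sufficient: a cell attached to the wrong parent category can still pass this check.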

# 表 8.7.3-1 后张法预应力筋张拉程序表
<table>
<thead>
<tr>
<th colspan="2">预应力筋种类</th>
<th>张拉程序</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="3">钢绞线束</td>
<td>对夹片式等具有自锚性能的锚具</td>
<td>普通松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$(锚固)<br/>低松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
</tr>
<tr>
<td>其他锚具</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow 1.05 \sigma_{con}$(持荷 2min)$\rightarrow \sigma_{con}$(锚固)</td>
</tr>
<tr>
<td rowspan="3">钢丝束</td>
<td>对夹片式等具有自锚性能的锚具</td>
<td>普通松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow 1.03 \sigma_{con}$(锚固)<br/>低松弛预应力筋 $0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
</tr>
<tr>
<td>其他锚具</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow 1.05 \sigma_{con}$(持荷 2min)$\rightarrow 0 \rightarrow \sigma_{con}$(锚固)</td>
</tr>
<tr>
<td rowspan="2">精轧螺纹钢筋</td>
<td>直线配筋时</td>
<td>$0 \rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
</tr>
<tr>
<td>曲线配筋时</td>
<td>$0 \rightarrow \sigma_{con}$(持荷 2min)$\rightarrow 0$(上述程序可反复几次)$\rightarrow$ 初应力 $\rightarrow \sigma_{con}$(持荷 2min 锚固)</td>
</tr>
</tbody>
</table>

\rightarrow

PROMPT: Parse table from image to Markdown

Failure to preserve merged cell structures.
PROMPT: Please recognize all text in the table from the image.

Failure to preserve merged cell structures
PROMPT: Parse table from image to JSON ✅

PROMPT: Parse this image and return the table data as a JSON array of objects.

Extract only the table content.
PROMPT: Parse table from image to Markdown

Extract only the table content.
PROMPT: Please recognize all text in the table from the image.

Extract only the table content.
PROMPT: Parse table from image to JSON

{
"table": [
{
"row": 1,
"columns": [
"等级",
"每千米高差全中误差(mm)",
"路线长度(km)",
"水准仪的型号",
"水准尺",
"与已知点联测",
"附合或环线",
"平地(mm)",
"山地(mm)"
]
},
{
"row": 2,
"columns": [
"二等",
"2",
"一",
"DS₁",
"因瓦",
"往返各一次",
"往返各一次",
"4√L",
"一"
],
"data": [
{
"等级": "二等",
"每千米高差全中误差(mm):": "2",
"路线长度(km):": "一",
"水准仪的型号": "DS₁",
"水准尺": "因瓦",
"与已知点联测": "往返各一次",
"附合或环线": "往返各一次",
"平地(mm):": "4√L",
"山地(mm):": "一"
},
{
"等级": "三等",
"每千米高差全中误差(mm):": "6",
"路线长度(km):": "≤50",
"水准仪的型号": "DS₁",
"水准尺": "因瓦",
"往返各一次": "往一次",
"往返各一次": "12√L",
"平地(mm):": "4√n",
"山地(mm):": "4√n"
},
{
"等级": "四等",
"每千米高差全中误差(mm):": "10",
"路线长度(km):": "≤16",
"水准仪的型号": "DS₃",
"水准尺": "双面",
"往返各一次": "往一次",
"往返各一次": "20√L",
"平地(mm):": "6√n",
"山地(mm):": "6√n"
}
]
]
]
}
Error at line#: 48 Duplicate key “往返各一次”
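Duplicate keys deserve a dedicated check, because Python's `json.loads` accepts them silently and keeps only the last value. A minimal sketch using the standard `object_pairs_hook` parameter follows; `find_duplicate_keys` is a name introduced here, and it assumes the text is otherwise syntactically valid JSON (malformed input raises `json.JSONDecodeError` first).

```python
import json


def find_duplicate_keys(text):
    """Return every key that repeats inside the same JSON object.

    One entry is reported per repeated occurrence, in parse order.
    """
    duplicates = []

    def hook(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    # The hook is called once for every JSON object, nested ones included.
    json.loads(text, object_pairs_hook=hook)
    return duplicates
```

For example, `find_duplicate_keys('{"往返各一次": "往一次", "往返各一次": "12√L"}')` returns `["往返各一次"]`, whereas a plain `json.loads` would quietly drop the first value.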
PROMPT: Parse this image and return the table data as a JSON array of objects.

Failed to reconstruct the complex table hierarchy
PROMPT: Parse table from image to Markdown

PROMPT: Please recognize all text in the table from the image.

PROMPT: Parse table from image to JSON

PROMPT: Parse this image and return the table data as a JSON array of objects.


Text recognition error


Content is incomplete.

表4.2.5-1水准测量的主要技术要求<lcel><lcel><lcel><lcel><lcel><lcel><lcel><lcel><nl>
等级<fcel>每千米高差中共中央误差(mm)<fcel>路线长度(km)<fcel>水准仪的型号<fcel>水准尺<fcel>观测次数<lcel><fcel>往返较差、附合或环闭合差<lcel><nl>
<ucel><ucel><ucel><ucel><ucel><fcel>与已知点联测<fcel>附合或环线<fcel>平地 (mm)<fcel>山地 (mm)<nl>
<fcel>二等<fcel>2<fcel>-<fcel>DS1<fcel>因瓦<fcel>往返各一次<fcel>往返各一次<fcel>4√L<fcel>-<nl>
<fcel>三等<fcel>6<fcel>≤50<fcel>DS1<fcel>因瓦<fcel>往返各一次<fcel>往一次<fcel>12√L<fcel>4√n<nl>
<ucel><ucel><ucel><fcel>DS3<fcel>双面<fcel>往返各一次<fcel>往返各一次<fcel>20√L<fcel>6√n<nl>
<fcel>四等<fcel>10<fcel>≤16<fcel>DS3<fcel>双面<ucel><fcel>往一次<ucel><ucel><nl>

Failed to reconstruct the complex table hierarchy

Failed to reconstruct the complex table hierarchy

Beyond Layout: 3 Deep Challenges in Table OCR:
Full-width vs. half-width characters, for example `(` vs `(`. If an OCR model can’t unify these encodings, it creates a “logic break” in downstream data cleaning and automated validation.