As discussed in a previous post, existing OCR benchmarks are not especially useful for discriminating between models on the kinds of documents that social scientists actually work with. Most benchmarks, like OmniDocBench v1.5, over-index on modern printed text, clean scans, and well-resourced languages. Handwritten census records, historical logbooks, degraded administrative forms, and other "messy" real-world data are not well represented.
socOCRbench is a small (private) benchmark designed with this gap in mind. It evaluates OCR models on samples across handwriting recognition, table extraction, and printed text recognition. The overall score is the mean of three metrics: NES (Normalized Edit Similarity), chrF (character n-gram F-score, computed on text), and TEDS (Tree Edit Distance Similarity, computed on tables). Each metric ranges from 0 to 1, where 1.0 is a perfect match.
You can read more about socOCRbench and the motivation behind it in the corresponding working paper.
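
To make the scoring concrete, here is a minimal Python sketch of how an NES value and the overall score could be computed. It rests on two assumptions that are not spelled out in this post: that NES is one minus the Levenshtein distance divided by the length of the longer string, and that the NES term in the overall score is the average of the NES Region and NES Format columns. The function names are illustrative only; the working paper has the exact definitions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_similarity(prediction: str, reference: str) -> float:
    """Assumed NES definition: 1 - distance / length of the longer string."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are treated as a perfect match
    return 1.0 - levenshtein(prediction, reference) / longest


def overall_score(nes: float, chrf: float, teds: float) -> float:
    """Unweighted mean of the three metrics, each on a 0-1 scale."""
    return (nes + chrf + teds) / 3


# Gemini 3 Pro (low), using values from the leaderboard table.
# Assumption: NES is the average of the NES Region and NES Format columns.
nes = (0.6888 + 0.6350) / 2
print(round(overall_score(nes, chrf=0.6479, teds=0.5650), 4))  # 0.6249
```

Under these assumptions, the sketch reproduces the 0.6249 overall score listed for Gemini 3 Pro (low) in the table below.
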
| Model (type, license) | socOCRbench | NES Region | NES W. Europe | NES E. Europe | NES E. Asia | NES S. Asia | NES MENA | NES Format | NES HW Text | NES Print Text | NES HW Table | chrF | chrF W. Europe | chrF E. Europe | chrF E. Asia | chrF S. Asia | chrF MENA | chrF HW Text | chrF Print Text | TEDS | $/M In | $/M Out |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro (low) VLM Proprietary | 0.6357 | 0.6577 | 0.6891 | 0.6605 | 0.5695 | 0.7315 | 0.6377 | 0.6450 | 0.6486 | 0.7022 | 0.5843 | 0.6054 | 0.6843 | 0.5302 | 0.5475 | 0.7428 | 0.5221 | 0.5752 | 0.6835 | 0.6502 | 2.00 | 12.00 |
| Gemini 3 Pro (low) VLM Proprietary | 0.6249 | 0.6888 | 0.7627 | 0.7210 | 0.5704 | 0.7355 | 0.6545 | 0.6350 | 0.6798 | 0.8011 | 0.4241 | 0.6479 | 0.8108 | 0.5612 | 0.5720 | 0.7662 | 0.5291 | 0.6013 | 0.7964 | 0.5650 | 2.00 | 12.00 |
| Gemini 3.1 Flash Lite (minimal) VLM Proprietary | 0.6214 | 0.6502 | 0.7428 | 0.6490 | 0.5266 | 0.7341 | 0.5987 | 0.6424 | 0.6391 | 0.7487 | 0.5395 | 0.5822 | 0.7505 | 0.4898 | 0.5095 | 0.7216 | 0.4393 | 0.5364 | 0.7312 | 0.6356 | 0.25 | 1.50 |
| Gemini 3.5 Flash (low) VLM Proprietary | 0.6096 | 0.6566 | 0.7163 | 0.6861 | 0.5427 | 0.6990 | 0.6388 | 0.6129 | 0.6493 | 0.7468 | 0.4424 | 0.6122 | 0.7564 | 0.5389 | 0.5149 | 0.7403 | 0.5104 | 0.5754 | 0.7358 | 0.5819 | 1.50 | 9.00 |
| Gemini 3.5 Flash (minimal) VLM Proprietary | 0.6022 | 0.6441 | 0.7081 | 0.6258 | 0.5594 | 0.7083 | 0.6190 | 0.6032 | 0.6268 | 0.7507 | 0.4321 | 0.6001 | 0.7490 | 0.4872 | 0.5316 | 0.7244 | 0.5084 | 0.5627 | 0.7282 | 0.5828 | 1.50 | 9.00 |
| Gemini 3 Flash (low) VLM Proprietary | 0.5995 | 0.6453 | 0.7063 | 0.6143 | 0.5446 | 0.7156 | 0.6456 | 0.6076 | 0.6392 | 0.7300 | 0.4537 | 0.6068 | 0.7707 | 0.4820 | 0.5211 | 0.7340 | 0.5261 | 0.5731 | 0.7378 | 0.5652 | 0.50 | 3.00 |
| Claude Sonnet 4.6 VLM Proprietary | 0.5980 | 0.5628 | 0.6938 | 0.6056 | 0.3389 | 0.6704 | 0.5052 | 0.5764 | 0.5358 | 0.7020 | 0.4914 | 0.5039 | 0.7070 | 0.4403 | 0.3760 | 0.6470 | 0.3492 | 0.4607 | 0.6674 | 0.7205 | 3.00 | 15.00 |
| Gemini 3 Flash (minimal) VLM Proprietary | 0.5920 | 0.6390 | 0.7287 | 0.6080 | 0.5227 | 0.7209 | 0.6145 | 0.6145 | 0.6334 | 0.7387 | 0.4712 | 0.5903 | 0.7766 | 0.4620 | 0.4889 | 0.7377 | 0.4863 | 0.5613 | 0.7258 | 0.5590 | 0.50 | 3.00 |
| Qwen3.7 Plus VLM Proprietary | 0.5830 | 0.6140 | 0.7118 | 0.5977 | 0.5656 | 0.7012 | 0.4936 | 0.5919 | 0.5906 | 0.7453 | 0.4399 | 0.5402 | 0.7220 | 0.4260 | 0.5565 | 0.6634 | 0.3330 | 0.4893 | 0.7052 | 0.6059 | 0.32 | 1.28 |
| Gemini 3.1 Flash Lite (low) VLM Proprietary | 0.5819 | 0.6359 | 0.7307 | 0.6373 | 0.5140 | 0.7203 | 0.5772 | 0.6137 | 0.6238 | 0.7475 | 0.4698 | 0.5767 | 0.7587 | 0.4865 | 0.5118 | 0.7054 | 0.4210 | 0.5295 | 0.7357 | 0.5443 | 0.25 | 1.50 |
| Qwen3.5 122B VLM Open Source | 0.5753 | 0.6078 | 0.7133 | 0.6014 | 0.5500 | 0.6860 | 0.4884 | 0.5929 | 0.5894 | 0.7342 | 0.4551 | 0.5398 | 0.7174 | 0.4232 | 0.5483 | 0.6462 | 0.3640 | 0.5013 | 0.6864 | 0.5858 | 0.40 | 3.20 |
| Seed 2.0 Pro VLM Proprietary | 0.5631 | 0.6010 | 0.6531 | 0.6272 | 0.5554 | 0.6993 | 0.4697 | 0.5643 | 0.5536 | 0.7354 | 0.4039 | 0.5513 | 0.6813 | 0.4707 | 0.5712 | 0.6911 | 0.3422 | 0.4894 | 0.7019 | 0.5554 | 0.47 | 2.37 |
| Qwen3.6 Plus VLM | 0.5623 | 0.5964 | 0.6784 | 0.5566 | 0.5368 | 0.7000 | 0.5102 | 0.5727 | 0.5821 | 0.7003 | 0.4355 | 0.5335 | 0.6827 | 0.4015 | 0.5301 | 0.6781 | 0.3753 | 0.4960 | 0.6608 | 0.5689 | 0.33 | 1.95 |
| Qwen3.5 397B VLM Open Source | 0.5616 | 0.6353 | 0.7399 | 0.6068 | 0.5781 | 0.7412 | 0.5105 | 0.6152 | 0.6140 | 0.7671 | 0.4644 | 0.5716 | 0.7468 | 0.4496 | 0.5705 | 0.7075 | 0.3838 | 0.5282 | 0.7219 | 0.4879 | 0.60 | 3.60 |
| Qwen3.5 Plus VLM Proprietary | 0.5576 | 0.6279 | 0.7299 | 0.5812 | 0.5693 | 0.7298 | 0.5293 | 0.6069 | 0.6101 | 0.7533 | 0.4574 | 0.5663 | 0.7388 | 0.4418 | 0.5699 | 0.7017 | 0.3789 | 0.5204 | 0.7186 | 0.4891 | 0.80 | 2.00 |
| Qwen3.5 397B (thinking) VLM Open Source | 0.5504 | 0.5935 | 0.6935 | 0.5623 | 0.5707 | 0.6698 | 0.4710 | 0.5809 | 0.5630 | 0.7321 | 0.4477 | 0.5436 | 0.7206 | 0.4135 | 0.5652 | 0.6700 | 0.3487 | 0.4892 | 0.7112 | 0.5204 | 0.60 | 3.60 |
| Qwen3 VL 235B VLM Open Source | 0.5478 | 0.5967 | 0.7115 | 0.5943 | 0.5540 | 0.6905 | 0.4330 | 0.6021 | 0.5627 | 0.7396 | 0.5040 | 0.5236 | 0.6893 | 0.4302 | 0.5447 | 0.6678 | 0.2859 | 0.4637 | 0.6908 | 0.5204 | 0.20 | 0.88 |
| Gemini 2.5 Flash VLM Proprietary | 0.5446 | 0.5833 | 0.6581 | 0.5489 | 0.4987 | 0.6978 | 0.5131 | 0.5720 | 0.5352 | 0.7219 | 0.4587 | 0.5471 | 0.7012 | 0.4054 | 0.5000 | 0.7128 | 0.4163 | 0.4785 | 0.7198 | 0.5091 | 0.30 | 2.50 |
| Qwen3.5 27B VLM Open Source | 0.5417 | 0.5926 | 0.6984 | 0.5574 | 0.5615 | 0.6686 | 0.4770 | 0.5786 | 0.5831 | 0.7062 | 0.4465 | 0.5155 | 0.6953 | 0.3946 | 0.5460 | 0.5998 | 0.3418 | 0.4841 | 0.6548 | 0.5242 | 0.30 | 2.40 |
| Claude Opus 4.6 VLM Proprietary | 0.5415 | 0.5568 | 0.6827 | 0.6021 | 0.3496 | 0.6569 | 0.4927 | 0.5537 | 0.5296 | 0.7038 | 0.4279 | 0.5057 | 0.7109 | 0.4527 | 0.3853 | 0.6499 | 0.3295 | 0.4568 | 0.6781 | 0.5637 | 5.00 | 25.00 |
| Qwen3.5 Plus (2026-04-20) VLM Proprietary | 0.5392 | 0.5802 | 0.6988 | 0.5149 | 0.5561 | 0.6294 | 0.5019 | 0.5756 | 0.5700 | 0.7028 | 0.4539 | 0.5246 | 0.7179 | 0.3772 | 0.5347 | 0.6216 | 0.3718 | 0.4897 | 0.6758 | 0.5150 | 0.30 | 1.80 |
| Gemini 2.0 Flash VLM Proprietary | 0.5295 | 0.5777 | 0.6356 | 0.5577 | 0.4815 | 0.6977 | 0.5162 | 0.5562 | 0.5218 | 0.7192 | 0.4276 | 0.5342 | 0.6656 | 0.4104 | 0.4772 | 0.7100 | 0.4080 | 0.4516 | 0.7132 | 0.4872 | 0.10 | 0.40 |
| Datalab (accurate) VLM Proprietary | 0.5213 | 0.5018 | 0.5394 | 0.5543 | 0.4674 | 0.6355 | 0.3122 | 0.4413 | 0.4155 | 0.6943 | 0.2141 | 0.4961 | 0.6127 | 0.4307 | 0.5082 | 0.6941 | 0.2347 | 0.3972 | 0.6874 | 0.5962 | 8.65 | 8.65 |
| Datalab (balanced) VLM Proprietary | 0.5167 | 0.5146 | 0.5440 | 0.5703 | 0.4813 | 0.6413 | 0.3362 | 0.4491 | 0.4421 | 0.6831 | 0.2221 | 0.4971 | 0.5986 | 0.4207 | 0.5221 | 0.6990 | 0.2451 | 0.4018 | 0.6750 | 0.5713 | 5.91 | 5.91 |
| Seed 2.0 Lite VLM Proprietary | 0.5160 | 0.5747 | 0.6447 | 0.5948 | 0.5307 | 0.6022 | 0.5013 | 0.5347 | 0.5592 | 0.6854 | 0.3594 | 0.5417 | 0.6835 | 0.4465 | 0.5361 | 0.6561 | 0.3865 | 0.4977 | 0.6755 | 0.4516 | 0.09 | 0.53 |
| Datalab (fast) VLM Proprietary | 0.5124 | 0.5151 | 0.5456 | 0.5856 | 0.4780 | 0.6374 | 0.3288 | 0.4492 | 0.4419 | 0.6858 | 0.2199 | 0.4952 | 0.5940 | 0.4289 | 0.5166 | 0.6964 | 0.2400 | 0.3988 | 0.6731 | 0.5598 | 5.87 | 5.87 |
| Mistral OCR 4 VLM Proprietary | 0.5055 | 0.5608 | 0.6790 | 0.5485 | 0.4889 | 0.6334 | 0.4545 | 0.5701 | 0.5203 | 0.7159 | 0.4740 | 0.5002 | 0.6670 | 0.3831 | 0.4430 | 0.6456 | 0.3622 | 0.4332 | 0.6783 | 0.4509 | 3.70 | 3.70 |
| Seed 2.0 Mini VLM Proprietary | 0.4974 | 0.5448 | 0.5831 | 0.5323 | 0.5283 | 0.6261 | 0.4542 | 0.5041 | 0.5082 | 0.6557 | 0.3483 | 0.5244 | 0.6198 | 0.3937 | 0.5614 | 0.6872 | 0.3600 | 0.4641 | 0.6532 | 0.4433 | 0.10 | 0.40 |
| Qwen3.6 Flash VLM Open Source | 0.4882 | 0.5347 | 0.6561 | 0.4680 | 0.5234 | 0.6008 | 0.4251 | 0.5392 | 0.5191 | 0.6603 | 0.4381 | 0.4669 | 0.6484 | 0.3220 | 0.5083 | 0.5515 | 0.3043 | 0.4333 | 0.6105 | 0.4609 | 0.19 | 1.12 |
| dots.ocr 1.5 VLM Open Source | 0.4778 | 0.5249 | 0.6408 | 0.5801 | 0.4369 | 0.6913 | 0.2754 | 0.4387 | 0.4705 | 0.7549 | 0.0907 | 0.4956 | 0.7172 | 0.4244 | 0.4757 | 0.6861 | 0.1746 | 0.4067 | 0.7308 | 0.4560 | 0.03 | 0.03 |
| Qwen3.5 Flash VLM Open Source | 0.4778 | 0.5893 | 0.6830 | 0.5540 | 0.5463 | 0.6840 | 0.4790 | 0.5489 | 0.5758 | 0.7145 | 0.3563 | 0.5289 | 0.7096 | 0.3979 | 0.5624 | 0.6294 | 0.3453 | 0.4890 | 0.6797 | 0.3353 | 0.10 | 0.40 |
| Kimi K2.5 VLM Proprietary | 0.4775 | 0.5080 | 0.6847 | 0.5107 | 0.4367 | 0.5512 | 0.3569 | 0.5275 | 0.4814 | 0.6899 | 0.4113 | 0.4456 | 0.6989 | 0.3877 | 0.4128 | 0.4852 | 0.2437 | 0.4143 | 0.6266 | 0.4689 | 0.40 | 1.90 |
| Qwen3 VL 8B VLM Open Source | 0.4725 | 0.5233 | 0.6322 | 0.5469 | 0.4940 | 0.6270 | 0.3165 | 0.4794 | 0.4803 | 0.7068 | 0.2512 | 0.4533 | 0.6434 | 0.3780 | 0.4921 | 0.5515 | 0.2014 | 0.3996 | 0.6277 | 0.4628 | 0.08 | 0.50 |
| Qwen3.5 35B VLM Open Source | 0.4720 | 0.5898 | 0.6859 | 0.5496 | 0.5479 | 0.6885 | 0.4772 | 0.5472 | 0.5703 | 0.7275 | 0.3440 | 0.5288 | 0.7187 | 0.3877 | 0.5587 | 0.6348 | 0.3442 | 0.4868 | 0.6874 | 0.3186 | 0.25 | 2.00 |
| Step 3.7 Flash (medium) VLM Proprietary | 0.4668 | 0.4705 | 0.6492 | 0.5121 | 0.4277 | 0.4794 | 0.2842 | 0.5016 | 0.4441 | 0.6477 | 0.4129 | 0.4237 | 0.6580 | 0.3858 | 0.4550 | 0.4002 | 0.2197 | 0.4002 | 0.5871 | 0.4906 | 0.20 | 1.15 |
| GPT-5.5 (auto res., med. reason.) VLM Proprietary | 0.4609 | 0.5469 | 0.6725 | 0.5804 | 0.4864 | 0.5772 | 0.4178 | 0.5302 | 0.5187 | 0.7102 | 0.3617 | 0.5126 | 0.7124 | 0.4534 | 0.4856 | 0.5838 | 0.3276 | 0.4596 | 0.6927 | 0.3315 | 5.00 | 30.00 |
| MiniMax M3 VLM Open Source | 0.4605 | 0.4482 | 0.5544 | 0.4943 | 0.3544 | 0.5128 | 0.3250 | 0.4537 | 0.4187 | 0.5778 | 0.3647 | 0.4021 | 0.6256 | 0.3558 | 0.3385 | 0.4601 | 0.2308 | 0.3660 | 0.5722 | 0.5284 | 0.30 | 1.20 |
| Infinity-Parser2 Pro VLM Open Source | 0.4592 | 0.5191 | 0.6586 | 0.5125 | 0.4711 | 0.5942 | 0.3589 | 0.4559 | 0.4963 | 0.7157 | 0.1558 | 0.4815 | 0.7421 | 0.3756 | 0.4867 | 0.5444 | 0.2586 | 0.4331 | 0.6898 | 0.4086 | / | / |
| dots.ocr VLM Open Source | 0.4588 | 0.5159 | 0.5773 | 0.5230 | 0.4955 | 0.6040 | 0.3799 | 0.4143 | 0.4726 | 0.6981 | 0.0723 | 0.4718 | 0.6320 | 0.3856 | 0.5012 | 0.5914 | 0.2489 | 0.4006 | 0.6535 | 0.4393 | 0.03 | 0.03 |
| Llama 4 Maverick VLM Open Source | 0.4501 | 0.4585 | 0.5922 | 0.4831 | 0.1957 | 0.6497 | 0.3717 | 0.4980 | 0.3890 | 0.6400 | 0.4651 | 0.4200 | 0.5613 | 0.3448 | 0.2668 | 0.6231 | 0.3041 | 0.3347 | 0.6055 | 0.4521 | 0.15 | 0.60 |
| Surya OCR 2 (quantized) VLM Open Source | 0.4476 | 0.5801 | 0.6147 | 0.5800 | 0.5105 | 0.6653 | 0.5299 | 0.4704 | 0.5543 | 0.7180 | 0.1389 | 0.5281 | 0.6563 | 0.4121 | 0.5171 | 0.6846 | 0.3707 | 0.4607 | 0.6851 | 0.2895 | / | / |
| Mistral OCR VLM Proprietary | 0.4467 | 0.4819 | 0.6288 | 0.4404 | 0.2473 | 0.6114 | 0.4815 | 0.5104 | 0.4275 | 0.6666 | 0.4372 | 0.3843 | 0.5985 | 0.3076 | 0.1363 | 0.5739 | 0.3052 | 0.3070 | 0.5993 | 0.4597 | 3.07 | 3.07 |
| GPT-5 Mini (min. reason.) VLM Proprietary | 0.4450 | 0.5065 | 0.6369 | 0.5220 | 0.4056 | 0.5430 | 0.4252 | 0.5147 | 0.4872 | 0.6437 | 0.4133 | 0.4321 | 0.6422 | 0.3592 | 0.4027 | 0.4755 | 0.2807 | 0.4057 | 0.5825 | 0.3922 | 0.25 | 2.00 |
| ERNIE 4.5 VL 424B VLM Proprietary | 0.4448 | 0.4575 | 0.5868 | 0.4170 | 0.4901 | 0.4971 | 0.2964 | 0.4932 | 0.4373 | 0.5771 | 0.4653 | 0.3830 | 0.5466 | 0.2874 | 0.4864 | 0.3887 | 0.2057 | 0.3781 | 0.4806 | 0.4762 | 0.42 | 1.25 |
| Qwen3.5 9B VLM Open Source | 0.4269 | 0.4492 | 0.6242 | 0.3933 | 0.4587 | 0.5438 | 0.2260 | 0.4731 | 0.4052 | 0.6498 | 0.3644 | 0.3920 | 0.6337 | 0.2803 | 0.4355 | 0.4660 | 0.1446 | 0.3435 | 0.5896 | 0.4276 | 0.04 | 0.15 |
| Qwen3.5 4B VLM Open Source | 0.4264 | 0.4840 | 0.6218 | 0.5122 | 0.4395 | 0.5380 | 0.3086 | 0.4962 | 0.4522 | 0.6421 | 0.3943 | 0.3991 | 0.6045 | 0.3426 | 0.4386 | 0.4260 | 0.1840 | 0.3603 | 0.5654 | 0.3901 | / | / |
| Qwen3 VL 30B VLM Open Source | 0.4261 | 0.5129 | 0.6461 | 0.5245 | 0.4915 | 0.5930 | 0.3097 | 0.5053 | 0.4620 | 0.7074 | 0.3466 | 0.4438 | 0.6439 | 0.3739 | 0.4785 | 0.5372 | 0.1852 | 0.3788 | 0.6398 | 0.3253 | 0.13 | 0.52 |
| Qwen3.6 27B VLM Open Source | 0.3949 | 0.4414 | 0.5113 | 0.3387 | 0.5042 | 0.5395 | 0.3131 | 0.4383 | 0.3949 | 0.5705 | 0.3496 | 0.3781 | 0.5066 | 0.2405 | 0.4758 | 0.4636 | 0.2040 | 0.3260 | 0.5186 | 0.3669 | 0.32 | 3.20 |
| PaddleOCR-VL-1.6 VLM Open Source | 0.3944 | 0.4332 | 0.5801 | 0.4138 | 0.5291 | 0.4139 | 0.2293 | 0.3728 | 0.3998 | 0.6584 | 0.0602 | 0.4007 | 0.6257 | 0.2892 | 0.5870 | 0.3662 | 0.1355 | 0.3326 | 0.6237 | 0.3796 | 0.03 | 0.03 |
| GPT-5.5 (high res., low reason.) VLM Proprietary | 0.3924 | 0.5328 | 0.6309 | 0.5062 | 0.5294 | 0.5974 | 0.4001 | 0.5034 | 0.5045 | 0.6790 | 0.3268 | 0.4841 | 0.6515 | 0.3918 | 0.5043 | 0.6090 | 0.2640 | 0.4316 | 0.6431 | 0.1749 | 5.00 | 30.00 |
| Qwen3.6 35B VLM Open Source | 0.3896 | 0.4427 | 0.5841 | 0.3174 | 0.4503 | 0.5362 | 0.3256 | 0.4377 | 0.4414 | 0.5704 | 0.3015 | 0.3950 | 0.6052 | 0.2210 | 0.4353 | 0.4878 | 0.2254 | 0.3756 | 0.5342 | 0.3335 | 0.15 | 1.00 |
| OlmOCR-2 VLM Open Source | 0.3875 | 0.4850 | 0.6267 | 0.4742 | 0.4277 | 0.5938 | 0.3026 | 0.4469 | 0.4433 | 0.6923 | 0.2052 | 0.4296 | 0.6731 | 0.3418 | 0.4430 | 0.5240 | 0.1660 | 0.3733 | 0.6373 | 0.2668 | 0.09 | 0.19 |
| Infinity-Parser2 Flash VLM Open Source | 0.3867 | 0.4740 | 0.6354 | 0.4820 | 0.4209 | 0.5450 | 0.2870 | 0.4264 | 0.4473 | 0.6858 | 0.1460 | 0.4060 | 0.6851 | 0.3375 | 0.4062 | 0.4472 | 0.1540 | 0.3603 | 0.6215 | 0.3040 | / | / |
| Gemma 3 27B VLM Open Source | 0.3844 | 0.3640 | 0.4775 | 0.4097 | 0.2436 | 0.4269 | 0.2626 | 0.4109 | 0.3176 | 0.4976 | 0.4176 | 0.3039 | 0.4661 | 0.3088 | 0.1579 | 0.3811 | 0.2054 | 0.2679 | 0.4380 | 0.4618 | / | / |
| Qwen3.5 2B VLM Open Source | 0.3827 | 0.4618 | 0.5982 | 0.4505 | 0.4283 | 0.5314 | 0.3008 | 0.4622 | 0.4313 | 0.6245 | 0.3309 | 0.3818 | 0.5819 | 0.3022 | 0.4029 | 0.4073 | 0.2146 | 0.3544 | 0.5295 | 0.3044 | / | / |
| FireRed-OCR VLM Open Source | 0.3782 | 0.3912 | 0.5607 | 0.3649 | 0.4135 | 0.4645 | 0.1524 | 0.3665 | 0.3440 | 0.6257 | 0.1297 | 0.3445 | 0.5876 | 0.2543 | 0.4105 | 0.3992 | 0.0708 | 0.2807 | 0.5652 | 0.4114 | / | / |
| PaddleOCR-VL-1.5 VLM Proprietary | 0.3733 | 0.4172 | 0.5503 | 0.3822 | 0.5033 | 0.4134 | 0.2366 | 0.3622 | 0.3647 | 0.6539 | 0.0681 | 0.3801 | 0.5975 | 0.2779 | 0.5358 | 0.3596 | 0.1297 | 0.3020 | 0.6120 | 0.3502 | 0.03 | 0.03 |
| GLM-OCR VLM Open Source | 0.3679 | 0.3776 | 0.6155 | 0.4788 | 0.4891 | 0.2522 | 0.0522 | 0.3584 | 0.3792 | 0.6095 | 0.0867 | 0.3272 | 0.6625 | 0.3028 | 0.4739 | 0.1768 | 0.0200 | 0.3133 | 0.5401 | 0.4085 | 0.03 | 0.03 |
| Gemma 4 31B VLM | 0.3643 | 0.4447 | 0.4969 | 0.4005 | 0.3331 | 0.5558 | 0.4371 | 0.4174 | 0.4143 | 0.5479 | 0.2899 | 0.3582 | 0.4864 | 0.2695 | 0.2704 | 0.4873 | 0.2776 | 0.3153 | 0.4819 | 0.3036 | 0.12 | 0.37 |
| Gemma 4 26B VLM | 0.3560 | 0.3702 | 0.4524 | 0.3352 | 0.2158 | 0.4989 | 0.3489 | 0.3883 | 0.3191 | 0.4989 | 0.3469 | 0.2992 | 0.4331 | 0.2191 | 0.1580 | 0.4464 | 0.2391 | 0.2441 | 0.4419 | 0.3896 | 0.06 | 0.33 |
| GPT-5.4 (low res., med. reason.) VLM Proprietary | 0.3486 | 0.5151 | 0.6240 | 0.5321 | 0.4656 | 0.5788 | 0.3750 | 0.4788 | 0.5037 | 0.6492 | 0.2834 | 0.4539 | 0.6489 | 0.3775 | 0.4493 | 0.5428 | 0.2509 | 0.4249 | 0.5969 | 0.0948 | 2.50 | 15.00 |
| GPT-5.4 (high res., med. reason.) VLM Proprietary | 0.3406 | 0.5342 | 0.6309 | 0.5476 | 0.5024 | 0.5986 | 0.3915 | 0.4922 | 0.5196 | 0.6662 | 0.2909 | 0.4641 | 0.6572 | 0.3926 | 0.4588 | 0.5633 | 0.2484 | 0.4275 | 0.6159 | 0.0444 | 2.50 | 15.00 |
| Qwen-VL-OCR VLM Proprietary | 0.3366 | 0.4702 | 0.6062 | 0.5184 | 0.3844 | 0.5906 | 0.2513 | 0.4419 | 0.4207 | 0.6751 | 0.2298 | 0.4019 | 0.6093 | 0.3491 | 0.3988 | 0.5117 | 0.1406 | 0.3315 | 0.6081 | 0.1519 | 0.07 | 0.16 |
| GPT-5.4 (auto res., med. reason.) VLM Proprietary | 0.3352 | 0.5213 | 0.6391 | 0.5543 | 0.4520 | 0.5870 | 0.3739 | 0.4861 | 0.5127 | 0.6579 | 0.2877 | 0.4560 | 0.6570 | 0.3953 | 0.4438 | 0.5551 | 0.2286 | 0.4246 | 0.6047 | 0.0458 | 2.50 | 15.00 |
| ERNIE 4.5 VL 28B VLM Proprietary | 0.3321 | 0.4077 | 0.6119 | 0.2804 | 0.4788 | 0.4005 | 0.2667 | 0.4453 | 0.4300 | 0.5353 | 0.3706 | 0.3200 | 0.5664 | 0.1318 | 0.4754 | 0.2346 | 0.1916 | 0.3532 | 0.4168 | 0.2498 | 0.14 | 0.56 |
| GPT-5.4 (high res., low reason.) VLM Proprietary | 0.3261 | 0.5137 | 0.6220 | 0.5546 | 0.4673 | 0.5507 | 0.3738 | 0.4762 | 0.5063 | 0.6437 | 0.2787 | 0.4526 | 0.6389 | 0.4056 | 0.4464 | 0.5419 | 0.2302 | 0.4221 | 0.5925 | 0.0307 | 2.50 | 15.00 |
| GPT-5.4 (orig. res., med. reason.) VLM Proprietary | 0.3196 | 0.4963 | 0.6403 | 0.5473 | 0.3635 | 0.5677 | 0.3626 | 0.4697 | 0.4887 | 0.6484 | 0.2719 | 0.4236 | 0.6540 | 0.3845 | 0.3304 | 0.5223 | 0.2269 | 0.3921 | 0.5882 | 0.0522 | 2.50 | 15.00 |
| Mistral Small 2603 VLM | 0.3187 | 0.2729 | 0.4825 | 0.3551 | 0.0575 | 0.3246 | 0.1449 | 0.3550 | 0.2248 | 0.4712 | 0.3689 | 0.2312 | 0.4776 | 0.2728 | 0.0378 | 0.2868 | 0.0808 | 0.1772 | 0.4374 | 0.4110 | 0.15 | 0.60 |
| GPT-5.2 (low res., med. reason.) VLM Proprietary | 0.3132 | 0.4488 | 0.5790 | 0.4787 | 0.3031 | 0.5686 | 0.3147 | 0.4349 | 0.4148 | 0.6160 | 0.2738 | 0.4020 | 0.5920 | 0.3492 | 0.3295 | 0.5153 | 0.2238 | 0.3577 | 0.5618 | 0.0959 | 1.75 | 14.00 |
| GPT-5.2 (high res., med. reason.) VLM Proprietary | 0.3097 | 0.4407 | 0.5701 | 0.4467 | 0.3318 | 0.5460 | 0.3090 | 0.4350 | 0.4039 | 0.6075 | 0.2935 | 0.3968 | 0.5759 | 0.3301 | 0.3698 | 0.5122 | 0.1958 | 0.3461 | 0.5597 | 0.0943 | 1.75 | 14.00 |
| Qwen3.5 0.8B VLM Open Source | 0.3043 | 0.3851 | 0.5220 | 0.3983 | 0.3692 | 0.4353 | 0.2008 | 0.3864 | 0.3421 | 0.5674 | 0.2496 | 0.3080 | 0.5137 | 0.2680 | 0.3166 | 0.3090 | 0.1329 | 0.2764 | 0.4656 | 0.2190 | / | / |
| GPT-5.2 (auto res., med. reason.) VLM Proprietary | 0.3035 | 0.4509 | 0.5827 | 0.4853 | 0.3225 | 0.5388 | 0.3250 | 0.4371 | 0.4329 | 0.5977 | 0.2807 | 0.4113 | 0.5887 | 0.3470 | 0.3669 | 0.5111 | 0.2428 | 0.3752 | 0.5537 | 0.0553 | 1.75 | 14.00 |
| OCRVerse VLM Open Source | 0.3020 | 0.2820 | 0.5360 | 0.2086 | 0.3801 | 0.2721 | 0.0133 | 0.2906 | 0.2940 | 0.4886 | 0.0892 | 0.2625 | 0.5627 | 0.1124 | 0.4043 | 0.2315 | 0.0015 | 0.2612 | 0.4330 | 0.3572 | / | / |
| Nemotron-3-Nano-Omni VLM | 0.2973 | 0.3138 | 0.5214 | 0.3297 | 0.2999 | 0.2937 | 0.1244 | 0.3439 | 0.2979 | 0.5047 | 0.2292 | 0.2638 | 0.5165 | 0.2125 | 0.3247 | 0.1892 | 0.0759 | 0.2564 | 0.4180 | 0.2992 | ? | ? |
| GPT-5.4 Mini (auto res., no reason.) VLM | 0.2764 | 0.4385 | 0.5998 | 0.4999 | 0.2594 | 0.5196 | 0.3139 | 0.4416 | 0.4087 | 0.6182 | 0.2979 | 0.3737 | 0.6007 | 0.3460 | 0.2888 | 0.4364 | 0.1965 | 0.3371 | 0.5456 | 0.0155 | 0.75 | 4.50 |
| GPT-5 Nano (min. reason.) VLM Proprietary | 0.2742 | 0.2799 | 0.4491 | 0.3653 | 0.1649 | 0.2459 | 0.1740 | 0.3291 | 0.2567 | 0.4337 | 0.2970 | 0.2238 | 0.4327 | 0.2418 | 0.1196 | 0.1930 | 0.1318 | 0.2027 | 0.3692 | 0.2944 | 0.05 | 0.40 |
| Qianfan-OCR Fast Classical OCR Open Source | 0.2683 | 0.4244 | 0.4748 | 0.4420 | 0.5338 | 0.4336 | 0.2379 | 0.3505 | 0.3897 | 0.5711 | 0.0908 | 0.3405 | 0.4789 | 0.2884 | 0.4951 | 0.3120 | 0.1280 | 0.3051 | 0.4687 | 0.0768 | / | / |
| Gemma 3 4B VLM Open Source | 0.2596 | 0.2390 | 0.4000 | 0.2782 | 0.0522 | 0.2726 | 0.1918 | 0.3215 | 0.2233 | 0.3472 | 0.3939 | 0.1920 | 0.3746 | 0.2064 | 0.0303 | 0.2063 | 0.1425 | 0.1837 | 0.3023 | 0.3066 | / | / |
| LightOnOCR-2 VLM Open Source | 0.2587 | 0.2708 | 0.3978 | 0.3282 | 0.1161 | 0.3201 | 0.1915 | 0.2733 | 0.2062 | 0.4748 | 0.1389 | 0.2430 | 0.4263 | 0.2227 | 0.1127 | 0.3318 | 0.1215 | 0.1715 | 0.4383 | 0.2610 | / | / |
| GPT-5 Mini (med. reason.) VLM Proprietary | 0.2550 | 0.3845 | 0.5529 | 0.3737 | 0.2781 | 0.5596 | 0.1583 | 0.3886 | 0.3744 | 0.5344 | 0.2571 | 0.3347 | 0.5708 | 0.2406 | 0.2728 | 0.4968 | 0.0925 | 0.3135 | 0.4847 | 0.0438 | 0.25 | 2.00 |
| GPT-5.4 Nano (auto res., no reason.) VLM | 0.1939 | 0.2693 | 0.4787 | 0.3908 | 0.0923 | 0.2004 | 0.1841 | 0.3271 | 0.2597 | 0.4349 | 0.2865 | 0.2187 | 0.4607 | 0.2729 | 0.0709 | 0.1623 | 0.1266 | 0.1985 | 0.3820 | 0.0650 | 0.20 | 1.25 |
| Grok 4.2 Fast VLM Proprietary | 0.1937 | 0.2452 | 0.2988 | 0.2651 | 0.2303 | 0.3076 | 0.1240 | 0.2407 | 0.2048 | 0.3514 | 0.1657 | 0.2065 | 0.3018 | 0.1609 | 0.2227 | 0.2550 | 0.0918 | 0.1793 | 0.2945 | 0.1317 | 2.00 | 10.00 |
| DeepSeek-OCR2 VLM Open Source | 0.1759 | 0.2034 | 0.3637 | 0.3022 | 0.0420 | 0.1277 | 0.1815 | 0.2675 | 0.1711 | 0.3559 | 0.2755 | 0.1550 | 0.3519 | 0.1852 | 0.0252 | 0.0971 | 0.1157 | 0.1273 | 0.3047 | 0.1373 | 0.03 | 0.03 |
| MinerU2.5-Pro VLM Open Source | 0.1647 | 0.2222 | 0.3868 | 0.2127 | 0.2240 | 0.2686 | 0.0186 | 0.2473 | 0.1880 | 0.4005 | 0.1533 | 0.2066 | 0.3952 | 0.1258 | 0.2622 | 0.2453 | 0.0044 | 0.1650 | 0.3671 | 0.0529 | / | / |
| Nemotron-Nano-12B VLM Open Source | 0.1296 | 0.1464 | 0.3528 | 0.1813 | 0.0265 | 0.1078 | 0.0637 | 0.2225 | 0.1333 | 0.3017 | 0.2324 | 0.1209 | 0.3592 | 0.1159 | 0.0279 | 0.0760 | 0.0254 | 0.1109 | 0.2681 | 0.0835 | 0.20 | 0.20 |
| PP-OCRv5 Classical OCR Proprietary | 0.1170 | 0.1623 | 0.4115 | 0.1980 | 0.1372 | 0.0448 | 0.0203 | 0.2138 | 0.1932 | 0.3132 | 0.1349 | 0.1630 | 0.4269 | 0.0427 | 0.3367 | 0.0086 | 0.0002 | 0.1853 | 0.2880 | 0.0000 | 0.03 | 0.03 |
| PaddleOCR-VL-0.9B VLM Open Source | 0.1083 | 0.1422 | 0.3165 | 0.1529 | 0.0917 | 0.0686 | 0.0815 | 0.1986 | 0.1343 | 0.2778 | 0.1838 | 0.1007 | 0.3003 | 0.0738 | 0.0839 | 0.0269 | 0.0187 | 0.0857 | 0.2362 | 0.0537 | 0.13 | 0.13 |
| Tesseract v5 Classical OCR Open Source | 0.0980 | 0.1461 | 0.4004 | 0.0868 | 0.0434 | 0.1023 | 0.0979 | 0.2039 | 0.1349 | 0.3568 | 0.1199 | 0.0813 | 0.3941 | 0.0035 | 0.0005 | 0.0080 | 0.0004 | 0.0549 | 0.2956 | 0.0376 | / | / |
| DeepSeek-OCR VLM Open Source | 0.0855 | 0.1033 | 0.2182 | 0.1080 | 0.0295 | 0.0771 | 0.0835 | 0.1601 | 0.0947 | 0.1824 | 0.2032 | 0.0738 | 0.2172 | 0.0577 | 0.0212 | 0.0443 | 0.0288 | 0.0663 | 0.1650 | 0.0510 | 0.03 | 0.03 |
| Molmo 2 8B VLM Open Source | 0.0741 | 0.0939 | 0.2623 | 0.0739 | 0.0069 | 0.0721 | 0.0542 | 0.1625 | 0.0941 | 0.2001 | 0.1934 | 0.0544 | 0.2313 | 0.0167 | 0.0009 | 0.0205 | 0.0025 | 0.0488 | 0.1619 | 0.0396 | / | / |
| GPT-5 Nano (med. reason.) VLM Proprietary | 0.0288 | 0.0451 | 0.1253 | 0.0321 | 0.0049 | 0.0373 | 0.0259 | 0.0640 | 0.0513 | 0.0958 | 0.0450 | 0.0240 | 0.1153 | 0.0017 | 0.0002 | 0.0028 | 0.0001 | 0.0224 | 0.0779 | 0.0077 | 0.05 | 0.40 |

| Model (size, type, license) | socOCRbench | Region | Europe | E. Asia | S. Asia | SE Asia | MENA | E. Africa | Period | Pre-mod. | Historical | Contemp. | Format | HW Text | Print Text | Print Tbl | HW Tbl | $/M In | $/M Out |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro (low) VLM Proprietary | 0.7009 | 0.7009 | 0.7070 | 0.6447 | 0.7189 | 0.7136 | 0.7204 | 0.6654 | 0.6952 | 0.5641 | 0.6762 | 0.8271 | 0.7024 | 0.6703 | 0.9415 | 0.8295 | 0.3561 | 2.00 | 12.00 |
| Gemini 3 Pro (low) VLM Proprietary | 0.7134 | 0.7134 | 0.6721 | 0.6908 | 0.7544 | 0.7463 | 0.7033 | 0.5976 | 0.6781 | 0.5882 | 0.6307 | 0.8107 | 0.6706 | 0.6551 | 0.9440 | 0.7914 | 0.2862 | 2.00 | 12.00 |
| Gemini 3 Flash (low) VLM Proprietary | 0.6967 | 0.6967 | 0.6754 | 0.6587 | 0.7460 | 0.6632 | 0.7400 | 0.6643 | 0.6738 | 0.5635 | 0.6400 | 0.8165 | 0.6736 | 0.6572 | 0.9218 | 0.8232 | 0.2916 | 0.50 | 3.00 |
| Gemini 3 Pro (high) VLM Proprietary | 0.6894 | 0.6894 | 0.6783 | 0.7297 | 0.6642 | 0.7088 | 0.6662 | 0.6650 | 0.6757 | 0.5682 | 0.6352 | 0.8189 | 0.6746 | 0.6586 | 0.9177 | 0.8472 | 0.2691 | 2.00 | 12.00 |
| Gemini 3 Flash (minimal) VLM Proprietary | 0.6903 | 0.6903 | 0.6700 | 0.6606 | 0.7571 | 0.6743 | 0.6893 | 0.6575 | 0.6705 | 0.5788 | 0.6325 | 0.7945 | 0.6619 | 0.6487 | 0.9253 | 0.7802 | 0.2946 | 0.50 | 3.00 |
| Gemini 3 Flash (high) VLM Proprietary | 0.6686 | 0.6686 | 0.6574 | 0.6451 | 0.7455 | 0.6079 | 0.6871 | 0.7298 | 0.6571 | 0.5689 | 0.6120 | 0.7782 | 0.6217 | 0.6450 | 0.9169 | 0.7360 | 0.1943 | 0.50 | 3.00 |
| Gemini 3.1 Flash Lite VLM Proprietary | 0.6606 | 0.6606 | 0.6551 | 0.6735 | 0.6492 | 0.7143 | 0.6109 | 0.6146 | 0.6425 | 0.5206 | 0.6018 | 0.8201 | 0.6737 | 0.6216 | 0.8842 | 0.8879 | 0.3088 | 0.25 | 1.50 |
| Nano Banana 2 VLM Proprietary | 0.5635 | 0.5635 | 0.6108 | 0.5993 | 0.5712 | 0.6090 | 0.4271 | 0.5348 | 0.5748 | 0.4408 | 0.5524 | 0.7446 | 0.6291 | 0.5483 | 0.8402 | 0.8738 | 0.2525 | 0.50 | 3.00 |
| Gemini 2.0 Flash VLM Proprietary | 0.6309 | 0.6309 | 0.6117 | 0.6283 | 0.7427 | 0.5819 | 0.5898 | 0.2326 | 0.5827 | 0.4916 | 0.5693 | 0.7055 | 0.6372 | 0.5339 | 0.8704 | 0.8543 | 0.2711 | 0.10 | 0.40 |
| Claude Sonnet 4.6 VLM Proprietary | 0.5791 | 0.5791 | 0.6708 | 0.6649 | 0.3461 | 0.6341 | 0.5795 | 0.3809 | 0.6064 | 0.4635 | 0.6182 | 0.7324 | 0.6568 | 0.5738 | 0.8076 | 0.8646 | 0.3806 | 3.00 | 15.00 |
| Seed 2.0 Mini VLM Proprietary | 0.5221 | 0.5221 | 0.5663 | 0.5928 | 0.5013 | 0.5326 | 0.4173 | 0.5695 | 0.5220 | 0.4067 | 0.5074 | 0.6967 | 0.5719 | 0.4882 | 0.8648 | 0.7324 | 0.2083 | 0.10 | 0.40 |
| Claude Opus 4.6 VLM Proprietary | 0.5590 | 0.5590 | 0.6571 | 0.6759 | 0.2823 | 0.5700 | 0.6099 | 0.2579 | 0.5905 | 0.4512 | 0.6110 | 0.6884 | 0.6242 | 0.5578 | 0.7852 | 0.8620 | 0.2860 | 5.00 | 25.00 |
| Seed 2.0 Pro VLM Proprietary | 0.5300 | 0.5300 | 0.5834 | 0.6921 | 0.4448 | 0.4795 | 0.4504 | 0.4309 | 0.5160 | 0.3857 | 0.5285 | 0.7038 | 0.6104 | 0.4760 | 0.8650 | 0.8314 | 0.2703 | 0.47 | 2.37 |
| Qwen3.5-397B (thinking) 397B VLM Open Source | 0.5452 | 0.5452 | 0.6111 | 0.6739 | 0.4912 | 0.4548 | 0.4951 | 0.1950 | 0.5444 | 0.4375 | 0.5711 | 0.6387 | 0.6089 | 0.4968 | 0.8072 | 0.8283 | 0.2826 | 0.60 | 3.60 |
| Gemini 2.5 Flash VLM Proprietary | 0.5260 | 0.5260 | 0.6019 | 0.6280 | 0.3612 | 0.4876 | 0.5514 | 0.1352 | 0.5238 | 0.3388 | 0.5678 | 0.6632 | 0.5933 | 0.4946 | 0.7361 | 0.8466 | 0.2777 | 0.30 | 2.50 |
| Qwen3.5-122B 122B VLM Open Source | 0.5386 | 0.5386 | 0.6181 | 0.6829 | 0.4321 | 0.4451 | 0.5150 | 0.1827 | 0.5502 | 0.4639 | 0.5653 | 0.6326 | 0.6179 | 0.4880 | 0.8292 | 0.8529 | 0.2792 | 0.40 | 3.20 |
| Qwen3.5-397B 397B VLM Open Source | 0.5261 | 0.5261 | 0.5899 | 0.6673 | 0.4787 | 0.4065 | 0.4881 | 0.1957 | 0.5216 | 0.4189 | 0.5453 | 0.6307 | 0.5964 | 0.4743 | 0.7949 | 0.8310 | 0.2698 | 0.60 | 3.60 |
| Seed 2.0 Lite VLM Proprietary | 0.4798 | 0.4798 | 0.5496 | 0.6361 | 0.3388 | 0.4753 | 0.3991 | 0.4287 | 0.4915 | 0.3606 | 0.4997 | 0.6577 | 0.5718 | 0.4521 | 0.8026 | 0.8220 | 0.2091 | 0.09 | 0.53 |
| Qwen3.5-35B 35B VLM Open Source | 0.5367 | 0.5367 | 0.5955 | 0.6737 | 0.4841 | 0.3780 | 0.5521 | 0.0690 | 0.5409 | 0.4638 | 0.5492 | 0.5927 | 0.5826 | 0.4823 | 0.7792 | 0.8047 | 0.2307 | 0.25 | 2.00 |
| Qwen3.5-Plus VLM Proprietary | 0.5205 | 0.5205 | 0.6303 | 0.6582 | 0.3997 | 0.4647 | 0.4494 | 0.1498 | 0.5491 | 0.4429 | 0.5736 | 0.6296 | 0.6062 | 0.4952 | 0.7976 | 0.8418 | 0.2684 | 0.80 | 2.00 |
| Qwen3.5-Flash VLM Open Source | 0.5342 | 0.5342 | 0.5930 | 0.6731 | 0.4865 | 0.3770 | 0.5416 | 0.0690 | 0.5395 | 0.4635 | 0.5461 | 0.5924 | 0.5821 | 0.4808 | 0.7750 | 0.8142 | 0.2251 | 0.10 | 0.40 |
| Qwen3-VL-235B 235B VLM Open Source | 0.5285 | 0.5285 | 0.6368 | 0.7218 | 0.3922 | 0.4066 | 0.4853 | 0.0976 | 0.5555 | 0.4760 | 0.5799 | 0.6138 | 0.6237 | 0.4934 | 0.7958 | 0.8817 | 0.2978 | 0.20 | 0.88 |
| Qwen3.5-9B (no reasoning) 9B VLM Open Source | 0.5149 | 0.5149 | 0.5972 | 0.7052 | 0.4288 | 0.3622 | 0.4812 | 0.0870 | 0.5273 | 0.4436 | 0.5470 | 0.5959 | 0.5930 | 0.4658 | 0.7838 | 0.8463 | 0.2506 | / | / |
| ERNIE 4.5 VL 424B 424B VLM Proprietary | 0.5016 | 0.5016 | 0.6435 | 0.6928 | 0.4044 | 0.2192 | 0.5481 | 0.0576 | 0.5556 | 0.4744 | 0.5866 | 0.5613 | 0.5802 | 0.5032 | 0.7246 | 0.8242 | 0.2327 | 0.42 | 1.25 |
| GPT-5.4 VLM Proprietary | 0.5026 | 0.5026 | 0.6277 | 0.7608 | 0.2120 | 0.4003 | 0.5124 | 0.0969 | 0.5165 | 0.3357 | 0.5844 | 0.6319 | 0.5746 | 0.5000 | 0.7649 | 0.8167 | 0.2331 | 2.50 | 15.00 |
| Qwen3.5-27B 27B VLM Open Source | 0.5162 | 0.5162 | 0.6080 | 0.6913 | 0.4533 | 0.3577 | 0.4705 | 0.0131 | 0.5299 | 0.4402 | 0.5574 | 0.5863 | 0.6082 | 0.4553 | 0.7902 | 0.8621 | 0.2903 | 0.30 | 2.40 |
| Qwen3-VL-8B 8B VLM Open Source | 0.5049 | 0.5049 | 0.5396 | 0.8490 | 0.3524 | 0.3390 | 0.4445 | 0.0856 | 0.4422 | 0.3718 | 0.4771 | 0.5851 | 0.5766 | 0.3941 | 0.8176 | 0.8530 | 0.2513 | 0.08 | 0.50 |
| Qwen3-VL-30B 30B VLM Open Source | 0.4848 | 0.4848 | 0.5953 | 0.7066 | 0.4107 | 0.2800 | 0.4316 | 0.1088 | 0.5111 | 0.4380 | 0.5441 | 0.5650 | 0.5843 | 0.4507 | 0.7529 | 0.8528 | 0.2570 | 0.13 | 0.52 |
| Qwen3.5-4B (no reasoning) 4B VLM Open Source | 0.4708 | 0.4708 | 0.5846 | 0.6837 | 0.3981 | 0.2815 | 0.4059 | 0.0857 | 0.4980 | 0.4048 | 0.5271 | 0.5698 | 0.5732 | 0.4372 | 0.7472 | 0.8279 | 0.2538 | / | / |
| GPT-5.2 (high) VLM Proprietary | 0.4320 | 0.4320 | 0.5834 | 0.6594 | 0.0823 | 0.3493 | 0.4858 | 0.0839 | 0.4770 | 0.2951 | 0.5450 | 0.5833 | 0.5486 | 0.4467 | 0.6697 | 0.8264 | 0.2195 | 1.75 | 14.00 |
| ERNIE 4.5 VL 28B 28B VLM Proprietary | 0.4697 | 0.4697 | 0.5166 | 0.7791 | 0.3545 | 0.2767 | 0.4216 | 0.0619 | 0.4138 | 0.3437 | 0.4518 | 0.5560 | 0.5444 | 0.3677 | 0.8017 | 0.7952 | 0.2251 | 0.14 | 0.56 |
| Qwen3.5-2B (no reasoning) 2B VLM Open Source | 0.4502 | 0.4502 | 0.6182 | 0.6202 | 0.3731 | 0.2448 | 0.3948 | 0.0484 | 0.4766 | 0.3799 | 0.5760 | 0.5161 | 0.5951 | 0.4157 | 0.7189 | 0.8440 | 0.3788 | / | / |
| Sarvam Vision VLM Proprietary | 0.4533 | 0.4533 | 0.5746 | 0.4961 | 0.5608 | 0.1437 | 0.4912 | 0.0438 | 0.4785 | 0.4116 | 0.5095 | 0.5379 | 0.5367 | 0.4363 | 0.6899 | 0.7996 | 0.2181 | / | / |
| Llama 4 Maverick 17Bx128E VLM Open Source | 0.3827 | 0.3827 | 0.5484 | 0.4513 | 0.1262 | 0.4344 | 0.3533 | 0.3066 | 0.4350 | 0.2615 | 0.5102 | 0.5740 | 0.5383 | 0.3998 | 0.6849 | 0.8031 | 0.2665 | 0.15 | 0.60 |
| dots.ocr VLM Open Source | 0.4444 | 0.4444 | 0.5866 | 0.6893 | 0.4162 | 0.2657 | 0.2643 | 0.0438 | 0.4974 | 0.4834 | 0.5066 | 0.5329 | 0.5556 | 0.4328 | 0.7678 | 0.8259 | 0.2145 | / | / |
| OlmOCR-2 7B VLM Open Source | 0.4083 | 0.4083 | 0.6170 | 0.6604 | 0.1519 | 0.2531 | 0.3589 | 0.0632 | 0.4868 | 0.3437 | 0.5558 | 0.5297 | 0.5282 | 0.4488 | 0.6704 | 0.7390 | 0.2195 | / | / |
| GPT-5.2 (low) VLM Proprietary | 0.4000 | 0.4000 | 0.5640 | 0.6379 | 0.0816 | 0.2801 | 0.4366 | 0.0187 | 0.4495 | 0.2591 | 0.5166 | 0.5553 | 0.5223 | 0.4189 | 0.6228 | 0.8011 | 0.2083 | 1.75 | 14.00 |
| Qwen-VL-OCR VLM Proprietary | 0.4103 | 0.4103 | 0.5620 | 0.6426 | 0.3564 | 0.1609 | 0.3294 | 0.0085 | 0.4562 | 0.3698 | 0.5010 | 0.4999 | 0.5186 | 0.3972 | 0.7102 | 0.7663 | 0.1660 | 0.07 | 0.16 |
| Qwen3.5-0.8B (no reasoning) 0.9B VLM Open Source | 0.3787 | 0.3787 | 0.5607 | 0.6072 | 0.2922 | 0.1575 | 0.2759 | 0.0388 | 0.4369 | 0.3458 | 0.5061 | 0.4617 | 0.4836 | 0.3887 | 0.6794 | 0.6650 | 0.1857 | / | / |
| GPT-5.3 Codex VLM Proprietary | 0.3459 | 0.3459 | 0.5164 | 0.6209 | 0.0751 | 0.2527 | 0.2643 | 0.0000 | 0.3956 | 0.1736 | 0.4717 | 0.5333 | 0.4995 | 0.3662 | 0.5521 | 0.8357 | 0.2000 | 1.75 | 14.00 |
| PaddleOCR-VL-1.5 VLM Open Source | 0.3097 | 0.3097 | 0.4845 | 0.4084 | 0.1493 | 0.1717 | 0.3347 | 0.0248 | 0.3688 | 0.2394 | 0.4318 | 0.3724 | 0.3284 | 0.3491 | 0.5913 | 0.2006 | 0.1280 | / | / |
| GLM-OCR 0.9B VLM Open Source | 0.2683 | 0.2683 | 0.5421 | 0.4205 | 0.0412 | 0.1746 | 0.1629 | 0.0504 | 0.3848 | 0.2346 | 0.4754 | 0.3641 | 0.3402 | 0.3662 | 0.5810 | 0.2202 | 0.1492 | 0.03 | 0.03 |
| FireRed-OCR 2B VLM Open Source | 0.2839 | 0.2839 | 0.5020 | 0.5785 | 0.0070 | 0.1078 | 0.2243 | 0.0082 | 0.3483 | 0.2041 | 0.4581 | 0.3563 | 0.3464 | 0.3266 | 0.5826 | 0.3131 | 0.1292 | / | / |
| Mistral OCR VLM Proprietary | 0.2816 | 0.2816 | 0.4644 | 0.4102 | 0.0356 | 0.1263 | 0.3713 | 0.0615 | 0.3205 | 0.1429 | 0.4526 | 0.4474 | 0.4668 | 0.2887 | 0.5483 | 0.8719 | 0.1809 | 1.00 | 1.00 |
| PaddleOCR-VL VLM Open Source | 0.2420 | 0.2420 | 0.3731 | 0.3815 | 0.0114 | 0.1036 | 0.3402 | 0.0448 | 0.2768 | 0.1299 | 0.3078 | 0.3657 | 0.2685 | 0.2842 | 0.4066 | 0.2336 | 0.1294 | / | / |
| Gemma 3 27B 27B VLM Open Source | 0.2761 | 0.2761 | 0.4593 | 0.3059 | 0.0943 | 0.2070 | 0.3139 | 0.0790 | 0.3229 | 0.1933 | 0.4050 | 0.4395 | 0.4440 | 0.2931 | 0.5009 | 0.7934 | 0.2201 | 0.20 | 0.40 |
| PP-OCRv5 Classical OCR Open Source | 0.2383 | 0.2383 | 0.4748 | 0.6559 | 0.0016 | 0.0348 | 0.0243 | 0.0296 | 0.3197 | 0.1596 | 0.4529 | 0.3930 | 0.4563 | 0.2669 | 0.5575 | 0.7961 | 0.1905 | / | / |
| Nemotron-Nano-12B 12B VLM Open Source | 0.2357 | 0.2357 | 0.4568 | 0.4845 | 0.0370 | 0.1029 | 0.0974 | 0.0160 | 0.2950 | 0.1036 | 0.4339 | 0.4087 | 0.4401 | 0.2542 | 0.5510 | 0.7066 | 0.2291 | 0.20 | 0.20 |
| Gemma 3 4B 4B VLM Open Source | 0.2428 | 0.2428 | 0.4442 | 0.2563 | 0.0643 | 0.1439 | 0.3055 | 0.0608 | 0.2962 | 0.1766 | 0.3905 | 0.3896 | 0.3987 | 0.2735 | 0.4705 | 0.6860 | 0.1974 | 0.07 | 0.14 |
| PaddleOCR-VL-0.9B 0.9B VLM Open Source | 0.1824 | 0.1824 | 0.3663 | 0.3914 | 0.0201 | 0.0415 | 0.0925 | 0.0338 | 0.2353 | 0.0814 | 0.3588 | 0.3019 | 0.3162 | 0.1987 | 0.5310 | 0.3901 | 0.1187 | / | / |
| DeepSeek-OCR 1.3B VLM Open Source | 0.1883 | 0.1883 | 0.2574 | 0.4116 | 0.1256 | 0.0670 | 0.0797 | 0.0254 | 0.1970 | 0.0991 | 0.2278 | 0.2725 | 0.2484 | 0.1768 | 0.3228 | 0.3155 | 0.1640 | 0.03 | 0.03 |
| DeepSeek-OCR2 VLM Open Source | 0.2412 | 0.2412 | 0.3680 | 0.5368 | 0.0739 | 0.0625 | 0.1649 | 0.0370 | 0.1823 | 0.0980 | 0.2878 | 0.3466 | 0.3319 | 0.1670 | 0.5477 | 0.5290 | 0.1293 | 0.03 | 0.03 |
| Kimi K2.5 VLM Proprietary | 0.1231 | 0.1231 | 0.2094 | 0.3004 | 0.0323 | 0.0497 | 0.0238 | 0.0000 | 0.1365 | 0.0347 | 0.1414 | 0.2715 | 0.1821 | 0.1448 | 0.2229 | 0.3720 | 0.0168 | 0.40 | 1.90 |
| LayoutParser Classical OCR Open Source | 0.1029 | 0.1029 | 0.3033 | 0.0510 | 0.0220 | 0.0536 | 0.0848 | 0.0732 | 0.1677 | 0.0892 | 0.2705 | 0.1650 | 0.2027 | 0.1372 | 0.3988 | 0.1187 | 0.1462 | / | / |
| Tesseract v5 Classical OCR Open Source | 0.1030 | 0.1030 | 0.3036 | 0.0510 | 0.0220 | 0.0536 | 0.0848 | 0.0732 | 0.1673 | 0.0892 | 0.2709 | 0.1627 | 0.2004 | 0.1372 | 0.3988 | 0.1081 | 0.1479 | / | / |
| Molmo 2 8B 8B VLM Open Source | 0.1162 | 0.1162 | 0.3275 | 0.1036 | 0.0188 | 0.0632 | 0.0681 | 0.0179 | 0.1863 | 0.0980 | 0.2757 | 0.2292 | 0.2544 | 0.1668 | 0.3405 | 0.3728 | 0.1495 | / | / |
| EfficientOCR Classical OCR Open Source | 0.0400 | 0.0400 | 0.0979 | 0.0186 | 0.0168 | 0.0292 | 0.0374 | 0.0213 | 0.0568 | 0.0294 | 0.1057 | 0.0474 | 0.0806 | 0.0438 | 0.1236 | 0.0628 | 0.0944 | / | / |
| Model | Overall | HW [A] | HW [B] | Tables [A] | Tables [B] | $/M In | $/M Out |
|---|---|---|---|---|---|---|---|
| Gemini 3.1 Pro (low) VLM Proprietary | 0.5965 | 0.6959 | 0.5260 | 0.9636 | 0.3640 | 2.00 | 12.00 |
| Gemini 3 Flash (low) VLM Proprietary | 0.5522 | 0.6369 | 0.5153 | 0.9717 | 0.2737 | 0.50 | 3.00 |
| Gemini 3 Pro (low) VLM Proprietary | 0.5472 | 0.6445 | 0.5289 | 0.8936 | 0.2667 | 2.00 | 12.00 |
| Gemini 3 Pro (high) VLM Proprietary | 0.5406 | 0.6589 | 0.5106 | 0.8989 | 0.2400 | 2.00 | 12.00 |
| Gemini 2.0 Flash VLM Proprietary | 0.5100 | 0.5849 | 0.4312 | 0.9471 | 0.2850 | 0.10 | 0.40 |
| Gemini 3 Flash (high) VLM Proprietary | 0.4808 | 0.6049 | 0.4750 | 0.7551 | 0.1862 | 0.50 | 3.00 |
| Gemini 2.5 Flash VLM Proprietary | 0.4670 | 0.5390 | 0.3689 | 0.9107 | 0.2650 | 0.30 | 2.50 |
| Qwen3.5-397B (thinking) 397B VLM Open Source | 0.4598 | 0.6128 | 0.2734 | 0.9016 | 0.2552 | 0.60 | 3.60 |
| Qwen3-VL-235B 235B VLM Open Source | 0.4581 | 0.6174 | 0.1969 | 0.9398 | 0.3132 | 0.20 | 0.88 |
| Qwen3.5-397B 397B VLM Open Source | 0.4484 | 0.6055 | 0.2409 | 0.9301 | 0.2432 | 0.60 | 3.60 |
| Qwen3.5-27B 27B VLM Open Source | 0.4340 | 0.5733 | 0.1976 | 0.9368 | 0.2760 | 0.30 | 2.40 |
| Qwen3.5-35B 35B VLM Open Source | 0.4296 | 0.5784 | 0.2273 | 0.9275 | 0.2210 | 0.25 | 2.00 |
| Qwen3.5-Flash VLM Proprietary | 0.4280 | 0.5763 | 0.2247 | 0.9247 | 0.2220 | 0.10 | 0.40 |
| Qwen3.5-122B 122B VLM Open Source | 0.4264 | 0.5637 | 0.1954 | 0.9195 | 0.2692 | 0.40 | 3.20 |
| Qwen3-VL-30B 30B VLM Open Source | 0.4194 | 0.5716 | 0.1699 | 0.9104 | 0.2653 | 0.13 | 0.52 |
| Qwen3-VL-8B 8B VLM Open Source | 0.4105 | 0.5705 | 0.1637 | 0.8456 | 0.2709 | 0.08 | 0.50 |
| ERNIE 4.5 VL 424B 424B VLM Proprietary | 0.4057 | 0.5468 | 0.1896 | 0.9062 | 0.2222 | 0.55 | 2.20 |
| Seed 2.0 Mini VLM Proprietary | 0.4882 | 0.5695 | 0.2083 | 0.8648 | 0.4023 | 0.10 | 0.40 |
| Seed 2.0 Pro VLM Proprietary | 0.4058 | 0.4582 | 0.2601 | 0.9513 | 0.2350 | 0.47 | 2.37 |
| GPT-5.2 (high) VLM Proprietary | 0.3932 | 0.5190 | 0.1959 | 0.8951 | 0.2074 | 1.75 | 14.00 |
| Llama 4 Maverick 400B VLM Open Source | 0.3911 | 0.4563 | 0.1884 | 0.9358 | 0.2704 | 0.15 | 0.60 |
| ERNIE 4.5 VL 28B 28B VLM Proprietary | 0.3837 | 0.5256 | 0.1537 | 0.8735 | 0.2212 | 0.55 | 2.20 |
| GPT-5.2 (low) VLM Proprietary | 0.3759 | 0.5140 | 0.1693 | 0.8445 | 0.2014 | 1.75 | 14.00 |
| Seed 2.0 Lite VLM Proprietary | 0.3724 | 0.4653 | 0.1918 | 0.9013 | 0.1970 | 0.09 | 0.53 |
| dots.ocr-1.5 3B VLM Open Source | 0.3720 | 0.5670 | 0.1423 | 0.7768 | 0.1810 | / | / |
| Qwen-VL-OCR VLM Proprietary | 0.3525 | 0.5082 | 0.1105 | 0.8696 | 0.1720 | 0.07 | 0.16 |
| Datalab (accurate) VLM Proprietary | 0.3477 | 0.3575 | 0.1854 | 0.9260 | 0.2363 | ? | ? |
| dots.ocr 3B VLM Open Source | 0.3328 | 0.4193 | 0.0751 | 0.9150 | 0.2299 | / | / |
| GLM-OCR 0.8B VLM Open Source | 0.3247 | 0.4369 | 0.0693 | 0.8200 | 0.2283 | 0.03 | 0.03 |
| PaddleOCR-VL-1.5 0.9B VLM Open Source | 0.3131 | 0.4066 | 0.1155 | 0.7949 | 0.1805 | / | / |
| Mistral OCR VLM Proprietary | 0.3006 | 0.3103 | 0.1063 | 0.9760 | 0.1784 | $2/1K pages | |
| Datalab (fast) Classical OCR Proprietary | 0.2975 | 0.3031 | 0.0837 | 0.9169 | 0.2317 | ? | ? |
| PaddleOCR-VL-0.9B 0.9B VLM Open Source | 0.2864 | 0.3231 | 0.0856 | 0.8047 | 0.2145 | / | / |
| Sarvam Vision VLM Proprietary | 0.2735 | 0.2851 | 0.1899 | 0.7194 | 0.1335 | / | / |
| FireRed-OCR 2B VLM Open Source | 0.2585 | 0.3266 | 0.1292 | 0.5826 | 0.3131 | / | / |
| PP-OCRv5 Classical OCR Open Source | 0.2501 | 0.3400 | 0.0069 | 0.7319 | 0.1759 | / | / |
| Nemotron-Nano-12B 12B VLM Open Source | 0.2177 | 0.2568 | 0.0127 | 0.6273 | 0.2020 | 0.20 | 0.20 |
| OlmOCR-2 7B VLM Open Source | 0.2087 | 0.3405 | 0.0696 | 0.2791 | 0.1624 | / | / |
| LightOnOCR-2-1B 1B VLM Open Source | 0.2024 | 0.3138 | 0.1005 | 0.4828 | 0.0345 | / | / |
| DeepSeek-OCR 3B VLM Open Source | 0.1997 | 0.2534 | 0.0336 | 0.5004 | 0.1732 | 0.03 | 0.03 |
| Kimi K2.5 VLM Proprietary | 0.1336 | 0.2081 | 0.0415 | 0.3944 | 0.0129 | 0.40 | 1.90 |
| DeepSeek-OCR2 3B VLM Open Source | 0.1181 | 0.0995 | 0.0303 | 0.3715 | 0.1191 | 0.03 | 0.03 |
| Tesseract Classical OCR Open Source | 0.1091 | 0.1037 | 0.0285 | 0.1696 | 0.1808 | / | / |
| Layout Parser Classical OCR Open Source | 0.1082 | 0.1037 | 0.0285 | 0.1609 | 0.1815 | / | / |
| EfficientOCR Classical OCR Open Source | 0.0635 | 0.0509 | 0.0108 | 0.1104 | 0.1187 | / | / |
API Cost Note: Pricing shown per million tokens ($/M In = input cost, $/M Out = output cost). "/" indicates free/open-source models. "?" indicates proprietary pricing (contact vendor). Costs are approximate and may vary by provider or region. Data collected February 2026.
Cost vs Performance
Average token cost = (input + output price per 1M tokens) / 2. Only models with published API pricing are included. Log scale on x-axis.
[Figure: cost vs. performance scatter plot. Legend groups: Google, Anthropic, OpenAI, Qwen, Mistral, Other, Traditional OCR.]
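As a quick illustration of the average token cost used for the chart's x-axis, here is a minimal sketch. The function name `average_token_cost` is made up for this example; the prices are taken from the leaderboard rows above.

```python
def average_token_cost(input_price: float, output_price: float) -> float:
    """Mean of the input and output price, both in USD per 1M tokens."""
    return (input_price + output_price) / 2


# Prices from the leaderboard rows above ($/M In, $/M Out).
gemini_31_pro = average_token_cost(2.00, 12.00)
gemini_20_flash = average_token_cost(0.10, 0.40)
print(gemini_31_pro, gemini_20_flash)
```

Models listed with "/" or "?" have no published per-token price, so they would simply be left out before plotting.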
Methodology & Reproduction
All VLMs receive the same prompts. For handwriting samples:
"Transcribe all the text in this image exactly as written. Output ONLY the transcribed text, nothing else."
For table samples:
"OCR this document image into a markdown table. Transcribe all text exactly as written. Output ONLY the markdown table, nothing else."
Image-only models (dots.ocr, OlmOCR-2) receive no text prompt. dots.ocr-1.5 uses its native prompt_ocr prompt ("Extract the text content from this image."). DeepSeek-OCR2 uses its native prompts ("Free OCR." for handwriting, "Convert the document to markdown." for tables).
The metric is Normalized Edit Similarity (NES):
1 - edit_distance(pred, gt) / max(len(pred), len(gt)).
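The NES formula above can be sketched in a few lines of Python. This is an illustration, not the benchmark's scoring code: it assumes character-level Levenshtein distance with no text normalization, and it treats two empty strings as a perfect match (the formula itself is undefined in that case).

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance, computed with two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def nes(pred: str, gt: str) -> float:
    """Normalized Edit Similarity: 1 - distance / length of the longer string."""
    longest = max(len(pred), len(gt))
    if longest == 0:
        return 1.0  # assumption: empty prediction vs. empty ground truth is perfect
    return 1 - edit_distance(pred, gt) / longest


print(round(nes("kitten", "sitting"), 4))
```

Because the denominator is the longer of the two strings, the score always stays in [0, 1], and an empty prediction against non-empty ground truth scores exactly 0.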
v2 Scoring
The headline socOCRbench score is a region macro-average: the mean of per-region dataset means, so that each region counts equally regardless of sample count. Regions: Europe, East Asia, South Asia, Southeast Asia, MENA, East Africa. Period and format breakdowns are shown as supplementary columns. Periods: Pre-modern (<1700), Historical (1700–1950), Contemporary (post-1950). Formats: Handwritten text, Printed text, Printed tables, Handwritten tables.
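A rough sketch of the region macro-average follows. It assumes that "per-region dataset means" means averaging samples within each dataset, then datasets within each region, then regions. The tuple layout, the dataset names, and the scores are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def region_macro_average(samples):
    """samples: iterable of (region, dataset, score) tuples (hypothetical layout)."""
    by_region = defaultdict(lambda: defaultdict(list))
    for region, dataset, score in samples:
        by_region[region][dataset].append(score)

    # Average samples within a dataset, then datasets within a region.
    region_means = {
        region: mean(mean(scores) for scores in datasets.values())
        for region, datasets in by_region.items()
    }
    # Every region contributes equally to the headline number.
    return mean(region_means.values()), region_means


# Made-up scores: Europe has three samples, MENA only one.
samples = [
    ("Europe", "ds_a", 0.8),
    ("Europe", "ds_a", 0.6),
    ("Europe", "ds_b", 0.4),
    ("MENA", "ds_c", 0.2),
]
headline, per_region = region_macro_average(samples)
print(headline, per_region)
```

With these numbers a plain sample mean would be 0.5, while the macro-average is 0.375: the single MENA sample carries as much weight as all three Europe samples combined.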
v1 Scoring
A simpler weighted average across HW [A], HW [B], Tables [A], and Tables [B] categories.
| Model | Provider | Model ID / Notes |
|---|---|---|
| Gemini 3.1 Pro (low) | OpenRouter | google/gemini-3.1-pro-preview, reasoning_effort=low |
| Gemini 3 Pro (low) | OpenRouter | google/gemini-3-pro-preview, reasoning_effort=low |
| Gemini 3 Pro (high) | OpenRouter | google/gemini-3-pro-preview, reasoning_effort=high |
| Gemini 2.5 Flash | Google GenAI SDK | gemini-2.5-flash via genai.Client |
| Gemini 2.0 Flash | OpenRouter | google/gemini-2.0-flash-001 |
| Gemini 3 Flash (low) | OpenRouter | google/gemini-3-flash-preview, reasoning_effort=low |
| Gemini 3 Flash (minimal) | OpenRouter | google/gemini-3-flash-preview, reasoning_effort=minimal |
| Gemini 3 Flash (high) | OpenRouter | google/gemini-3-flash-preview, reasoning_effort=high |
| GPT-5.3 Codex | OpenRouter | openai/gpt-5.3-codex |
| GPT-5.2 (low) | OpenRouter | openai/gpt-5.2, reasoning_effort=low |
| GPT-5.2 (high) | OpenRouter | openai/gpt-5.2, reasoning_effort=high |
| Claude Sonnet 4.6 | OpenRouter | anthropic/claude-sonnet-4.6 |
| Claude Opus 4.6 | OpenRouter | anthropic/claude-opus-4.6 |
| Qwen3-VL-235B | OpenRouter | qwen/qwen3-vl-235b-a22b-instruct |
| Qwen3-VL-30B | DeepInfra | Qwen/Qwen3-VL-30B-A3B-Instruct |
| Seed 2.0 Pro | ZenMux | volcengine/doubao-seed-2.0-pro |
| Seed 2.0 Lite | ZenMux | volcengine/doubao-seed-2.0-lite |
| Seed 2.0 Mini | OpenRouter | bytedance/seed-2.0-mini |
| dots.ocr-1.5 | Local (vLLM) | rednote-hilab/dots.ocr-1.5, native prompt_ocr |
| dots.ocr | Replicate | sljeff/dots.ocr, image-only input |
| GLM-OCR | Zhipu AI | glm-ocr via layout parsing API |
| PaddleOCR-VL-1.5 | PaddlePaddle Cloud | Layout parsing API (aistudio-app.com) |
| PaddleOCR-VL-0.9B | DeepInfra | PaddlePaddle/PaddleOCR-VL-0.9B |
| Nemotron-Nano-12B | DeepInfra | nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL |
| OlmOCR-2 | DeepInfra | allenai/olmOCR-2-7B-1025, image-only input |
| LightOnOCR-2-1B | Local (transformers) | lightonai/LightOnOCR-2-1B, bfloat16, CUDA |
| LightOnOCR-2-1B | Local (Ollama) | maternion/LightOnOCR-2, Q4 GGUF via Ollama |
| Qwen3.5-2B (no reasoning) | Local (Ollama) | qwen3.5:2b, Q4 GGUF via Ollama, thinking disabled |
| FireRed-OCR | HF Spaces (T4) | FireRedTeam/FireRed-OCR, Gradio API |
| DeepSeek-OCR | DeepInfra | deepseek-ai/DeepSeek-OCR |
| DeepSeek-OCR2 | Novita AI | deepseek/deepseek-ocr-2, native prompts |
| Qwen3-VL-8B | Novita AI | qwen/qwen3-vl-8b-instruct |
| Llama 4 Maverick | Novita AI | meta-llama/llama-4-maverick-17b-128e-instruct-fp8 |
| ERNIE 4.5 VL 424B | Novita AI | baidu/ernie-4.5-vl-424b-a47b |
| ERNIE 4.5 VL 28B | Novita AI | baidu/ernie-4.5-vl-28b-a3b |
| Sarvam Vision | Sarvam AI | Document Intelligence API, output_format=md |
| PP-OCRv5 | PaddlePaddle Cloud | OCR API (aistudio-app.com) |
| Tesseract | Local | Tesseract 5, eng+fra+deu+nor, psm 6 |
| Layout Parser | Local | Detectron2 (PubLayNet Faster R-CNN) + Tesseract |
| EfficientOCR | Local | Tiled word-level FAISS KNN recognition |
| Qwen3.5-397B | Novita AI / OpenRouter | qwen/qwen3.5-397b-a17b, thinking disabled |
| Qwen3.5-397B (thinking) | Novita AI / OpenRouter | qwen/qwen3.5-397b-a17b, thinking enabled, max_tokens=32768 |
| Qwen3.5-27B | Alibaba Cloud Model Studio | qwen3.5-27b, thinking disabled |
| Qwen3.5-35B | Alibaba Cloud Model Studio | qwen3.5-35b-a3b, thinking disabled |
| Qwen3.5-Flash | Alibaba Cloud Model Studio | qwen3.5-flash, thinking disabled |
| Qwen3.5-122B | Alibaba Cloud Model Studio | qwen3.5-122b-a10b, thinking disabled |
| Qwen-VL-OCR | Alibaba Cloud Model Studio | qwen-vl-ocr, min_pixels=3072, max_pixels=8388608 |
| Kimi K2.5 | OpenRouter | moonshotai/kimi-k2.5 |
| Datalab (fast) | Datalab API | datalab-python-sdk, mode=fast |
| Datalab (balanced) | Datalab API | datalab-python-sdk, mode=balanced |
| Datalab (accurate) | Datalab API | datalab-python-sdk, mode=accurate |
| Infinity-Parser2 Flash / Pro | HF Space (infly) | infly/Infinity-Parser2-Demo via gradio_client, doc2json task |
| MinerU2.5-Pro | Local (A100) | opendatalab/MinerU2.5-Pro-2605-1.2B via vLLM |
| Mistral OCR | Mistral AI | mistral-ocr-latest via mistralai SDK, image-only input |
| Mistral OCR 4 | Mistral AI | mistral-ocr-4-0 via mistralai SDK, image-only input |
| Mistral Small 2603 | OpenRouter | mistralai/mistral-small-2603 |
| GPT-5.4 Mini (auto res., no reason.) | OpenAI | gpt-5.4-mini, detail=auto, reasoning_effort=none |
| GPT-5.4 Nano (auto res., no reason.) | OpenAI | gpt-5.4-nano, detail=auto, reasoning_effort=none |
| MiniMax M3 | OpenRouter | minimax/minimax-m3 |
| Qwen3.7 Plus | OpenRouter | qwen/qwen3.7-plus |
| Step 3.7 Flash (medium) | OpenRouter | stepfun/step-3.7-flash, reasoning mandatory (default medium), max_tokens up to 65536 |
Changelog
June 24, 2026: Added MinerU2.5-Pro (0.1647), a 1.2B local parser that
tops OmniDocBench v1.6 but handles only Latin and CJK print, returning almost nothing on
the non-Latin historical scripts that fill much of this benchmark. Corrected Kimi K2.5
(0.0969 to 0.4775): its old score was an artifact of provider timeouts that blanked 229 of
280 pages; re-running with throughput routing and reasoning off filled every page. The same
routing fix raised Qwen3.6 27B (0.2810 to 0.3949), though it still returns blank on many
non-Latin pages. Added an
eval step that strips structural markup tokens (e.g. MinerU's <|txt_start|>)
before scoring; no ground-truth text contains them.
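The markup-stripping step described in the June 24 entry might look something like the sketch below. The regular expression assumes structural tokens always take the `<|name|>` form; `<|txt_end|>` is a hypothetical token used only to make the example round-trip.

```python
import re

# Assumption: structural tokens look like <|name|>, as in MinerU's <|txt_start|>.
MARKUP_TOKEN = re.compile(r"<\|[^<>|]+\|>")


def strip_markup_tokens(text: str) -> str:
    """Remove <|...|> tokens and leave all other characters untouched."""
    return MARKUP_TOKEN.sub("", text)


# <|txt_end|> is a made-up closing token for illustration.
cleaned = strip_markup_tokens("<|txt_start|>Anno 1742<|txt_end|>")
print(cleaned)
```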
June 23, 2026: Added Mistral OCR 4 (0.5055) via the Mistral API at $4/1000 pages, a clear gain over the previous Mistral OCR (0.4467), with much stronger tables (TEDS 0.45 vs 0.13). Strong on printed text (0.7159) and the best Mistral result on handwriting to date.
June 18, 2026: Added MiniMax M3 (0.4605) via OpenRouter at $0.30/$1.20 per million tokens. Also hardened the OCR prompt and added an eval-time preamble stripper: some vision models prefix their output with meta-commentary (“This is a Hebrew manuscript page… Here’s my best attempt:”) before the actual transcription, which inflated edit distance. The stripper removes a leading commentary block only when a substantial transcription follows (≥200 chars), so genuine refusals still score near zero. No benchmark ground-truth text begins with a trigger phrase, so it never removes real OCR. Net leaderboard effect was negligible (preambles cluster on the hardest manuscripts, where even cleaned output scores low), but it makes the scoring fair to models that did the work.
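To make the preamble-stripping rule from the June 18 entry concrete, here is a minimal sketch. The trigger phrases are hypothetical (the actual list is not given here), and the sketch assumes the commentary is the first paragraph, separated from the transcription by a blank line.

```python
import re

# Hypothetical trigger phrases; the benchmark's real list is not shown in this post.
TRIGGERS = re.compile(
    r"^\s*(this (is|appears to be|looks like)\b|here(['’]s| is) (my|the)\b)",
    re.IGNORECASE,
)
MIN_TRANSCRIPTION_CHARS = 200  # the ">=200 chars" threshold from the changelog


def strip_preamble(output: str) -> str:
    """Drop a leading commentary paragraph only if real text follows it."""
    head, sep, rest = output.partition("\n\n")
    if not sep or not TRIGGERS.match(head):
        return output  # no separate first paragraph, or no trigger phrase
    if len(rest.strip()) < MIN_TRANSCRIPTION_CHARS:
        return output  # likely a refusal or stub: leave it to score near zero
    return rest.lstrip()


preamble = "This is a Hebrew manuscript page. Here's my best attempt:"
body = "lorem ipsum " * 20  # 240 characters standing in for a transcription
print(strip_preamble(preamble + "\n\n" + body) == body)
```

The length check is what keeps refusals honest: a response that is only commentary, or commentary plus a few words, is returned unchanged and is scored as-is.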
June 11, 2026: Added Qwen3.7 Plus (0.5830) via OpenRouter at $0.32/$1.28 per million tokens, ninth overall and the best Qwen result to date. Added Step 3.7 Flash (0.4668), StepFun's 196B-A11B MoE at $0.20/$1.15, scored at its default (medium) reasoning effort; reasoning cannot be disabled, and half the samples needed a 32K completion budget (~30 needed 64K) to emit any output.
June 8, 2026: Corrected the Infinity-Parser2 scores. Both models had
been run through the infly Gradio Space in doc2md mode, which linearizes
every document to plain prose with no table structure, so their near-zero table
(TEDS) scores were a data-collection artifact, not a model limitation. Re-ran both in
doc2json mode, where table blocks carry <table> HTML that
the scorer reads: Infinity-Parser2 Pro 0.3541 → 0.4592 (TEDS 0.04 → 0.41) and
Flash 0.3040 → 0.3867 (TEDS 0.06 → 0.30). A handful of non-Latin handwriting
scans where the layout detector finds only the stamped folio number still score near
zero; those are genuine model misses.
May 29, 2026: Fixed a TEDS scoring bug that systematically penalized
OCR models which emit tables as HTML <table> rather than markdown pipes:
the parser previously scored their tables near zero even when read correctly. This raised
11 open-source OCR specialists on the table category: dots.ocr 1.5 (+0.145, to 0.478),
FireRed-OCR (+0.137), dots.ocr (+0.134), GLM-OCR (+0.121), PaddleOCR-VL-1.6 (+0.111),
OCRVerse, PaddleOCR-VL-1.5, OlmOCR-2, LightOnOCR-2, Nemotron-3-Nano-Omni, and Qianfan-OCR.
General VLMs were unaffected (they were prompted to output markdown tables).
Added Surya OCR 2 (0.4476), run locally on an RTX 4090 via llama.cpp with quantized GGUF
weights: strong on printed text (0.7180), weak on handwritten tables and right-to-left
scripts. Added PaddleOCR-VL-1.6 (0.3944) via the Baidu AIStudio job API; handles printed
text (0.6584) far better than non-Latin scripts.
Added Infinity-Parser2 Pro (0.3541) and Flash (0.3040), run through the public infly
Gradio Space; both read printed text well (Pro 0.7321) but linearize handwritten tables
in markdown mode, so their table (TEDS) scores are near zero.
Re-ran Datalab on a new subscription key and added the balanced mode: fast 0.5124,
balanced 0.5167, accurate 0.5213, marginally below the prior key (fast 0.5374,
accurate 0.5391) across all modes.
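The May 29 TEDS fix comes down to reading HTML and markdown-pipe tables into the same structure before comparing them. The sketch below shows only that parsing step, not TEDS itself, and it is not the benchmark's parser. It assumes simple tables without merged cells or escaped pipes, and all function names are invented for this example.

```python
import re
from html.parser import HTMLParser

SEPARATOR_CELL = re.compile(r":?-{3,}:?")  # e.g. ---, :---, ---:


class _TableRows(HTMLParser):
    """Collects the text of <td>/<th> cells, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if not self.rows:
                self.rows.append([])  # tolerate a cell with no enclosing <tr>
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def rows_from_html(text):
    parser = _TableRows()
    parser.feed(text)
    parser.close()
    return parser.rows


def rows_from_markdown(text):
    rows = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(SEPARATOR_CELL.fullmatch(cell) for cell in cells):
            continue  # skip the |---|---| header separator
        rows.append(cells)
    return rows


def table_rows(text):
    """Pick a parser based on whether the output contains a <table> tag."""
    if "<table" in text.lower():
        return rows_from_html(text)
    return rows_from_markdown(text)


html_out = "<table><tr><th>Name</th><th>Age</th></tr><tr><td>Ola</td><td>34</td></tr></table>"
md_out = "| Name | Age |\n|---|---|\n| Ola | 34 |"
print(table_rows(html_out) == table_rows(md_out))
```

Once both formats reduce to the same list of rows, a model is no longer penalized for choosing HTML over pipes.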
May 23, 2026: Added GPT-5.5 with auto resolution + medium reasoning
(0.4609) and high resolution + low reasoning (0.3924). Both clear the best GPT-5.4
variant (0.3424). Added Gemini 3.1 Flash Lite at minimal (0.6214) and low (0.5819)
reasoning, replacing the previous medium-reasoning entry (0.6067). Audited every
OpenRouter-backed model's pricing against the live OpenRouter API: corrected
Claude Opus 4.6 ($15/$75 → $5/$25), Gemini 2.5 Flash ($0.15/$0.60 → $0.30/$2.50),
Kimi K2.5 ($0.60/$2.50 → $0.40/$1.90), Llama 4 Maverick ($0.27/$0.85 → $0.15/$0.60),
Qwen3 VL 30B, Qwen3.5 9B, Qwen3.5 Plus (2026-04-20), and the Qwen3.6 27B/35B/Flash
line. Fixed a long-standing bug in call_openai: reasoning-capable models
(GPT-5 / o3 family) were given only 4096 completion tokens, so medium-reasoning calls
burned the entire budget on hidden reasoning tokens and returned empty output; the
budget now scales to 32K when reasoning is on.
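The call_openai budget fix can be pictured roughly as follows. This sketch does not call any API. The constant names, the helper `completion_budget`, the exact 32,768 value for "32K", and the choice to treat every effort other than "none" as reasoning-on are all assumptions made for illustration.

```python
from typing import Optional

DEFAULT_MAX_COMPLETION_TOKENS = 4096
REASONING_MAX_COMPLETION_TOKENS = 32_768  # "32K"; the exact value is an assumption


def completion_budget(reasoning_effort: Optional[str]) -> int:
    """Return a larger completion budget when hidden reasoning tokens are expected."""
    reasoning_on = reasoning_effort not in (None, "none")
    if reasoning_on:
        return REASONING_MAX_COMPLETION_TOKENS
    return DEFAULT_MAX_COMPLETION_TOKENS


for effort in (None, "none", "low", "medium"):
    print(effort, completion_budget(effort))
```

The point of the fix is that reasoning tokens count against the same completion limit as the visible answer, so a small fixed cap can be exhausted before any transcription is emitted.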
May 21, 2026: Added Gemini 3.5 Flash with minimal (0.6022) and low (0.6095) reasoning, a modest gain over Gemini 3 Flash (0.5920 / 0.5995). Pricing tripled to $1.50/$9.00 (from $0.50/$3.00), so cost-performance is meaningfully worse despite the score bump.
May 14, 2026: Various updates. Audited every model's API pricing against its native source (Google, Anthropic, OpenAI, OpenRouter, DashScope, xAI). Gemini 2.5 Flash $0.30/$2.50 (was $0.15/$0.60, per Google); Claude Opus 4.6 $5/$25 (was $15/$75); Kimi K2.5 $0.40/$1.90 (was $0.60/$2.50); Llama 4 Maverick $0.15/$0.60 (was $0.27/$0.85); Grok 4.2 Fast $0.20/$0.50 (was $2/$10); Qwen3 VL 30B $0.13/$0.52 (was $0.10/$0.40). Filled previously-unknown prices: Qwen3.6 Plus $0.33/$1.95, Mistral Small 2603 $0.15/$0.60, Gemma 4 31B $0.12/$0.37, Gemma 4 26B $0.06/$0.33, GPT-5.4 Mini $0.75/$4.50, GPT-5.4 Nano $0.20/$1.25. Cost-performance chart entries refreshed accordingly.
March 17, 2026: 64 models evaluated. Added Mistral Small 2603 (0.3140), GPT-5.4 Mini (0.4401) and GPT-5.4 Nano (0.2982) with auto resolution and no reasoning, Datalab fast (0.4849) and Datalab accurate (0.4908).
March 15, 2026: 59 models evaluated. Added GPT-5 Mini and GPT-5 Nano with both default (medium) and minimal reasoning variants. Clarified all GPT model names to show resolution and reasoning settings (e.g. "GPT-5.4 (high res., med. reason.)"). Reducing reasoning dramatically improves smaller models: GPT-5 Mini jumps from 0.3866 to 0.5106, GPT-5 Nano from 0.0546 to 0.3045. Added Seed 2.0 Pro (0.5831), dots.ocr (0.4651), Grok 4.2 Fast (0.2433). Added Qwen3.5 9B via OpenRouter (0.4612), Qwen3.5 4B/2B/0.8B via Ollama, and LightOnOCR-2 via Ollama (0.2720). Updated chart to auto-generate from scores. Fixed GPT pricing from official OpenAI docs.
March 12, 2026: Launched v3 with 280 full-page document images across 5 regions (W. Europe, E. Europe, E. Asia, S. Asia, MENA) and 3 formats (HW Text, Print Text, HW Table). 38 models evaluated. Added Mistral OCR, DeepSeek-OCR, DeepSeek-OCR2, OlmOCR-2, PaddleOCR-VL-0.9B, Nemotron-Nano-12B, ERNIE 4.5 VL 28B, and Qwen3 VL 30B. Added batch Gemini pricing to the cost-performance chart. Removed GPT-5.3 Codex (incompatible with chat completions API).
March 5, 2026: Added LightOnOCR-2-1B via Ollama (0.2185) and Qwen3.5-2B (no reasoning) via Ollama (0.3682). Both run locally with quantized GGUF weights. Small images padded to 224px minimum for Ollama compatibility.
March 3, 2026: Added Gemini 3.1 Flash Lite (0.6546), Gemma 3 27B (0.2069), Gemma 3 4B (0.1808), and Nemotron-Nano 12B (0.2071). Re-benchmarked OlmOCR-2 with correct prompt (0.3678, up from 0.223). Re-benchmarked Nemotron-Nano with reasoning disabled (0.2071, up from 0.1359).
March 2, 2026: Added Qwen3.5 small models via HuggingFace Inference Endpoints: Qwen3.5-9B (0.4469), Qwen3.5-4B (0.4108), and Qwen3.5-0.8B (0.3294). All run with reasoning disabled.
March 1, 2026: Added Nano Banana 2 (0.5727), FireRed-OCR (0.2592). Added Gemini 3 Flash (minimal reasoning) variant.
February 27, 2026: Added Seed 2.0 Mini (0.5257), ERNIE 4.5 VL 424B (0.4424), Llama 4 Maverick (0.3730), PaddleOCR-VL (0.2933), GLM-OCR (0.2660), Molmo-2 8B (0.0944). Added LLM post-processing (cleanup) experiments for several models.
February 26, 2026: Expanded benchmark with new handwriting and table samples. Europe share reduced from 58% to 44% of samples. Added Tesseract v5, LayoutParser, EfficientOCR, PP-OCRv5, and Mistral OCR. Added cost vs performance scatter plot.
February 25, 2026: Expanded benchmark with new handwriting samples. Launched v2 scoring with macro-averaging across Region, Period, and Format axes. Added Claude Sonnet 4.6, Claude Opus 4.6, and GPT-5.3 Codex.
February 24, 2026: Added five Qwen models via Alibaba Cloud Model Studio: Qwen3.5-27B (0.4340), Qwen3.5-35B (0.4296), Qwen3.5-Flash (0.4280), Qwen3.5-122B (0.4264), and Qwen-VL-OCR (0.3525). The four Qwen3.5 variants added here scored within a tight 0.426 to 0.434 range, with the smaller 27B model slightly outperforming the 122B. Qwen-VL-OCR, Alibaba's dedicated OCR model, underperformed the general-purpose Qwen3.5 models.
February 19, 2026: Added Gemini 3.1 Pro (low), which takes the #1 spot at 0.5965.
Fixed a bug in the evaluation script where models that wrapped output in markdown code fences
(e.g. ```markdown ... ```) had their entire response stripped instead of just the fence
markers.
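The fence-handling fix in the February 19 entry can be sketched like this. It is a sketch of the intended behavior (remove only the fence markers, keep the content), not the evaluation script itself. It assumes a single fence wrapping the whole response, and the fence marker is built from single backticks so the example can sit inside a fenced block.

```python
import re

TICKS = "`" * 3  # the three-backtick fence marker
FENCE = re.compile(
    r"^" + TICKS + r"[\w-]*[ \t]*\n(.*?)\n?" + TICKS + r"[ \t]*$",
    re.DOTALL,
)


def strip_code_fence(output: str) -> str:
    """Return the text inside a wrapping code fence, or the stripped text if unfenced."""
    text = output.strip()
    match = FENCE.match(text)
    return match.group(1) if match else text


table = "| a | b |\n|---|---|\n| 1 | 2 |"
fenced = TICKS + "markdown\n" + table + "\n" + TICKS
print(strip_code_fence(fenced) == table)
```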
February 16, 2026: Added Qwen3.5-397B in two variants: thinking disabled (0.4484) and thinking enabled (0.4598).
February 15, 2026: Added API cost columns, Gemini 2.5/2.0 Flash, ERNIE 4.5 VL, Llama 4 Maverick, dots.ocr-1.5, Qwen3-VL-8B, DeepSeek-OCR2, Kimi K2.5, Datalab, PP-OCRv5. Fixed max_tokens for thinking models.
Last updated: June 24, 2026