How to use OleehyO/TexTeller with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0"'
from transformers import pipeline
pipe = pipeline("image-to-text", model="OleehyO/TexTeller")

# Load model directly
from transformers import AutoTokenizer, AutoModelForImageTextToText
tokenizer = AutoTokenizer.from_pretrained("OleehyO/TexTeller")
model = AutoModelForImageTextToText.from_pretrained("OleehyO/TexTeller")

More test images and a side-by-side comparison of recognition models from different companies are available here.
TexTeller is a ViT-based model designed for end-to-end formula recognition. It can recognize formulas in natural images and convert them into LaTeX-style formulas.
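As a rough sketch of the image-to-LaTeX step, the helper below runs generation and decodes the output tokens into strings. It assumes `model` and `tokenizer` have already been loaded with `from_pretrained`, and that `pixel_values` is an image batch already preprocessed the way the model expects. The preprocessing is not shown here; the TexTeller GitHub repository is the place to look for it. The helper name `recognize_formulas` and the `max_new_tokens` default are illustrative and not part of TexTeller.

```python
def recognize_formulas(model, tokenizer, pixel_values, max_new_tokens=256):
    """Generate LaTeX strings for a batch of preprocessed formula images.

    Assumptions (not confirmed by the model card):
      * `model` exposes the standard Transformers `generate` method and
        accepts the image batch through the `pixel_values` keyword.
      * `tokenizer` exposes the standard `batch_decode` method.
      * `pixel_values` was produced by preprocessing that matches the model.
    """
    # Autoregressively generate token ids for each image in the batch.
    token_ids = model.generate(
        pixel_values=pixel_values,
        max_new_tokens=max_new_tokens,  # illustrative cap on output length
    )
    # Turn token ids back into text, dropping special tokens such as BOS/EOS.
    decoded = tokenizer.batch_decode(token_ids, skip_special_tokens=True)
    # Strip surrounding whitespace so each result is a clean LaTeX string.
    return [text.strip() for text in decoded]
```

A call would look like `recognize_formulas(model, tokenizer, pixel_values)` and return one LaTeX string per input image. Keeping preprocessing outside the helper means the same function works whether the images come from a dataset loader or a single file.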
TexTeller is trained on a larger dataset of image-formula pairs (a 550K dataset available here) and shows better generalization and higher accuracy than LaTeX-OCR, which uses approximately 100K data points. This larger dataset enables TexTeller to cover most usage scenarios more effectively.
For more details, please refer to the TexTeller GitHub repository.