How does Leo work?

Leo is powered by a state-of-the-art AI model built on Transformer technology, the same architecture behind general-purpose language models like ChatGPT and Claude, but fine-tuned to accurately extract text from images of historical documents. The model runs on a specialized processor called a GPU (Graphics Processing Unit), which converts the image a user uploads into numerical data, processes it, and sends readable text back to their computer.

Guided by vast amounts of training data, the model breaks the image of a document down into numerical patterns and recognizes relationships between different parts of the text. The key mechanism that enables this is called attention, specifically "self-attention." A human reader might work through a challenging manuscript by using context from the surrounding words to infer what an unclear letter or word should be. Transformers do something similar, mathematically weighing how important each part of the text is in relation to the rest.

The final output is the model's best estimate of what was written. Through repeated refinement, the model learns to recognize patterns more effectively and so improves its accuracy over time.
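
To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product self-attention in Python with NumPy. This illustrates the general mechanism described above, not Leo's actual code; the projection matrices (Wq, Wk, Wv), the dimensions, and the toy inputs are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating, for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of vectors.

    X          : (seq_len, d_model) inputs, one row per token or image patch.
    Wq, Wk, Wv : learned projection matrices, each (d_model, d_k).
    """
    Q = X @ Wq                                 # what each position is looking for
    K = X @ Wk                                 # what each position offers as context
    V = X @ Wv                                 # the information actually passed along
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # pairwise relevance of every position to every other
    weights = softmax(scores, axis=-1)         # each row sums to 1: how much to "attend" to each position
    return weights @ V                         # context-weighted mix for every position

# Toy example: 4 "tokens" (think: patches of a manuscript line),
# each represented by an 8-dimensional vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): each position now blends context from all the others
```

The attention weights are the mathematical analogue of the human reader above: an ambiguous position borrows information from the positions it finds most relevant, rather than being decoded in isolation.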
