Tim Allison
Tim has been working in content/metadata extraction (and evaluation), advanced search and relevance tuning for more than 20 years. Tim currently works at elastic as a search relevance engineer. He is a member of the Apache Software Foundation (ASF), the chair/VP of Apache Tika, and a committer on Apache OpenNLP, Nutch, Stormcrawler, Lucene/Solr, PDFBox and Apache POI. Tim holds a Ph.D. in Classical Studies, and in a former life, he was a professor of Latin and Greek.
Session
Apache Tika 4.x is a generational leap — nearly two decades of battle-tested document processing, engineered for the modern AI stack. Purpose-built for RAG and hybrid search, it delivers scalable, fault-tolerant pipelines, VLM hooks for OCR and image understanding, and structure-aware chunking and embedding integrations for RAG-ready output.