I found this book helped me get my head around Solr: https://www.packtpub.com/big-data-and-business-intelligence/apache-solr-beginner%E2%80%99s-guide. Chapter 8 explains indexing rich text formats including MS Word.

Chris Gray
Systems Analyst
519-888-4567, ext. 35764
University of Waterloo

On 15-02-10 11:12 AM, Eric Lease Morgan wrote:
> Can somebody point me to a good tutorial on how to index Word documents using Solr?
>
> I have a few hundred Microsoft Word documents I want to search. Through the use of the Tika library it seems as if I ought to be able to index my Word documents directly into Solr, but none of the tutorials I have found on the Web are complete. Missing directories. Missing files. Documentation for versions unreleased. Etc.
>
> Put another way, Tika can create a (nice) XHTML file complete with some useful metadata that can all be fed to Solr for indexing, but I can barely get out of the starting gate. Have you indexed Word documents using Solr, and if so, then how?
>
> --
> Eric Morgan
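
For the question quoted above, here is a minimal sketch of one common route: posting each Word file to Solr's extracting request handler (`/update/extract`, also known as Solr Cell), which passes the file to Tika on the server side. It is not taken from the book. The Solr address, the core name `docs`, and the choice of the file name as the document id are all assumptions for illustration, and the handler has to be enabled in your `solrconfig.xml` for any of this to work.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Assumptions (not from the thread): a local Solr instance, a core named
# "docs", and the extracting request handler enabled at /update/extract.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "docs"

# MIME types sent as a hint to Tika for the two Word formats.
WORD_TYPES = {
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument"
             ".wordprocessingml.document",
}


def build_extract_url(doc_id, base=SOLR_BASE, core=CORE, commit=True):
    """Return the /update/extract URL that stores doc_id in the id field."""
    params = {"literal.id": doc_id, "commit": "true" if commit else "false"}
    return f"{base}/{core}/update/extract?{urllib.parse.urlencode(params)}"


def index_word_file(path, base=SOLR_BASE, core=CORE):
    """POST the raw bytes of one Word file and return the HTTP status code."""
    path = Path(path)
    request = urllib.request.Request(
        build_extract_url(path.name, base, core),
        data=path.read_bytes(),
        headers={"Content-Type": WORD_TYPES[path.suffix.lower()]},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_directory(directory):
    """Index every .doc and .docx file directly under the given directory."""
    statuses = {}
    for path in sorted(Path(directory).iterdir()):
        if path.suffix.lower() in WORD_TYPES:
            statuses[path.name] = index_word_file(path)
    return statuses
```

Calling `index_directory("/path/to/word/files")` (a hypothetical path) would send each file in turn. Committing after every file keeps the sketch simple; for a few hundred documents you may prefer `commit=False` and a single commit at the end.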