I wrote a little hack called Synonymizer, a Python-based CGI script that lets the reader create a synonym file suitable for use in Solr. From the blog posting [1]:
Here is how Synonymizer works. First it reads a configured
database of previously generated synonyms. In the beginning,
this file is empty but must be readable and writable by the HTTP
server. Second, Synonymizer reads the database and offers the
reader to: 1) create a new set of synonyms, 2) edit an existing
synonym, or 3) generate a synonym file. If Option #1 is chosen,
then input is garnered, and looked up in WordNet. The script will
then enable the reader to disambiguate the input through the
selection of apropos definitions. Upon selection, both WordNet
hyponyms and hypernyms will be returned. The reader then has the
opportunity to select desired words/phrase as well as enter any
of their own design. The result is saved to the database. The
process is similar if the reader chooses Option #2. If Option #3
is chosen, then the database is read, reformatted, and output to
the screen as a stream of text to be used on Solr or something
else that may require similar functionality. Because Option #3 is
generated with a single URL, it is possible to programmatically
incorporate the synonyms into your Solr indexing process
pipeline.
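As a rough illustration of Option #3, here is a minimal sketch of turning saved synonym sets into the comma-separated lines Solr accepts in a synonyms file. The in-memory layout (a dict mapping a headword to a list of chosen words) and the function name `to_solr_synonyms` are assumptions made for this example; they are not taken from Synonymizer itself, whose database format is not described here.

```python
def to_solr_synonyms(synonym_sets):
    """Format synonym sets as text for a Solr synonyms file.

    synonym_sets is a hypothetical structure: a dict mapping a headword
    to a list of words or phrases chosen for it. Each output line lists
    equivalent terms separated by commas, one set per line.
    """
    lines = []
    for headword, synonyms in sorted(synonym_sets.items()):
        # Put the headword first, then drop blanks and duplicates
        # while keeping the order in which terms were entered.
        terms = []
        for term in [headword, *synonyms]:
            term = term.strip()
            if term and term not in terms:
                terms.append(term)
        # A set with a single term carries no synonym information.
        if len(terms) > 1:
            lines.append(", ".join(terms))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    example = {
        "dog": ["canine", "hound"],
        "love": ["affection", "devotion"],
    }
    print(to_solr_synonyms(example), end="")
```

The resulting text could be saved as the synonyms file that a Solr synonym filter is configured to read. This sketch does not handle terms that themselves contain commas.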
For a limited time, one can play with Synonymizer in a sandbox [2].
[1] blog posting - http://blogs.nd.edu/emorgan/2017/01/synonymizer/
[2] sandbox - http://dh.crc.nd.edu/sandbox/synonymizer/
—
Eric Morgan
University of Notre Dame