---------- Forwarded message ----------
From: Yashar Moshfeghi <[log in to unmask]>
Date: Fri, Apr 14, 2017 at 6:49 AM
Subject: [Dbworld] CFP: Lucene for Information Access and Retrieval
Research (LIARR 2017)
To: [log in to unmask]


=======================================================
1st Workshop on
Lucene for Information Access and Retrieval Research (LIARR 2017)
in conjunction with SIGIR 2017: 40th Annual ACM SIGIR Conference

CALL FOR PAPERS

https://liarr2017.github.io/
11 August 2017, Tokyo, Japan
=======================================================


IMPORTANT DATES

Pitches and Demos submission: 20 June 2017
Notification of acceptance: 10 July 2017
Camera ready: 15 July 2017
Workshop: 11 August 2017


OVERVIEW

``less yakking, more hacking''

Open Source Information Retrieval (IR) has attracted a lot of attention and
in turn many toolkits. Over the years, Lucene and its expansions PyLucene,
Solr and Elasticsearch have grown to be the dominant Open Source
Information Retrieval toolkits used in industry. However, unlike the Open
Source IR toolkits developed by academics (e.g., Indri, Lemur, Terrier,
Wumpus and Zettair), Lucene et al. have been less focused on evaluation and
experimentation and so are not as well developed for Information Retrieval
(IR) research and evaluation. For example, it is not particularly clear how
to perform TREC-based evaluations using such toolkits or how to modify the
underlying code bases to experiment with new methods and retrieval models.

However, two recent initiatives, Anserini and Lucene4IR, are developing
add-ons that let IR researchers work with Lucene, alongside a raft of other
independent code bases. So it is timely to bring the community together and
see how we can develop these resources collaboratively. By working together
and with the Open Source community that supports Lucene, the IR community
can have greater impact on industry, because we will be able to transfer
knowledge more efficiently, increase the reproducibility of the methods we
develop, and encourage greater collaboration between academia and industry.

The purpose of this proposed workshop is to bring together the community of
researchers using Lucene and its derivatives like Solr and Elasticsearch
(referred to simply as ``Lucene'' below), and develop tools for IR research.
Rather than having a ``mini-conference'', the workshop will be more like a
hackathon where participants will learn about Lucene and work on code.
Presentations are meant only as a tool for structuring and guiding the
efforts of attendees.

The goals of this workshop are:

* to create a development plan and common codebase for IR research with
Lucene,
* to implement various information retrieval methods in
Lucene/Solr/Elasticsearch and
* to evaluate the quality of such methods and models.


SCOPE AND TOPICS

The aim is to take state-of-the-art methods and develop prototype
implementations, where we will focus on:

* exposing the standard functions that we need to have access to when we
want to code up a retrieval model;
* adding the core retrieval functions that are not there already;
* providing an understanding of how some of the functions are implemented in
Lucene and how they deviate from how people know them in IR (e.g., field
search in Elasticsearch);
* providing a roadmap and a set of guidelines to researchers and developers
on which models/algorithms/techniques the community should include next in
Lucene and how this should be done.


SUBMISSION GUIDELINES

We seek submissions and contributions that describe and detail how to
develop various components, algorithms, etc. using Lucene-based tools,
e.g. how-to guides, overviews of code developed, etc. We also seek pitches
that outline and describe components that participants would like to have
in Lucene-based tools, e.g. different parsers, learning to rank, TREC
indexers, etc.

Essentially, we would like to provide participants with the opportunity to
showcase some of the tools that they have been developing using Lucene et
al., to provide training on how to use the different functionality Lucene
et al. provide, and to suggest directions on what we should hack.

* Pitches: 1-2 page outlines of the algorithms/components/features/etc to
be developed, brief rationale for creating them, sketch of how they might
be implemented, links to relevant papers, and existing code, along with
other relevant information, i.e. how it might lead to reproducibility
experiments, how it could be used, etc.

* Demos: 1-2 page outlines of demos/tools/code developed, what it does, how
it works, etc.

Submissions should be uploaded to EasyChair via
https://easychair.org/conferences/?conf=liarr2017, where they will be
reviewed by the organizers, and a coherent set will be chosen for
presentation. However, we will be as inclusive as possible, and include all
acceptable works in the proceedings to showcase the work being undertaken.

All pitches and demos should be in ACM format.


ORGANIZERS

* Leif Azzopardi (University of Strathclyde)
* Matt Crane (University of Waterloo)
* Hui Fang (University of Delaware)
* Grant Ingersoll (Lucidworks)
* Jimmy Lin (University of Waterloo)
* Yashar Moshfeghi (University of Glasgow)
* Harrisen Scells (Queensland University of Technology)
* Peilin Yang (University of Delaware)
* Guido Zuccon (Queensland University of Technology)