An Evaluation of Linguistically-motivated Indexing Schemes

Avi Arampatzis    Th.P. van der Weide    C.H.A. Koster    P. van Bommel
Technical Report CSI-R9927, December 1999, Dept. of Information Systems and Information Retrieval,
University of Nijmegen, The Netherlands.

Proceedings of BCS-IRSG 2000 Colloquium on IR Research, 5th-7th April 2000, Sidney Sussex College, Cambridge, England. To appear.

January 19, 2000


In this article, we describe a number of indexing experiments that use indexing terms other than simple keywords. These experiments were conducted as one step in validating a linguistically-motivated indexing model. The problem is important but not new; what is new in this approach is the variety of schemes evaluated. It is important because linguistically-motivated indexing should help to overcome not only the well-known problems of bag-of-words representations, but also the difficulties raised by non-linguistic text simplification techniques such as stemming, stop-word deletion, and term selection. Our approach to the selection of terms is based on part-of-speech tagging and shallow parsing. The indexing schemes evaluated range from simple keywords to nouns, verbs, adverbs, adjectives, adjacent word-pairs, and head-modifier pairs. Our findings apply to Information Retrieval and most related areas.


avi (dot) arampatzis (at) gmail