DinoSearch
15th February 2001
What is DinoSearch?
DinoSearch was initially conceived as a ``proof of concept''
application for the SuNFiSH system. Its
goal is to allow users to find technical papers relating to
vertebrate palaeontology by means of a variety of different
searching techniques across a federated hierarchy of databases
maintained by different content providers.
What searches does it support?
To some extent, that's open to discussion: we want to focus our
development effort on functionality that people actually want and
will really use. I'll shortly post an on-line questionnaire on this
site to gather opinions on what's needed.
That said, we will definitely support:
- Full-text searching - for example, a search for ``Currie'' will
find papers in which that name occurs in the body text or
references section, as well as papers of which Currie is the
author.
- Field-specific searching, so that it's possible to search only
for papers written by Currie, or only for papers which refer
to his papers, or only for papers in which the name occurs in
the body text.
- Date-range searching - for example, limiting a search to papers
published between 1985 and 1992.
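As a rough illustration of how these three kinds of search could fit together, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the Paper record, the field names (``author'', ``body'', ``references'', ``any'') and the sample papers are invented, and simple substring matching stands in for whatever the real servers do.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    # Hypothetical record layout, invented for this sketch.
    title: str
    authors: list[str]
    year: int
    body: str = ""
    references: list[str] = field(default_factory=list)


def field_text(paper, field_name):
    """Return the text of one field, or of all fields for "any"."""
    parts = {
        "author": " ".join(paper.authors),
        "body": paper.body,
        "references": " ".join(paper.references),
    }
    if field_name == "any":
        return " ".join(parts.values())
    return parts[field_name]


def search(papers, term, field_name="any", year_from=None, year_to=None):
    """Case-insensitive substring search with optional field and date limits."""
    term = term.lower()
    hits = []
    for paper in papers:
        if year_from is not None and paper.year < year_from:
            continue
        if year_to is not None and paper.year > year_to:
            continue
        if term in field_text(paper, field_name).lower():
            hits.append(paper)
    return hits


# Invented sample data, not real publications.
PAPERS = [
    Paper("Paper A", ["Currie"], 1987, body="theropod skull"),
    Paper("Paper B", ["Smith"], 1990, body="as Currie has shown"),
    Paper("Paper C", ["Jones"], 1995, body="sauropod vertebrae",
          references=["Currie 1987"]),
]

if __name__ == "__main__":
    for hit in search(PAPERS, "Currie", year_from=1985, year_to=1992):
        print(hit.title)
```

With field_name left at ``any'' this behaves as a full-text search; passing ``author'', ``body'' or ``references'' narrows it to one field, and the two year arguments apply the date range.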
Some candidates for additional searching facilities:
- Search-term broadening - for example, when you search for
``prezygapophyses'' or ``postzygapophyses'', the system also
finds papers that refer to ``zygapophyses''.
- Search-term narrowing - for example, when you search for
``zygapophyses'', the system also finds papers that refer to
``prezygapophyses'' or ``postzygapophyses''.
- Search-term translation - for example, when you search for
``manus'', the system also finds papers that refer to ``hand''
and vice versa.
- Relevance feedback, or ``find more like this'': having located
one or more papers that interest you, you ask the system for
others which cover similar topics, and it uses a fuzzy matching
algorithm on the keywords, abstract and/or full text of the
selected papers to derive a suitable set of search terms.
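The first three candidates can all be seen as expanding the user's search term before the search is run. The sketch below shows one way that might look, assuming a small hand-built thesaurus; the BROADER and SYNONYMS tables and the expand function are hypothetical names made up for this example, and contain only the terms mentioned above.

```python
# Hypothetical thesaurus: each narrow term maps to its broader term.
BROADER = {
    "prezygapophyses": "zygapophyses",
    "postzygapophyses": "zygapophyses",
}

# Hypothetical groups of terms treated as translations of each other.
SYNONYMS = [
    {"manus", "hand"},
]


def expand(term, broaden=False, narrow=False, translate=False):
    """Return the sorted list of terms to search for, including the original."""
    terms = {term}
    if broaden and term in BROADER:
        terms.add(BROADER[term])
    if narrow:
        terms.update(n for n, b in BROADER.items() if b == term)
    if translate:
        for group in SYNONYMS:
            if term in group:
                terms |= group
    return sorted(terms)


if __name__ == "__main__":
    print(expand("prezygapophyses", broaden=True))
    print(expand("zygapophyses", narrow=True))
    print(expand("manus", translate=True))
```

Each expanded term would then be searched for in the usual way and the results merged. Relevance feedback is a different kind of problem and is not covered by this sketch.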
What do you get back from a search?
Different repositories will provide different sorts of result.
- The ideal result is of course a downloadable soft copy of the
paper - some publications already provide this service, and
more are likely to do so in the future. Some may charge for
this, others may not.
- The next best result may be a copy of the abstract together
with a definitive reference, ideally with contact details for
the journal which published the paper.
- The ``backstop'' result should be a definitive reference.
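One way a client might deal with these three levels of result is to classify each record by the best thing it offers and show the richest results first. The following is only a sketch: the Result record, its field names and the example.org URL are placeholders invented for the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    # Hypothetical record: every result carries at least a reference.
    reference: str
    abstract: Optional[str] = None
    full_text_url: Optional[str] = None


# Tiers in order of preference, best first.
TIERS = ["full text", "abstract", "reference"]


def result_tier(result):
    """Name the best kind of content this result offers."""
    if result.full_text_url:
        return "full text"
    if result.abstract:
        return "abstract"
    return "reference"


def best_first(results):
    """Sort results so the richest ones come first (stable within a tier)."""
    return sorted(results, key=lambda r: TIERS.index(result_tier(r)))


# Invented sample data; the URL is a placeholder.
RESULTS = [
    Result("Reference 1"),
    Result("Reference 2", abstract="An abstract."),
    Result("Reference 3", abstract="An abstract.",
           full_text_url="https://example.org/paper3.pdf"),
]

if __name__ == "__main__":
    for r in best_first(RESULTS):
        print(result_tier(r), "-", r.reference)
```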
How do organisations provide content?
Papers, abstracts and references are made available via the Z39.50
Information Retrieval protocol. Since many journals, museums and
other content providers do not run Z39.50 servers, the DinoSearch
project will provide help to organisations wishing to provide
content in any of the following ways:
- For organisations which already have their own literature
database, we will work with them to build a ``gateway'' that
interfaces with that database and presents the information via
Z39.50.
- For organisations which do not have a literature database but
would like one, we will provide software for building,
administering and updating a database, and serving its
contents via Z39.50.
- For organisations which do not have or want their own database,
we will create, administer and update our own database of
their material.
A fee may be charged for some or all of these services, or they may
be provided for free. Individual cases will be judged on their
merits.
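To give a feel for what a ``gateway'' in front of an existing literature database involves, here is a small sketch of the adapter idea. It assumes, purely for the example, that the provider's database is an SQLite file with a table of citations; the SqliteGateway class and the table and column names are invented. A real gateway would present its results via Z39.50, which this sketch leaves out entirely.

```python
import sqlite3


class SqliteGateway:
    """Hypothetical adapter that gives an existing table a uniform search()."""

    def __init__(self, conn, table, column):
        # The table and column names are assumed to come from trusted
        # configuration, since they are interpolated into the SQL below.
        self.conn = conn
        self.table = table
        self.column = column

    def search(self, term):
        """Return the citations containing the term, in alphabetical order."""
        sql = (
            f"SELECT {self.column} FROM {self.table} "
            f"WHERE {self.column} LIKE ? ORDER BY {self.column}"
        )
        rows = self.conn.execute(sql, (f"%{term}%",))
        return [row[0] for row in rows]


def make_sample_database():
    """Build an in-memory stand-in for a provider's database (invented data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE papers (citation TEXT)")
    conn.executemany(
        "INSERT INTO papers VALUES (?)",
        [
            ("Currie on theropods",),
            ("Notes on sauropods",),
            ("Currie on hadrosaurs",),
        ],
    )
    return conn


if __name__ == "__main__":
    gateway = SqliteGateway(make_sample_database(), "papers", "citation")
    print(gateway.search("currie"))
```

The point of the design is that the provider's database stays as it is: only the thin adapter needs to know its layout.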
How is DinoSearch implemented?
DinoSearch is an application of the SuNFiSH system, which is a
hierarchical network of clients, brokers and servers.
- Each server contains information about some papers.
- Each broker aggregates data from several sources: those sources
may be servers, other brokers from lower in the hierarchy, or
any mixture of servers and brokers.
- Clients provide a UI, allowing users to enter search criteria
and displaying the results of the work that the brokers and
servers do to find the information matching those criteria.
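The client/broker/server relationship can be sketched in a few lines. This is an illustration under simplifying assumptions, not the SuNFiSH implementation: the Server and Broker classes and the sample data are invented, papers are reduced to plain strings, and ordinary method calls stand in for the network protocol.

```python
class Server:
    """Holds information about some papers (here, just citation strings)."""

    def __init__(self, name, papers):
        self.name = name
        self.papers = papers

    def search(self, term):
        return [p for p in self.papers if term.lower() in p.lower()]


class Broker:
    """Aggregates results from servers and/or lower-level brokers."""

    def __init__(self, name, sources):
        self.name = name
        self.sources = sources

    def search(self, term):
        merged = []
        seen = set()
        for source in self.sources:
            for hit in source.search(term):
                # Drop duplicates reported by more than one source.
                if hit not in seen:
                    seen.add(hit)
                    merged.append(hit)
        return merged


def run_client(top, term):
    """Minimal stand-in for a client UI: search and format the results."""
    return "\n".join(f"{i}. {hit}" for i, hit in enumerate(top.search(term), 1))


# Invented sample hierarchy: one broker sits above another broker and a server.
museum = Server("museum", ["Currie on theropods", "Notes on sauropods"])
journal = Server("journal", ["Currie on theropods", "Currie on hadrosaurs"])
regional = Broker("regional", [museum])
top = Broker("top", [regional, journal])

if __name__ == "__main__":
    print(run_client(top, "currie"))
```

Because a broker offers the same search method as a server, brokers can be stacked to any depth without the client needing to know how many layers sit beneath it.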
The SuNFiSH architecture is described in much more detail in the
document Multi-Lingual Search - Overview. A sample
client/broker/server hierarchy is shown in the message DinoSearch
Architecture.