PubMed is one of the National Library of Medicine's free Web-based MEDLINE
alternatives and is viewed as an eventual replacement for Elhill command
language searching. This review examines many of PubMed's search features,
their advantages and disadvantages, and PubMed's suitability as a replacement
for Elhill. Non-search features and comparisons with other MEDLINE products
are addressed only briefly.
PubMed is available from the National Center for Biotechnology Information's
Web site (www.ncbi.nlm.nih.gov/PubMed) and provides access to MEDLINE,
PreMEDLINE, and the molecular biology databases
included in NCBI's Entrez retrieval system. One of the major enhancements of
PubMed over Internet Grateful Med (IGM) is the linkage of MEDLINE and
PreMEDLINE records to medical publishers' Web sites in order to provide
access to the full text of journal articles. The references in these articles
would in turn link back to PubMed records to complete the circle. To date,
there are links to 72 full-text journals. However, only 10 are freely available
in their entirety; two others are available selectively, and the other 60
require a subscription. PubMed will include records for all articles in
participating publishers' journals, even when a journal is only selectively
indexed by NLM and some of its articles are never included in MEDLINE.
Basic Searching
The PubMed home page features the basic search, which allows for
word searching in all available fields (including MeSH). Words are
automatically ANDed together and can include authors' last names and
initials, full journal titles, title abbreviations, or ISSNs. A limited
number of phrases are included in the basic index and can be searched by
using quotes if the initial retrieval is suboptimal. Sample phrases
include "rheumatoid arthritis in children" and "cystic
fibrosis of the pancreas". Even some very unusual combinations like
"rare are" and "rare probably", which contain
stopwords, are included. A nonintuitive syntax must be used with phrases
containing apostrophes because apostrophes are replaced by spaces in the
index. Therefore, "Bell's palsy" must be searched as "bell
s palsy" in order for records containing this phrase to be retrieved.
Keep in mind that although many phrases are indexed, this is not true
adjacency searching, and most multi-word phrases will retrieve no
documents. Right-hand truncation with an asterisk is supported.
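The apostrophe rule described above can be illustrated with a short Python sketch. It only mimics the behavior reported in this review (apostrophes replaced by spaces); the function name and the lowercasing step are assumptions made for illustration, not a description of how PubMed's index is actually built.

```python
def normalize_phrase(phrase: str) -> str:
    """Rewrite a phrase the way the review says it must be typed.

    Assumption: apostrophes become single spaces and case is ignored.
    This is a hypothetical helper, not part of PubMed.
    """
    # Cover both the straight and the typographic apostrophe.
    for apostrophe in ("'", "\u2019"):
        phrase = phrase.replace(apostrophe, " ")
    return phrase.lower()


print(normalize_phrase("Bell's palsy"))  # bell s palsy
```

A searcher could run a phrase through a helper like this before pasting it, in quotes, into the basic search box.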
Searchers who wish to field-tag every word or phrase can construct a very
complex single-statement search on the basic search screen. However, this is
very tedious, and the advanced search offers a slightly easier alternative. A
field-tagged search also requires that the Boolean operators be in capital
letters, an important point not mentioned on the Help page. This appears to be
the only place where case sensitivity is important.
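As a rough illustration of the capitalization requirement, the following Python sketch assembles a single field-tagged statement and forces the Boolean operator into capital letters. The quoting style follows the one tagged example given later in this review; the helper name and the field tags `la` and `mh` are placeholders assumed for the example and should be checked against the PubMed Help page.

```python
def tagged_query(terms, operator="AND"):
    """Join (term, field_tag) pairs into one field-tagged statement.

    Hypothetical helper: the tag names passed in are the caller's
    responsibility and are not validated against PubMed.
    """
    # The review notes that operators must be capitalized in tagged searches.
    op = operator.upper()
    if op not in {"AND", "OR", "BUTNOT"}:
        raise ValueError(f"unsupported operator: {operator}")
    parts = [f'"{term}" [{tag}]' for term, tag in terms]
    return f" {op} ".join(parts)


# 'la' and 'mh' are assumed tag names used only for illustration.
print(tagged_query([("english", "la"), ("human", "mh")], operator="and"))
```

Building the string programmatically removes the risk of typing a lowercase operator by hand, which is the error the Help page does not warn about.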
Advanced Searching
The advanced search mode, a link off the PubMed home page, includes all of
the basic features noted above, and, in addition, offers the capability of
specifying a field for each word or phrase searched. A search mode must
also be selected, either Automatic or List Terms. Automatic
mode proceeds directly to the search, while List Terms mode presents an index
of the selected field at the point where the first search term would fit
alphabetically. Once one or more terms have been selected from this list,
the search is executed. However, rather than going directly to the results
screen, PubMed presents an intermediate screen that allows for modifications.
This page is laid out in three sections. At the top is a current
query line where search terms have been ANDed together by default and
a button leads directly to the retrieved records. A middle section, add
terms to query, is the same as the original search screen. The bottom
section, modify current query, keeps a history of all previous
search statements, similar to the Ovid main search screen, and allows the
default AND operator to be overridden by OR, BUTNOT, or a range.
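The way earlier search statements are combined can be pictured as ordinary set operations on the record identifiers each statement retrieved. The Python sketch below is an analogy only: the function name and the record numbers are invented for illustration, and the range operator is left out because it applies to field values rather than to retrieved sets.

```python
def combine(left: set, right: set, operator: str) -> set:
    """Combine two result sets the way AND, OR, and BUTNOT are described."""
    op = operator.upper()
    if op == "AND":
        return left & right   # records present in both statements
    if op == "OR":
        return left | right   # records present in either statement
    if op == "BUTNOT":
        return left - right   # records in the first but not the second
    raise ValueError(f"unsupported operator: {operator}")


# Invented record numbers standing in for two search statements.
statement_1 = {101, 102, 103, 104}
statement_2 = {103, 104, 105}

print(sorted(combine(statement_1, statement_2, "AND")))     # [103, 104]
print(sorted(combine(statement_1, statement_2, "OR")))      # [101, 102, 103, 104, 105]
print(sorted(combine(statement_1, statement_2, "BUTNOT")))  # [101, 102]
```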
The advanced search mode offers most of the features needed by the
advanced searcher, but there is still room for improvement, particularly
in the areas of field searching flexibility and Web page design. First of
all, not all MEDLINE fields available in Elhill are available in PubMed.
Among the missing fields are those for journal subsets, country of
publication, date of entry, entry month, and personal name as subject.
Without an entry month field, it is impossible to update searches
originally run on another vendor's system such as Ovid; in my view, adding
this field is the single most important improvement needed in future PubMed
versions. There is no separate subheading field in PubMed, but all
subheadings are searchable from the MeSH field and can be either
free-floating or attached directly to specific headings.
The typical search limits, like those available in Internet Grateful Med
(IGM) and Ovid, are not as easy to use in PubMed because each limit must
first be searched in its own field and then ANDed to the other statements
(e.g., English must be searched in the language field and
human in the MeSH field). The only exception to this is the
publication date limit, which allows one to limit output to the
most recent 30, 60, 90, or 180 days or 1, 2, 5, or 10 years. However,
these intervals are measured from the current calendar date, and so a
limit to the past year would retrieve records from mid-1996 to mid-1997.
The only way to limit to a particular year is to enter it into the
publication date field and AND it to the other search statements. Ranging
is even more complex. One must either enter the beginning and ending years
separately and use the RANGE operator to combine them or enter directly a
field-tagged search statement ("1993" THRU "1997" [pdat]). It would be a
welcome improvement if these options were explained in more detail on the
Help page. Even better would be a change in the publication date
limit feature so that specific publication months and years could be
entered instead of the most-recent intervals. This would then correspond
to the date limits traditionally used by database vendors and to the way
searches are requested by all of my clients and probably those of other
searchers as well.
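The difference between a limit measured back from the current date and a limit to specific calendar years can be made concrete with a small Python sketch. The "today" value of 30 June 1997 is an assumed date chosen to match the mid-1997 example above, and the helper names are hypothetical; only the range string format is taken from the example quoted in this review.

```python
from datetime import date, timedelta


def relative_window(today: date, days: int) -> tuple:
    """Window covered by a 'most recent N days' limit counted from today."""
    return today - timedelta(days=days), today


def calendar_year_window(year: int) -> tuple:
    """Window a client usually means when asking for a single year."""
    return date(year, 1, 1), date(year, 12, 31)


def pdat_range(start_year: int, end_year: int) -> str:
    """Field-tagged range statement in the form quoted in the review."""
    return f'"{start_year}" THRU "{end_year}" [pdat]'


assumed_today = date(1997, 6, 30)  # assumption for illustration only
print(relative_window(assumed_today, 365))  # starts 1996-06-30, ends 1997-06-30
print(calendar_year_window(1996))           # starts 1996-01-01, ends 1996-12-31
print(pdat_range(1993, 1997))               # "1993" THRU "1997" [pdat]
```

The first two windows overlap only partly, which is why a "past year" limit cannot stand in for a request such as "everything published in 1996".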
Medical subject heading (MeSH) searching is available in either the
Automatic or List Terms mode. As with Elhill, Automatic
mode maps from the see references in the MeSH thesaurus to the
correct subject headings. However, a large number of inverted headings
do not have see references, and so use of the natural language
phrase, like "rheumatoid arthritis" for "arthritis,
rheumatoid", will result in zero retrieval. In List Terms mode,
there is no mapping at all. The first word of the desired heading must
be entered correctly so that the list box of MeSH terms begins at the
proper alphabetical location. One can choose to search MeSH
headings that are restricted to focus or those that are not, but
all headings in PubMed are automatically exploded, as they are in
IGM. There is no way to search a single heading by itself if it has
narrower terms associated with it (although "all fields" and "textword"
searching will search heading words in addition to other fields).
Unlike headings, subheadings are not automatically exploded. There is no
tree structure thesaurus or permuted index in PubMed, an inconvenience for
advanced searchers. By comparison, IGM includes the tree structures and
Ovid includes both.
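For readers unfamiliar with explosion, the following Python sketch contrasts an exploded search, which gathers a heading together with all of its narrower terms, with the unexploded search that PubMed does not currently allow. The three-level hierarchy is a simplified illustration and should not be read as the actual MeSH tree structure; the function and variable names are hypothetical.

```python
# Simplified, illustrative hierarchy: broader heading -> narrower headings.
NARROWER = {
    "Arthritis": ["Arthritis, Rheumatoid"],
    "Arthritis, Rheumatoid": ["Arthritis, Juvenile Rheumatoid"],
}


def explode(heading: str, narrower: dict = NARROWER) -> list:
    """Return the heading plus every narrower heading beneath it."""
    found = [heading]
    for child in narrower.get(heading, []):
        found.extend(explode(child, narrower))
    return found


def unexploded(heading: str) -> list:
    """Return the single heading only (the option missing from PubMed)."""
    return [heading]


print(explode("Arthritis"))
print(unexploded("Arthritis"))
```

With automatic explosion, a search on the broadest heading always pulls in records indexed under every narrower heading, which raises recall but prevents a precise search on the general concept alone.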
Other search features that would be welcome enhancements to PubMed
include:
True adjacency searching in titles and abstracts
Search statement numbering in the modify current query box
so that nested statements could be used to improve efficiency in
complex searching
Greater flexibility in specifying fields to be displayed, printed or
saved and the option to include the search strategy in printed/saved
results
Publication date limits that can be applied to the retrieval of
related records, not just the initial search results
The option to delete statements from the search history and save
search strategies
For first-time users of the advanced search, understanding and
negotiating the several Web page layouts takes some time. This is
probably unavoidable because a large number of buttons and boxes are
contained on a single page that cannot be viewed completely without
scrolling. However, addressing the following points on the Help
pages would make the job much easier:
In List Terms mode, if the searcher decides not to select a term from the
list box, the only way to continue the search is to change back to
Automatic mode; otherwise, clicking on the search button simply
redisplays the same screen.
The modify current query box at the bottom of the Web page
offers a great deal of flexibility in combining sets, but users should be
reminded that the control key can be used to select more than one term,
and they need not be adjacent.
When browsing, selecting, and printing records it is best to set the
display number high so that all of the records in a given set will
be displayed on the same page. This minimizes the number of Web pages
that must be retrieved and makes it much easier to keep your place.
The best feature of PubMed is probably its links to additional citations
most closely related to selected records. This is a user-friendly option
that is accomplished by "comparing text and MeSH terms of each
article using a powerful algorithm" and is not available with Elhill,
Ovid, or IGM. The details of the algorithm are not specified, an
omission that one hopes will be remedied in future versions. Presumably,
frequently used MeSH terms and textwords are mixed and
matched to create a ranked list of related records. There is no mention
that the controlled vocabulary is supplemented or expanded with additional
terms. This algorithm is analogous to what an experienced searcher might
do to formulate a comprehensive search strategy, but it is not yet a
replacement for the professional.
As a sample search, I checked PubMed for records dealing with the
prevention of pedestrian accidents, selecting a record with all
three concepts in the title (Prevention of pedestrian
accidents, Arch Dis Child 68(5):669-672, 1993) as the basis for a
"related articles" search. I limited the output to the years
1993-1995 so that database currency would not be a confounding issue and
retrieved 51 records. An Ovid search on the same subject yielded 76
records, 15 appearing on both lists. PubMed's relevance ranking was very
good: 14 of the 15 duplicate records were in the
top 21 of the PubMed list and only one was in the bottom 30. The 36
PubMed records not retrieved by the Ovid search were missing either the
"pedestrian" or the "prevention" concept and appeared
not to be relevant. Conversely, of the 61 Ovid records not on the PubMed
list, there were quite a few that appeared relevant, including two with
the major concepts in the title:
The Harstad injury prevention study: hospital-based injury recording
used for outcome evaluation of community-based prevention of bicyclist
and pedestrian injury. Scandinavian Journal of Primary Health Care.
13(2):141-9, 1995 Jun.
Preventing child pedestrian injury: pedestrian education or traffic
calming? Australian Journal of Public Health. 18(2):209-12, 1994 Jun.
It is very likely that continued browsing of PubMed's related records
links would have uncovered additional relevant articles, but this is a
very inefficient way to conduct a comprehensive search.
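The arithmetic behind the sample search comparison can be checked with a few lines of Python. All counts come from the figures reported above; the function and key names are hypothetical, and the percentages describe overlap between the two result lists only, not a formal measure of recall or precision.

```python
def overlap_summary(pubmed_total: int, ovid_total: int, shared: int,
                    shared_in_top: int) -> dict:
    """Derive the unique-record counts and overlap shares for two lists."""
    return {
        "pubmed_only": pubmed_total - shared,
        "ovid_only": ovid_total - shared,
        # Share of the Ovid list that the related-articles search also found.
        "ovid_share_found_by_pubmed": round(shared / ovid_total, 3),
        # Share of the shared records ranked near the top of the PubMed list.
        "shared_ranked_near_top": round(shared_in_top / shared, 3),
    }


summary = overlap_summary(pubmed_total=51, ovid_total=76, shared=15,
                          shared_in_top=14)
for name, value in summary.items():
    print(name, value)
```

The result reproduces the 36 PubMed-only and 61 Ovid-only records cited above, and shows that the related-articles list recovered roughly one fifth of the Ovid retrieval.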
The bottom line is that the perfect search is always an elusive goal, but
an experienced searcher with a maximally flexible system still has the
opportunity to create the best strategies. Having said this, the related
records algorithm in PubMed is a very useful feature that will be greatly
praised by many MEDLINE users and will meet their needs the vast
majority of the time.
Another valuable PubMed feature is its "clinical queries using research
methodology filters" option. These filters were developed by Dr. Brian
Haynes et al. at McMaster University to isolate the high-quality
research articles most useful to the practicing clinician. Four
topic areas are available--therapy, diagnosis, etiology, and
prognosis--and retrieval can be adjusted for maximum specificity or
sensitivity. A very well-organized table hyperlinked from the search page
lists the hedges that are used for each of these topic combinations.
Examining this table, however, it is not clear to me why the "etiology"
subheading is not used in the etiology filter or how specificity is
increased in the therapy, diagnosis, and etiology groups without any
mention of these concepts. However, this is not within the purview
of the PubMed creators, who should be commended for incorporating
this very useful evidence-based medicine tool.
Other useful PubMed features that will be of value to selected audiences
include:
a Journal Browser database that allows cross-searching for journal
names, title abbreviations, or ISSNs
a complete list of all journals indexed in PubMed
a listing of PubMed journals that offer full-text articles
from publishers' Web sites
a Citation Matcher that can be used to
verify citations with missing information (especially title) or for
pulling up all articles in a particular journal issue (but not in table of
contents order)
links to all NIH Clinical Alerts
A final point that is non-search related but well worth mentioning is the
potential value of PubMed as a current awareness tool. The currency of
its records compares very favorably with Current Contents (CC), and it is
not difficult to construct a search that combines specific journal titles
with a subject that is limited to the most recent records. As a test, I
checked PubMed, Ovid, and Current Contents for the currency of the
following titles: JAMA, New England Journal of Medicine, Journal of
Biological Chemistry (JBC), American Journal of Physiology, and the
Japanese Heart Journal. Ovid fared the worst, its CD-ROM version being
2 to 5 months behind. PubMed and Current Contents were
comparable. Both were within one week of the most current issue of JAMA.
PubMed had indexed and abstracted the most current issue of the New
England Journal while CC was 3 weeks behind. PubMed had abstracted, but
not yet indexed, the most current issue of JBC while CC was one month
behind. Both had the most current monthly issue of the American Journal
of Physiology. The bimonthly Japanese Heart Journal was most current in
Current Contents, Ovid having the January issue, PubMed the March issue,
and CC the May issue. It appears from this brief survey that PubMed
should be strongly considered as a comparable and probably more accessible
alternative to Current Contents for keeping up-to-date with the clinical
and basic science literature.
In summary, PubMed is a very good search system for the majority of end
users. Professional searchers should also find it useful for
verifying citations and performing high precision searches. However,
there are still a number of enhancements needed before PubMed can be
considered an Elhill replacement or before the experienced searcher
will feel confident using it to perform comprehensive searches. The most
urgent needs are the additions of the entry month field and an online
thesaurus with the capability of performing unexploded MeSH searches.
However, all of the features and built-in flexibility currently
available with Elhill and IGM should be considered for incorporation
into PubMed before command level searching is phased out. Hopefully,
this will not be an insurmountable task.
PubMed Review / Tom Emmett / temmett@iupui.edu /
Ruth Lilly Medical Library / Indiana University School of Medicine