Ruth Lilly Medical Library 975 West Walnut Street Reference Phone (317) 274-7185
Indiana University School of Medicine Indianapolis, IN 46202-5121 Reference Fax (317) 274-4056

PubMed Review

by
Thomas W. Emmett, MD, MLS


PubMed is one of the National Library of Medicine's free Web-based MEDLINE alternatives and is viewed as an eventual replacement for Elhill command language searching. This review examines many of PubMed's search features, their advantages and disadvantages, and PubMed's suitability to replace Elhill. Non-search features and comparisons with other MEDLINE products will only briefly be addressed.

PubMed is available from the National Center for Biotechnology Information's Web site (www.ncbi.nlm.nih.gov/PubMed) and provides access to MEDLINE, PreMEDLINE, and the molecular biology databases included in NCBI's Entrez retrieval system. One of the major enhancements of PubMed over Internet Grateful Med (IGM) is the linkage of MEDLINE and PreMEDLINE records to medical publishers' Web sites in order to provide access to the full text of journal articles. The references in these articles would in turn link back to PubMed records to complete the circle. To date, there are links to 72 full-text journals. However, only 10 are freely available in their entirety; two others are available selectively, and the other 60 require a subscription. PubMed will include records for all articles in participating publishers' journals, even when a journal is only selectively indexed by NLM and some of its records are never included in MEDLINE.

Basic Searching

The PubMed home page features the basic search, which allows for word searching in all available fields (including MeSH). Words are automatically ANDed together and can include authors' last names and initials, full journal titles, title abbreviations, or ISSNs. A limited number of phrases are included in the basic index and can be searched by using quotes if the initial retrieval is suboptimal. Sample phrases include "rheumatoid arthritis in children" and "cystic fibrosis of the pancreas". Even some very unusual combinations like "rare are" and "rare probably", which contain stopwords, are included. A nonintuitive syntax must be used with phrases containing apostrophes because apostrophes are replaced by spaces in the index. Therefore, "Bell's palsy" must be searched as "bell s palsy" in order for records containing this phrase to be retrieved. It should be kept in mind that even though many phrases are indexed, this is not true adjacency searching and will result in no documents found for most multi-word phrases. Right-handed truncation with an asterisk is supported.
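The apostrophe rule above can be sketched in a few lines of Python. This is only an illustration of the behavior described in this review, not PubMed's own code; the function name to_index_phrase is hypothetical, and lowercasing is included simply to match the "bell s palsy" form quoted above.

```python
def to_index_phrase(phrase: str) -> str:
    """Return a quoted phrase with apostrophes turned into spaces.

    Assumption: the index stores a space wherever the original text had an
    apostrophe, as described in the review. The name is hypothetical.
    """
    # Replace straight and curly apostrophes with a space, then lowercase.
    normalized = phrase.replace("'", " ").replace("\u2019", " ").lower()
    # Collapse any repeated whitespace left behind by the replacement.
    normalized = " ".join(normalized.split())
    return f'"{normalized}"'


print(to_index_phrase("Bell's palsy"))  # prints "bell s palsy" (with the quotes)
```

A searcher could run a phrase through a helper like this before pasting it into the basic search box, rather than remembering the rule each time.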

Searchers who wish to field-tag every word or phrase can construct a very complex single-statement search on the basic search screen. However, this is very tedious, and the advanced search offers a slightly easier alternative. A field-tagged search also requires that the Boolean operators be in capital letters, an important point not mentioned on the Help page. This appears to be the only place where case sensitivity is important.

Advanced Searching

The advanced search mode, a link off the PubMed home page, includes all of the basic features noted above, and, in addition, offers the capability of specifying a field for each word or phrase searched. A search mode must also be selected, either automatic or list terms. Automatic mode proceeds directly to the search, while List Terms presents an index of the selected field at the point where the first search term would fit alphabetically. After selecting one or more terms from this list, the search is executed. However, rather than going directly to the results screen, an intermediate screen is presented that allows for modifications. This page is laid out in three sections. At the top is a current query line where search terms have been ANDed together by default and a button leads directly to the retrieved records. A middle section, add terms to query, is the same as the original search screen. The bottom section, modify current query, keeps a history of all previous search statements, similar to the Ovid main search screen, and allows the default AND operator to be overridden by OR, BUTNOT, or a range.

The advanced search mode offers most of the features needed by the advanced searcher, but there is still room for improvement, particularly in the areas of field searching flexibility and Web page design. First of all, not all MEDLINE fields available in Elhill are available in PubMed. Among the missing fields are those for journal subsets, country of publication, date of entry, entry month, and personal name as subject. The absence of an entry month field makes it impossible to update searches from another vendor such as Ovid, and, from my point of view, this field is the single most important feature that should be added to future PubMed versions. There is no separate subheading field in PubMed, but all subheadings are searchable from the MeSH field and can be either free-floating or attached directly to specific headings.

The typical search limits, like those available in Internet Grateful Med (IGM) and Ovid, are not as easy to use in PubMed because each limit must first be searched in its own field and then ANDed to the other statements (e.g., English must be searched in the language field and human in the MeSH field). The only exception to this is the publication date limit, which allows one to limit output to the most recent 30, 60, 90, or 180 days or 1, 2, 5, or 10 years. However, these intervals are measured from the current calendar date, and so a limit to the past year would retrieve records from mid-1996 to mid-1997. The only way to limit to a particular year is to enter it into the publication date field and AND it to the other search statements. Ranging is even more complex. One must either enter the beginning and ending years separately and use the RANGE operator to combine them or enter directly a field-tagged search statement ("1993" THRU "1997" [pdat]). It would be a welcome improvement if these options were explained in more detail on the Help page. Even better would be a change in the publication date limit feature so that publication months and years could be used rather than the most recent options. This would then correspond to date limits traditionally used by database vendors and to the way searches are requested by all of my clients and probably those of other searchers as well.
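The difference between a fixed year range and the built-in rolling limit can be made concrete with a short Python sketch. The field-tagged string follows the form quoted above; the function names, the search date of July 1, 1997, and the treatment of "1 year" as 365 days are all assumptions made for illustration, since PubMed's exact calculation is not documented here.

```python
from datetime import date, timedelta


def pdat_range(start_year: int, end_year: int) -> str:
    """Build a field-tagged publication date range in the form quoted in the text."""
    if start_year > end_year:
        raise ValueError("start_year must not be later than end_year")
    return f'"{start_year}" THRU "{end_year}" [pdat]'


def rolling_window(today: date, days: int):
    """Return (earliest, latest) dates covered by a 'most recent N days' limit.

    Assumption: the limit simply counts back N days from the search date.
    """
    return today - timedelta(days=days), today


print(pdat_range(1993, 1997))  # "1993" THRU "1997" [pdat]

# Hypothetical search date in mid-1997; the review does not give the exact day.
earliest, latest = rolling_window(date(1997, 7, 1), 365)
print(earliest.isoformat(), "to", latest.isoformat())  # 1996-07-01 to 1997-07-01
```

The second result shows why the rolling limit cannot stand in for a calendar-year limit: the window straddles two publication years.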

Medical subject heading (MeSH) searching is available in either the Automatic or List Terms mode. As with Elhill, Automatic mode maps from the see references in the MeSH thesaurus to the correct subject headings. However, a large number of inverted headings do not have see references, and so use of the natural language phrase, like "rheumatoid arthritis" for "arthritis, rheumatoid", will result in zero retrieval. In List Terms mode, there is no mapping at all. The first word of the desired heading must be entered correctly so that the list box of MeSH terms begins at the proper alphabetical location. One has a choice to search with MeSH headings that are restricted to focus or those that are not, but all headings in PubMed are automatically exploded, as they are in IGM. There is no way to search a single heading by itself if it has narrower terms associated with it (although "all fields" and "textword" searching will search heading words in addition to other fields). Unlike headings, subheadings are not automatically exploded. There is no tree structure thesaurus or permuted index in PubMed, an inconvenience for advanced searchers. By comparison, IGM includes the tree structures and Ovid includes both.
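For readers unfamiliar with explosion, the following Python sketch shows what an automatically exploded heading search amounts to. The small hierarchy is a toy example for illustration and should not be read as the actual MeSH tree structures; the names NARROWER and explode are hypothetical.

```python
# Toy hierarchy: each heading maps to its directly narrower headings.
# Illustrative only; this is not a copy of the MeSH tree structures.
NARROWER = {
    "arthritis": ["arthritis, rheumatoid", "osteoarthritis"],
    "arthritis, rheumatoid": ["arthritis, juvenile rheumatoid"],
}


def explode(heading, narrower=NARROWER):
    """Return the heading followed by every narrower heading beneath it."""
    result = [heading]
    for child in narrower.get(heading, []):
        # Recurse so that narrower terms of narrower terms are included too.
        result.extend(explode(child, narrower))
    return result


print(explode("arthritis"))
# ['arthritis', 'arthritis, rheumatoid', 'arthritis, juvenile rheumatoid', 'osteoarthritis']
```

An unexploded search would use only the first element of that list, which is the option the review finds missing from PubMed.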

Other search features that would be welcome enhancements to PubMed include:

  1. True adjacency searching in titles and abstracts
  2. Search statement numbering in the modify current query box so that nested statements could be used to improve efficiency in complex searching
  3. Greater flexibility in specifying fields to be displayed, printed or saved and the option to include the search strategy in printed/saved results
  4. Publication date limits that can be applied to the retrieval of related records, not just the initial search results
  5. The option to delete statements from the search history and save search strategies

For first-time users of the advanced search, understanding and negotiating the several Web page layouts takes some time. This is probably unavoidable because a large number of buttons and boxes are contained on a single page that cannot be completely viewed without scrolling. However, addressing the following points on the Help pages would make the job much easier:

  1. In List Terms mode, if it is decided not to select a term from the list box, the only way to continue the search is to change back to Automatic mode; otherwise clicking on the search button will continue to repeat the same screen.
  2. The modify current query box at the bottom of the Web page offers a great deal of flexibility in combining sets, but users should be reminded that the control key can be used to select more than one term, and they need not be adjacent.
  3. When browsing, selecting, and printing records it is best to set the display number high so that all of the records in a given set will be displayed on the same page. This minimizes the number of Web pages that must be retrieved and makes it much easier to keep your place.

The best feature of PubMed is probably its links to additional citations most closely related to selected records. This is a user-friendly option that is accomplished by "comparing text and MeSH terms of each article using a powerful algorithm" and is not available with Elhill, Ovid, or IGM. The details of the algorithm are not specified, but hopefully this will be remedied in future versions. Presumably frequently-used MeSH terms and textwords are mixed and matched to create a ranked list of related records. There is no mention that the controlled vocabulary is supplemented or expanded with additional terms. This algorithm is analogous to what an experienced searcher might do to formulate a comprehensive search strategy, but it is not yet a replacement for the professional.

As a sample search, I checked PubMed for records dealing with the prevention of pedestrian accidents, selecting a record with all three concepts in the title (Prevention of pedestrian accidents, Arch Dis Child 68(5):669-672, 1993) as the basis for a "related articles" search. I limited the output to the years 1993-1995 so that database currency would not be a confounding issue and retrieved 51 records. An Ovid search on the same subject yielded 76 records, 15 appearing on both lists. There was very good relevance ranking with PubMed because 14 of the 15 duplicate records were in the top 21 of the PubMed list and only one was in the bottom 30. The 36 PubMed records not retrieved by the Ovid search were missing either the "pedestrian" or the "prevention" concept and appeared not to be relevant. Conversely, of the 61 Ovid records not on the PubMed list, there were quite a few that appeared relevant, including two with the major concepts in the title:
  1. The Harstad injury prevention study: hospital-based injury recording used for outcome evaluation of community-based prevention of bicyclist and pedestrian injury. Scandinavian Journal of Primary Health Care. 13(2):141-9, 1995 Jun.
  2. Preventing child pedestrian injury: pedestrian education or traffic calming? Australian Journal of Public Health. 18(2):209-12, 1994 Jun.

It is very likely that continued browsing of PubMed's related records links would have uncovered additional relevant articles, but this is a very inefficient way to conduct a comprehensive search.
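The counts in the sample search follow from simple set arithmetic, sketched below in Python. Only the three input numbers (51, 76, and 15) come from the comparison above; the function name and the derived percentages are illustrative, and the shares measure overlap between the two lists, not formally judged relevance.

```python
def overlap_summary(a_total: int, b_total: int, shared: int) -> dict:
    """Summarize how two result lists overlap, given their sizes and shared count."""
    return {
        "a_only": a_total - shared,              # records unique to list A
        "b_only": b_total - shared,              # records unique to list B
        "combined": a_total + b_total - shared,  # distinct records overall
        "share_of_a": shared / a_total,          # fraction of A also found in B
        "share_of_b": shared / b_total,          # fraction of B also found in A
    }


# A = PubMed related-articles list, B = Ovid search, as reported above.
summary = overlap_summary(51, 76, 15)
print(summary["a_only"], summary["b_only"], summary["combined"])  # 36 61 112
print(round(summary["share_of_a"], 3), round(summary["share_of_b"], 3))  # 0.294 0.197
```

Fewer than one in three PubMed records and about one in five Ovid records appeared on both lists, which is why neither search alone can be treated as comprehensive.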

The bottom line is that the perfect search is always an elusive goal, but an experienced searcher with a maximally flexible system still has the opportunity to create the best strategies. Having said this, the related records algorithm in PubMed is a very useful feature that will be greatly praised by many MEDLINE users and will meet their needs the vast majority of the time.

Another valuable PubMed feature is its "clinical queries using research methodology filters" option. These filters were developed by Dr. Brian Haynes et al. at McMaster University in order to isolate the high quality research articles most useful to the practicing clinician. Four topic areas are available--therapy, diagnosis, etiology, and prognosis--and retrieval can be adjusted for maximum specificity or sensitivity. A very well-organized table hyperlinked from the search page lists the hedges that are used for each of these topic combinations. Examining this table, however, I am not clear as to why the "etiology" subheading is not used in the etiology filter or how specificity is increased in the therapy, diagnosis, and etiology groups without any mention of these concepts. However, this is not under the purview of the PubMed creators, who should be commended for incorporating this very useful evidence-based medicine tool.

Other useful PubMed features that will be of value to selected audiences include:

  1. a Journal Browser database that allows cross-searching for journal names, title abbreviations, or ISSN numbers
  2. a complete list of all journals indexed in PubMed
  3. a listing of PubMed journals that offer full text articles from publishers' Web sites
  4. a Citation Matcher that can be used to verify citations with missing information (especially title) or for pulling up all articles in a particular journal issue (but not in table of contents order)
  5. links to all NIH Clinical Alerts

A final point that is non-search related but well worth mentioning is the potential value of PubMed as a current awareness tool. The currency of its records compares very favorably with Current Contents (CC), and it is not difficult to construct a search that combines specific journal titles with a subject that is limited to the most recent records. As a test, I checked PubMed, Ovid, and Current Contents for the currency of the following titles: JAMA, New England Journal of Medicine, Journal of Biological Chemistry (JBC), American Journal of Physiology, and the Japanese Heart Journal. Ovid fared the worst, the CD-ROM version being from 2 to 5 months behind. PubMed and Current Contents were very comparable. Both were within one week of the most current issue of JAMA. PubMed had indexed and abstracted the most current issue of the New England Journal while CC was 3 weeks behind. PubMed had abstracted, but not yet indexed, the most current issue of JBC while CC was one month behind. Both had the most current monthly issue of the American Journal of Physiology. The bimonthly Japanese Heart Journal was most current in Current Contents, Ovid having the January issue, PubMed the March issue, and CC the May issue. It appears from this brief survey that PubMed should be strongly considered a comparable and probably more accessible alternative to Current Contents for keeping up-to-date with the clinical and basic science literature.

In summary, PubMed is a very good search system for the majority of end users. Professional searchers should also find it useful for verifying citations and performing high precision searches. However, there are still a number of enhancements needed before PubMed can be considered an Elhill replacement or before the experienced searcher will feel confident using it to perform comprehensive searches. The most urgent needs are the addition of the entry month field and an online thesaurus with the capability of performing unexploded MeSH searches. However, all of the features and built-in flexibility currently available with Elhill and IGM should be considered for incorporation into PubMed before command level searching is phased out. Hopefully, this will not be an insurmountable task.

PubMed Review/ Tom Emmett / temmett@iupui.edu / Ruth Lilly Medical Library / Indiana University School of Medicine

URL: http://www.medlib.iupui.edu/ref/pubmed.html