Tuesday, July 31, 2012

Searching for review articles, literature reviews and more in Summon & Google Scholar

I find that students writing theses and dissertations love to look at past theses and dissertations, because they instinctively know the literature review chapter is a virtual goldmine of resources that lets them jump-start their research.

What they might not know about is the existence of survey papers, review papers/articles and the like, which are often peer-reviewed articles but contain no original research beyond a review, and perhaps a critique, of the state-of-the-art research in a certain area.

Somewhat related are bibliographies, meta-analyses and systematic reviews. The last two, systematic reviews in particular, are of course well known in the medical field. But let's leave out the medical field for now.

The question is how does one find such papers? Besides looking at publications like Annual Reviews Publications, there are generally two approaches to searching for them.

1. By Facet/Subject heading control

Firstly, some databases have facets that let you select for them: search for your keyword, then refine down using the facet. Some examples:

i) Document type : Reviews - in Web of Science
ii) Document type : Review - in Scopus
iii) Document type : Literature Review - in Ebscohost
iv) Publication type : Meta-analysis/reviews/systematic reviews - in Pubmed (but more on this later)

Document type facet in Web of Science

It is unclear to me how good these facets are in terms of precision and recall for finding such articles. I suppose it depends on the quality of the indexing, and I am assuming the document type is a controlled term.

This approach is also available in Summon, but it is a lot less effective. You occasionally see the subject terms facet appear with values like
  • literature review 
  • meta-analysis
  • metaanalysis
  • review
  • reviews
  • bibliographie
  • bibliography
  • systematic review
Unfortunately you can't count on it appearing, for two reasons. Firstly, by default Summon lists, I believe, only the 100 most common subject terms in the results, so if you have a lot of results such subject terms won't appear because they aren't numerous enough.

You can get around this problem by forcing a search in Summon with field searching on subject terms plus the keyword you want. In Summon syntax you use subjectterms:("xxxxx"), where xxxxx is the subject term, to match an item with that subject term.

So for example you could do

"Class size" AND (subjectterms:("literature review") OR subjectterms:("bibliographie") OR subjectterms:("bibliography") OR subjectterms:("meta-analysis") OR subjectterms:("metaanalysis") OR subjectterms:("review") OR subjectterms:("reviews") OR subjectterms:("systematic review") OR subjectterms:("bibliographic literature") OR subjectterms:("bibliographical literature"))

This finds articles that contain the phrase "class size" (restricting this to just the title might be even more relevant) and that have any of the subject terms listed. Try the example search using Princeton's Summon.

A minor weakness of this search is that items with "book review" subject terms will appear, but these can easily be excluded after the search using the facet.

The main weakness of this approach was already hinted at by the fact that you had to search for both meta-analysis and metaanalysis as subject terms. The reason is that the subject term facet is not controlled, as it comes from items from varying sources with different standards. I have even seen journal articles that do not have any subject terms.
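Typing that query out by hand is error-prone, so here is a minimal Python sketch that assembles the same string from a list of subject terms. The function name `summon_subject_query` and the constant `REVIEW_SUBJECT_TERMS` are my own inventions for illustration; the only Summon-specific thing it relies on is the subjectterms:("...") syntax shown above. It just builds text to paste into the search box and does not talk to Summon.

```python
def summon_subject_query(keyword_phrase, subject_terms):
    """Build a Summon query: a quoted phrase AND any one of the subject terms."""
    # One subjectterms:("...") clause per term, ORed together.
    clauses = ['subjectterms:("{}")'.format(term) for term in subject_terms]
    return '"{}" AND ({})'.format(keyword_phrase, " OR ".join(clauses))


# The subject terms used in the example query above.
REVIEW_SUBJECT_TERMS = [
    "literature review", "bibliographie", "bibliography",
    "meta-analysis", "metaanalysis", "review", "reviews",
    "systematic review", "bibliographic literature",
    "bibliographical literature",
]

query = summon_subject_query("Class size", REVIEW_SUBJECT_TERMS)
print(query)
```

Keeping the terms in a list makes it easy to add another spelling variant when you spot one in the facet.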

EDIT: It seems that in Summon subjectterms:("library") doesn't give you an exact search for items whose subject term is exactly library; you also get items with subject terms like Science library or library & librarians. This is different from using the widget builder from Serials Solutions, which gives you exact matches. Unfortunately the widget builder only lists a small subset of possible subject terms.

That said, at least Summon has a subject term facet; you can't do this in Google Scholar even if you wanted to. So what can you do instead if such an option is not available and/or you don't trust the metadata you would be refining on?
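If you have exported the results, one way to work around both the loose matching and the spelling variants is to post-filter locally. The sketch below is only an illustration: it assumes each record is a plain dict with a `subject_terms` list, which is a made-up structure and not Summon's actual export format, and the three records are invented.

```python
import re


def normalize_term(term):
    """Lowercase and drop spaces/hyphens so 'Meta-Analysis' equals 'metaanalysis'."""
    return re.sub(r"[\s\-]+", "", term.lower())


def has_exact_subject(record, wanted_terms):
    """True if any subject term of the record equals a wanted term after normalizing."""
    wanted = {normalize_term(t) for t in wanted_terms}
    return any(normalize_term(t) in wanted for t in record.get("subject_terms", []))


# Invented sample records, purely for illustration.
records = [
    {"title": "A", "subject_terms": ["Science library"]},
    {"title": "B", "subject_terms": ["Library"]},
    {"title": "C", "subject_terms": ["Meta-Analysis", "Class size"]},
]

exact_library = [r["title"] for r in records if has_exact_subject(r, ["library"])]
meta = [r["title"] for r in records if has_exact_subject(r, ["metaanalysis"])]
print(exact_library)  # ['B']
print(meta)           # ['C']
```

Record A is dropped because "Science library" is not exactly "library", while record C is kept even though its subject term is spelled with a hyphen.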

2. Matching phrases in titles, abstracts, and other important parts of the journal article

If you look through many review articles you will notice a pattern: they tend to have very similar titles, something along the lines of

  • XXXX, a review of the literature
  • A survey of the literature on XXXX
  • XXXX, a survey of the literature

and many more.

The obvious idea here is to generate a complicated search strategy to match as many such titles as possible.

For example here's a fairly complicated string I am toying with to match in the title field in Summon or Google Scholar.

((literature AND (survey OR review)) OR "systematic review" OR meta-analysis OR meta-analytical OR "a review" OR "a survey" OR "a mini review" OR "a brief review" OR bibliography OR bibliographic)

Phrase | To match | Comment
literature review (no quotes) | a literature review, a review of the literature | Probably most accurate
"Systematic review" | systematic review | Not needed for non-medical?
meta-analysis OR meta-analytical | meta-analytical analysis | Hyphens may make a difference, use * if it works
"A review" | A review | May lead to false positives of the nature "A review of" etc. Some databases allow you to filter.
"A survey" | A survey | May lead to false positives of a real survey, not a survey of literature
literature survey (no quotes) | survey of literature |
"A brief review" or "A mini review" | |
bibliography OR bibliographic | bibliographic survey, a select bibliography |
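The same phrase list can also be applied offline, for example to a list of titles you have already exported. Here is a rough Python sketch that turns the table into regular expressions. The pattern list and the name `looks_like_review` are mine, it is a local approximation rather than what Summon or Google Scholar do internally, and it inherits the same false positives noted in the table.

```python
import re

# One pattern per row of the table above (a local approximation only).
REVIEW_TITLE_PATTERNS = [
    # "literature" and "survey"/"review" in either order (unquoted search)
    r"\bliterature\b.*\b(?:survey|review)\b|\b(?:survey|review)\b.*\bliterature\b",
    r"\bsystematic review\b",
    # hyphen optional: meta-analysis, metaanalysis, meta-analytical
    r"\bmeta-?analy(?:sis|ses|tic|tical)\b",
    # "a review", "a brief review", "a mini review", "a mini-review"
    r"\ba (?:brief |mini[ -]?)?review\b",
    r"\ba survey\b",
    r"\bbibliograph(?:y|ic)\b",
]
REVIEW_TITLE_RE = re.compile(
    "|".join("(?:%s)" % p for p in REVIEW_TITLE_PATTERNS), re.IGNORECASE
)


def looks_like_review(title):
    """True if the title matches any of the review-style phrases."""
    return REVIEW_TITLE_RE.search(title) is not None


for title in [
    "Class size, a review of the literature",
    "Class size and achievement: a meta-analysis",
    "Effects of class size on reading scores",
    "A survey of teachers' attitudes to class size",  # real survey: false positive
]:
    print(looks_like_review(title), title)
```

The last title shows the weakness of "a survey" from the table: it is flagged even though it is a real survey and not a survey of the literature.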


Here's a sample search in Summon

And some search results

The phrase meta-analysis is carrying a lot of the load due to the nature of this topic, but further down you can see other matches.

I tried it with other topics and it seemed to be reasonably robust, and in my subjective view it outperformed similar searches by facet refinement. In fact the equivalent search strategy in Scopus, in my view, gave better results in terms of recall than using the facet document type: review in Scopus. More testing is needed of course.

A similar search in Google Scholar, using intitle:(xxxx) to match phrases in the title, also works well.

I was feeling somewhat pleased with myself until, while working through a Pubmed tutorial, I stumbled upon the fact that such strategies are old hat in the medical community.
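For those who like to bookmark such searches, here is a small Python sketch that builds a Google Scholar search URL from a topic phrase and a title expression. Google Scholar has no official API that I know of, so this only constructs a link to open in a browser; the function name `scholar_title_search_url` is made up, and the intitle:(...) grouping is simply the syntax described above, not something I can promise Google will always honor.

```python
from urllib.parse import urlencode


def scholar_title_search_url(topic_phrase, title_expression):
    """Return a Google Scholar URL: quoted topic phrase plus intitle:(...) clause."""
    query = '"{}" intitle:({})'.format(topic_phrase, title_expression)
    # urlencode takes care of escaping quotes, spaces and parentheses.
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


url = scholar_title_search_url(
    "class size", '"literature review" OR meta-analysis OR "a survey"'
)
print(url)
```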

Correct me if I am wrong, any medical librarians reading this, but while Pubmed has publication types for meta-analysis and reviews, it doesn't actually assign a publication type to systematic reviews.

This is counter-intuitive for many reasons, one of which is that one of the filters under Article types is indeed systematic reviews; if the rest are publication types, shouldn't that be one as well?

But a close reading shows that while almost all the values in the Article types filter are publication types, systematic review itself is not.

Here's what it says about Article Types.

In fact, this filter (which you can access via clinical query as well) is a search strategy.

Surprising, isn't it? We are talking about PubMed here, where Medline articles are carefully indexed using the most specific MeSH headings and searches auto-explode MeSH headings.

But it seems that while "Review" and "Meta-Analysis" are directly indexed/assigned, systematic review isn't.

In short, when you select Systematic review in the limiter it is doing XXXXX AND systematic[sb].

For those not into Pubmed, this is a subset [sb] search for systematic reviews, and the subset is defined by the predefined saved search strategy.
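To make the "XXXXX AND systematic[sb]" idea concrete, here is a short Python sketch that builds the equivalent request for NCBI's E-utilities ESearch service. It only constructs the URL and does not send it; the function name `systematic_review_search_url` and the default of 20 results are my own choices for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def systematic_review_search_url(topic, max_results=20):
    """Build an ESearch URL for: (topic) AND systematic[sb]."""
    term = "({}) AND systematic[sb]".format(topic)
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": term,           # topic combined with the systematic review subset
        "retmax": max_results,  # how many PMIDs to return
        "retmode": "json",      # ask for JSON rather than XML
    }
    return ESEARCH_URL + "?" + urlencode(params)


print(systematic_review_search_url("class size"))
```

Wrapping the topic in parentheses keeps any OR inside your own search from being combined wrongly with the subset filter.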

Very impressive indeed, makes my search string above look like child's play.

Further digging shows that people actually write papers on figuring out the best search strategy to pull out such papers, measuring both precision and recall of each search strategy against a certain known test bed.
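For anyone unfamiliar with the two measures, precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. A minimal Python sketch of the calculation follows; the article IDs are invented and only stand in for a known test bed.

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) of a search strategy against a known test bed."""
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    true_positives = len(retrieved & relevant)
    # Guard against division by zero when either set is empty.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Invented IDs: four known review articles, and three hits from a strategy.
test_bed = {"a1", "a2", "a3", "a4"}
strategy_hits = {"a1", "a2", "x9"}

precision, recall = precision_recall(strategy_hits, test_bed)
print(precision, recall)
```

Here two of the three hits are real review articles (precision of two thirds) and two of the four known reviews were found (recall of one half).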

Pubmed credits the current search strategy mostly to this paper:

Taking advantage of the explosion of systematic reviews: an efficient MEDLINE search strategy. Eff Clin Pract. 2001 Jul-Aug;4(4):157-62. [PMID: 11525102].

It's a free paper worth a read. I couldn't locate any non-medical papers devoted to this area though (e.g. finding survey/review articles/papers), though I think some might exist. (Is there a review article on this topic?)

To be fair my search strategy is a lot simpler than PubMed's because it was designed for Summon and Google Scholar.

The Pubmed search above has the advantage of matching on very fine-grained fields, not just the title [ti].

This allows all kinds of sophisticated search strategies to maximize recall and precision. With Google Scholar all you can do is match on title.

Summon is a little better: you can match on subject terms (as seen above, and you can OR that with the title matches) and increase precision by excluding facet values such as book reviews under content type (these tend to appear as false positives).

But you can't restrict matches to just abstract for example, so there is less room for sophisticated searches.


I wonder if it is worth doing such complicated searches, as Summon and Google Scholar really aren't meant for such complicated logic searches and weird results may occur.

It is also unlikely users will bother with such complicated searches.

Personally I like Pubmed's idea of using a saved search strategy as a filter/facet. Perhaps Summon could implement a similar idea: a filter that, when clicked, applies the saved search strategy to the current search and returns the matching subset of results.

So perhaps a new content type, or a new option under "Refine", could be added that does this automatically.

The challenge of course is that you need to do a lot of testing to come up with a robust search strategy that works reasonably well for all fields, a taller order than for just one field, but arguably even a half-working one could be worth having.

Also, this strategy only works if there are such articles to be found. Pair it with too broad or too specific a keyword and you will end up with nothing relevant. You need to search at the level of granularity where a survey/review article is likely to exist.

Beyond even that we can look at Primo's ScholarRank technology which is supposed to be able to tell the difference between broad and narrow topic searches and for the former tries to rank review articles higher, which makes sense since if you are new to the field you are likely to benefit from such articles.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.