Document Detail

Is the coverage of Google Scholar enough to be used alone for systematic reviews?
MedLine Citation:
PMID:  23302542     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
BACKGROUND: In searches for clinical trials and systematic reviews, it is said that Google Scholar (GS) should never be used in isolation, but in addition to PubMed, Cochrane, and other trusted sources of information. We therefore performed a study to assess the coverage of GS specifically for the studies included in systematic reviews and evaluate if GS was sensitive enough to be used alone for systematic reviews.
METHODS: All the original studies included in 29 systematic reviews published in the Cochrane Database Syst Rev or in the JAMA in 2009 were gathered in a gold standard database. GS was searched for all these studies one by one to assess the percentage of studies which could have been identified by searching only GS.
RESULTS: All the 738 original studies included in the gold standard database were retrieved in GS (100%).
CONCLUSION: The coverage of GS for the studies included in the systematic reviews is 100%. If the authors of the 29 systematic reviews had used only GS, no reference would have been missed. With some improvement in the research options, to increase its precision, GS could become the leading bibliographic database in medicine and could be used alone for systematic reviews.
Authors:
Jean-François Gehanno; Laetitia Rollin; Stefan Darmoni
Publication Detail:
Type:  Journal Article     Date:  2013-01-09
Journal Detail:
Title:  BMC medical informatics and decision making     Volume:  13     ISSN:  1472-6947     ISO Abbreviation:  BMC Med Inform Decis Mak     Publication Date:  2013  
Date Detail:
Created Date:  2013-01-15     Completed Date:  2013-03-29     Revised Date:  2013-07-11    
Medline Journal Info:
Nlm Unique ID:  101088682     Medline TA:  BMC Med Inform Decis Mak     Country:  England    
Other Details:
Languages:  eng     Pagination:  7     Citation Subset:  IM    
Affiliation:
Institute of Occupational Health, Rouen University Hospital and University of Rouen, 1 rue de Germont, 76000, Rouen, France. Jean-Francois.gehanno@chu-rouen.fr
MeSH Terms
Descriptor/Qualifier:
Information Storage and Retrieval / methods*
Internet*
PubMed

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): BMC Med Inform Decis Mak
Journal ID (iso-abbrev): BMC Med Inform Decis Mak
ISSN: 1472-6947
Publisher: BioMed Central
Article Information
Copyright ©2013 Gehanno et al.; licensee BioMed Central Ltd.
open-access:
Received Day: 20 Month: 8 Year: 2012
Accepted Day: 4 Month: 1 Year: 2013
collection publication date: Year: 2013
Electronic publication date: Day: 9 Month: 1 Year: 2013
Volume: 13 First Page: 7 Last Page: 7
PubMed Id: 23302542
ID: 3544576
Publisher Id: 1472-6947-13-7
DOI: 10.1186/1472-6947-13-7

Is the coverage of Google Scholar enough to be used alone for systematic reviews?
Gehanno Jean-François (1,2) Email: Jean-Francois.gehanno@chu-rouen.fr
Rollin Laetitia (1,2) Email: laetitia.rollin@chu-rouen.fr
Darmoni Stefan (2) Email: stefan.darmoni@chu-rouen.fr
(1) Institute of Occupational Health, Rouen University Hospital and University of Rouen, 1 rue de Germont, 76000, Rouen, France
(2) CISMeF-TIBS-LITIS EA 4108, Rouen University Hospital, Rouen, France

Background

The release of the beta version of Google Scholar (GS) (http://scholar.google.com) in November 2004 generated much media coverage and academic commentary. It has been met with both enthusiasm and criticism, but Google and GS now lead more visitors to many biomedical journal websites than does Medline via its PubMed interface [1-3].

GS searches retrieve results that include scholarly literature citations as well as peer-reviewed publications, theses, books, abstracts, and other articles from academic publishers, professional organizations, preprint repositories, universities, and other scholarly organizations. GS is therefore able to retrieve more types of literature than search engines dedicated to medical literature databases, such as PubMed [4]. GS is also able to identify some, but not all, of the references in PubMed [5].

Doctors are encouraged to consult GS for browsing and serendipitous discovery, not for literature reviews [1]. In searches for clinical trials and systematic reviews, it is said that GS should never be used in isolation, but in addition to PubMed, Cochrane, and other trusted sources of information [1]. Many studies have demonstrated that a single search engine does not capture all of the available articles, and using two or more databases provides greater coverage of all possible citations [6-17].

Nevertheless, the coverage of GS is increasing and, although it is said not to be exhaustive, is it exhaustive enough for the studies that are considered of sufficient quality or relevance for systematic reviews [18]?

The objective of this study was therefore to assess the coverage of GS, and its potential recall, specifically for such studies, and thus to assess whether this database could be used alone for systematic reviews.


Methods

The first step aimed to identify a subset of studies selected by experts for inclusion in systematic reviews. We searched Medline in December 2009 for the systematic reviews published in the JAMA or the Cochrane Library. For the JAMA, we used the most specific search string proposed by Montori et al., with limits for the years 2008 and 2009 [19]. For the Cochrane Library, we examined all the systematic reviews published in the Cochrane Database Syst Rev. 2009 Jul 8;(3).

We excluded the systematic reviews that used fewer than 2 bibliographic databases in their search and those that restricted the search to English-language studies.

The gold standard database was then built by gathering all the studies included in the systematic reviews we selected, excluding abstracts and personal communications. We considered gray literature (i.e. written material that is not published commercially or is not generally accessible) as a specific subset, but we included these references in the gold standard database.

GS was searched for each reference, one by one, using the title of each of the studies included in the gold standard database. The recall of GS (i.e. the proportion of studies retrieved from the database) was computed for each review published in the Cochrane Library or the JAMA.
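
The recall computation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure (the article describes manual title searches in GS); the `ReviewResult` class and the `recall` function are hypothetical names, and the two sample rows are taken from Table 1.

```python
from dataclasses import dataclass


@dataclass
class ReviewResult:
    """One systematic review and its title-search outcome (hypothetical structure)."""
    source: str        # "Cochrane Library" or "JAMA"
    included: int      # studies from this review in the gold standard database
    found_in_gs: int   # of those, how many a title search in GS retrieved


def recall(found: int, included: int) -> float:
    """Proportion of gold standard studies that were retrieved."""
    if included <= 0:
        raise ValueError("a review must include at least one study")
    if not 0 <= found <= included:
        raise ValueError("found must lie between 0 and included")
    return found / included


# Two rows from Table 1, used only as sample data.
reviews = [
    ReviewResult("Cochrane Library", included=14, found_in_gs=14),
    ReviewResult("JAMA", included=17, found_in_gs=17),
]

# Recall per review, then recall over the pooled studies.
per_review = [recall(r.found_in_gs, r.included) for r in reviews]
overall = recall(sum(r.found_in_gs for r in reviews),
                 sum(r.included for r in reviews))

print(per_review, overall)
```

Pooling the counts before dividing, rather than averaging the per-review ratios, weights each study equally, which matches the way the overall figure is reported in the Results.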


Results

Overall, 14 reviews from the Cochrane Library and 15 reviews from the JAMA were included. To identify all the possible relevant studies, each systematic review from the Cochrane Library and from the JAMA had searched between 3 and 10 (mean: 5.4) and between 2 and 9 (mean: 4) different databases, respectively. All of them searched Medline, and 17 reported having also scanned the reference lists of the studies they included.

The 29 systematic reviews had included 755 original studies. Among them, 733 were published in peer-reviewed journals and 5 were detailed only in documents belonging to the gray literature. The 18 remaining studies were referenced only as an abstract or as personal communication and were therefore not included in the gold standard database, which finally included 738 original studies. All the 738 studies were identified in GS, leading to 100% coverage.

The detailed results are presented in Table 1.

As a side result, we discovered that a striking number of bibliographic references included major errors, i.e. errors that involve the data elements by which references are searched by users in Medline [20]. Overall, 10 references contained at least one major error, some of them containing up to 3 major errors.

These citation errors were concentrated in some of the reviews. For example, among the 24 references included in the Cochrane review "The effects of antimicrobial therapy on bacterial vaginosis in non-pregnant women", 5 contained at least one major error.


Discussion

Performing systematic reviews is a complex and time-consuming task, because of the body of literature to be searched and the high number of databases that must be used, since none of them is considered exhaustive. The use of GS is increasing, as is its coverage, and we wanted to assess whether this coverage is high enough for GS to be used alone in systematic reviews.

GS retrieved 100% of the studies included in the systematic reviews we studied, which covered many different fields of medicine.

Although GS does not cover all the medical literature, we therefore observed that its coverage of the studies of sufficient quality or relevance to be included in a systematic review was complete. In other words, if the authors of these 29 systematic reviews had used only GS, they would have obtained the very same results.

The validity of our gold standard database could nevertheless be questioned. To identify the studies worth including in a systematic review, we relied on the work of the experts who acted as reviewers in the systematic reviews we included, since all of them used at least 2 independent reviewers. Furthermore, we excluded personal communications from our gold standard database, because they cannot be retrieved from any database, and abstracts, because it has been clearly demonstrated that such abstracts often display non-valid results [21,22]. Considering the methods used by the authors of the systematic reviews we selected (at least two independent reviewers to select relevant articles, a high number of databases searched, and no restriction to English-language studies), we can also assume that, for each topic covered, all the relevant studies were identified. Therefore, we can assume that our gold standard database included all the studies of sufficient quality and relevance to the topics covered by the systematic reviews, and only those.

We chose to study the systematic reviews published by the JAMA and Cochrane because they usually do not restrict their search to English literature and they use more than one database to perform the search, which is not the case for most of the systematic reviews published by the Annals of Internal Medicine, for example.

Although the recall of GS was 100%, the amount of information delivered by GS was heterogeneous. Some of the studies were only identified as "citations", which means that GS only displayed the authors, the title of the article and the name, year and pages of the journal. This can be considered insufficient, but traditional biomedical databases (such as Medline or Embase) do the same for old articles or for articles published in a language other than English. Furthermore, this is exactly the same situation as when authors of systematic reviews hand-search the reference lists of selected articles. Therefore, we considered it valid to count these hits as positive results.

This 100% coverage of GS may seem surprising, since no single database is supposed to be exhaustive, even for good quality studies. For example, the recall ratios of Medline for randomized controlled trials (RCTs) only stand between 35% and 56% [23,24]. Since GS accesses only about 1 million of the roughly 15 million records in PubMed, how can our results be explained? In fact, through agreements with publishers, GS accesses the “invisible” or “deep” Web, that is, commercial Web sites that the automated “spiders” used by search engines such as Google cannot access. Furthermore, we observed in our study that most of the articles identified by GS were found directly on the websites of the publishing journals, and not on the PubMed website.

Nevertheless, while its advantages are substantial, GS is not without flaws. The shortcomings of the system and its search interface have been well documented in the literature and include lack of reliable advanced search functions (e.g. no MeSH term subheading search function), lack of controlled vocabulary, lack of a “similar pages” feature, and issues regarding scope of coverage and currency [4,5,25]. Furthermore, whereas PubMed displays results in chronological order, GS gives more weight to articles that are cited most often. Therefore, the citations located are reportedly biased toward older literature [26,27]. This last point can also be viewed as an advantage, since it allows landmark articles, i.e. articles of importance in a field, to be identified quickly. Moreover, when comparing searches with PubMed and Google Scholar by evaluating the first 20 articles recovered for four clinical questions for relevance and quality, Nourbakhsh et al. reported that GS provided more relevant results than PubMed, although the difference was not significant (p=0.116) [28].

GS has been reported to be less precise than PubMed, since it retrieves hundreds or thousands of documents, most of them irrelevant [29,30]. Nevertheless, we should not overestimate the precision of PubMed in real life, since the precision and recall of a database search are highly dependent on the skills of the user [10]. Many users overestimate the quality of their searching performance, and experienced reference librarians typically retrieve about twice as many citations as do less experienced users [31,32].

Although this was not the purpose of our study, we tried to assess the precision of GS for some of the clinical questions that were studied by the systematic reviews.

For example, searching for "(Erythropoietin or Darbepoetin) cancer" in GS gave a recall of 100% and a precision of 0.1% (36,630 articles found, for 36 included in the systematic review). In GS, the search string "(depression treatment placebo antidepressant) ("general practice" OR "Primary care")" identified 16,100 articles, leading to a recall of 100% and a precision of 0.09% (14 articles included in the corresponding systematic review).
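
The precision figures quoted above follow from dividing the number of included studies by the number of hits, which is valid here only because recall was 100% (every included study is assumed to be among the hits). A minimal sketch of that arithmetic, with `precision` and the dictionary labels as hypothetical names:

```python
def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Share of retrieved records that are relevant."""
    if total_retrieved <= 0:
        raise ValueError("the search must return at least one record")
    return relevant_retrieved / total_retrieved


# (included studies, GS hits) for the two example searches in the text.
searches = {
    "erythropoietin or darbepoetin in cancer": (36, 36_630),
    "antidepressants in primary care": (14, 16_100),
}

for label, (relevant, retrieved) in searches.items():
    # ".2%" formats the ratio as a percentage with two decimals.
    print(f"{label}: {precision(relevant, retrieved):.2%}")
# erythropoietin or darbepoetin in cancer: 0.10%
# antidepressants in primary care: 0.09%
```

In other words, roughly one hit in a thousand was a study that the reviewers ended up including.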


Conclusion

In conclusion, the coverage of GS is much higher than previously thought for high-quality studies. GS is highly sensitive, easy to search and could be the first choice for systematic reviews or meta-analyses. It could even be used alone. It only requires some improvement in the advanced search features to improve its precision and to become the leading bibliographic database in medicine.


Competing interests

The authors declare that they have no competing interests.


Authors’ contribution

JFG conceived of the study. JFG and LR collected the data. JFG, LR and SJD analyzed the data and drafted the manuscript. All authors read and approved the final manuscript.


Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1472-6947/13/7/prepub


References
1. Giustini D. How Google is changing medicine. BMJ. 2005;331:1487-1488. doi:10.1136/bmj.331.7531.1487. PMID: 16373722.
2. Lindberg DA. Searching the medical literature. NEJM. 2006;354:2393. PMID: 16738283.
3. Wang Y, Howard P. Google Scholar Usage: An Academic Library's Experience. J Web Librarianship. 2012;6(2):94-108. doi:10.1080/19322909.2012.672067.
4. Freeman MK, Lauderdale SA, Kendrach MG, Woolley TW. Google Scholar versus PubMed in locating primary literature to answer drug-related questions. Ann Pharmacother. 2009;43:478-484. doi:10.1345/aph.1L223. PMID: 19261965.
5. Shultz M. Comparing test searches in PubMed and Google Scholar. J Med Libr Assoc. 2007;95:442-445. doi:10.3163/1536-5050.95.4.442. PMID: 17971893.
6. Wilkins T, Gillies RA, Davies K. EMBASE versus MEDLINE for family medicine searches: can MEDLINE searches find the forest or a tree? Can Fam Physician. 2005;51:849.
7. Verbeek J, Salmi J, Pasternack I, Jauhiainen M, Laamanen I, Schaafsma F, Hulshof C, van Dijk F. A search strategy for occupational health intervention studies. Occup Environ Med. 2005;62:682-687. doi:10.1136/oem.2004.019117. PMID: 16169913.
8. Minozzi S, Pistotti V, Forni M. Searching for rehabilitation articles on MEDLINE and EMBASE. An example with cross-over design. Arch Phys Med Rehabil. 2000;81:720-722. PMID: 10857512.
9. McDonald S, Taylor L, Adams C. Searching the right database. A comparison of four databases for psychiatry journals. Health Libr Rev. 1999;16:151-156. doi:10.1046/j.1365-2532.1999.00222.x. PMID: 10620849.
10. Watson RJ, Richardson PH. Identifying randomized controlled trials of cognitive therapy for depression: comparing the efficiency of Embase, Medline and PsycINFO bibliographic databases. Br J Med Psychol. 1999;72:535-542. doi:10.1348/000711299160220. PMID: 10616135.
11. Farriol M, Jordà-Olives M, Padró JB. Bibliographic information retrieval in the field of artificial nutrition. Clin Nutr. 1998;17:217-222. doi:10.1016/S0261-5614(98)80062-9. PMID: 10205345.
12. Gehanno JF, Paris C, Thirion B, Caillard JF. Assessment of bibliographic databases performance in information retrieval for occupational and environmental toxicology. Occup Environ Med. 1998;55:562-566. doi:10.1136/oem.55.8.562. PMID: 9849544.
13. Woods D, Trewheellar K. Medline and Embase complement each other in literature searches. BMJ. 1998;316:1166. PMID: 9552968.
14. Barillot MJ, Sarrut B, Doreau CG. Evaluation of drug interaction document citation in nine on-line bibliographic databases. Ann Pharmacother. 1997;31:45-49. PMID: 8997464.
15. Brazier H, Begley CM. Selecting a database for literature searches in nursing: MEDLINE or CINAHL? J Adv Nurs. 1996;24:868-875. doi:10.1046/j.1365-2648.1996.26426.x. PMID: 8894905.
16. Burnham J, Shearer B. Comparison of CINAHL, EMBASE, and MEDLINE databases for the nurse researcher. Med Ref Serv Q. 1993;12:45-57. PMID: 10132288.
17. Gallagher KE, Hulbert LA, Sullivan CP. Full-text and bibliographic database searching in the health sciences: an exploratory study comparing CCML and MEDLINE. Med Ref Serv Q. 1990;9:17-25. PMID: 10110455.
18. Beckmann M, von Wehrden H. Where you search is what you get: literature mining – Google Scholar versus Web of Science using a data set from a literature search in vegetation science. J Veg Sci. 2012;23(6):1197-1199. doi:10.1111/j.1654-1103.2012.01454.x.
19. Montori VM, Wilczynski NL, Morgan D, Haynes RB. Optimal search strategies for retrieving systematic reviews from Medline: analytical survey. BMJ. 2005;330:68. doi:10.1136/bmj.38336.804167.47. PMID: 15619601.
20. Gehanno JF, Darmoni SJ, Caillard JF. Major inaccuracies in articles citing occupational or environmental medicine papers and their implications. J Med Libr Assoc. 2005;93:118-121. PMID: 15685284.
21. Scherer RW, Langenberg P, Von Elm E. Full publication of results initially presented in abstracts. Cochrane Database Syst Rev. 2007;(2):MR000005. PMID: 17443628.
22. Rollin L, Darmoni S, Caillard J, Gehanno J. Fate of abstracts presented at an International Commission on Occupational Health (ICOH) congress - followed by publication in peer-reviewed journals? Scand J Work Environ Health. 2009;35:461-465. doi:10.5271/sjweh.1362. PMID: 19851699.
23. Türp JC, Schulte J, Antes G. Nearly half of dental randomized controlled trials published in German are not included in Medline. Eur J Oral Sci. 2002;110:405-411. doi:10.1034/j.1600-0722.2002.21343.x. PMID: 12507212.
24. Hopewell S, Clarke M, Lusher A, Lefebvre C, Westby M. A comparison of hand searching versus MEDLINE searching to identify reports of randomized controlled trials. Stat Med. 2002;21:1625-1634. doi:10.1002/sim.1191. PMID: 12111923.
25. Aguillo IF. Is Google Scholar useful for bibliometrics? A webometric analysis. Scientometrics. 2012;91:343-351. doi:10.1007/s11192-011-0582-8.
26. Henderson J. Google Scholar: a source for clinicians? CMAJ. 2005;172:1549-1550. doi:10.1503/cmaj.050404. PMID: 15939908.
27. Vine R. Google Scholar. J Med Libr Assoc. 2006;94:97-99.
28. Nourbakhsh E, Nugent R, Wang H, Cevik C, Nugent K. Medical literature searches: a comparison of PubMed and Google Scholar. Health Info Libr J. 2012;29(3):214-222. doi:10.1111/j.1471-1842.2012.00992.x. PMID: 22925384.
29. Anders ME, Evans DP. Comparison of PubMed and Google Scholar literature searches. Respir Care. 2012;55:578-583. PMID: 20420728.
30. Mastrangelo G, Fadda E, Rossi CR, Zamprogno E, Buja A, Cegolon L. Literature search on risk factors for sarcoma: PubMed and Google Scholar may be complementary sources. BMC Res Notes. 2010;3:131. doi:10.1186/1756-0500-3-131. PMID: 20459746.
31. Hersh WR, Hickam DH. How well do physicians use electronic information retrieval systems? A framework for investigation and systematic review. JAMA. 1998;280:1347-1352. doi:10.1001/jama.280.15.1347. PMID: 9794316.
32. Haynes RB, McKibbon KA, Walker CJ, Ryan N, Fitzgerald D, Ramsden MF. Online access to MEDLINE in clinical settings. A study of use and usefulness. Ann Intern Med. 1990;112:78-84. PMID: 2403476.

Tables
Table 1. Recall of Google Scholar for the 29 systematic reviews

Source of the systematic review | Title of the systematic review | Number of databases searched by the authors | Number of studies included in the review | Number of studies found in Google Scholar
Cochrane Library | Antidepressants versus placebo for depression in primary care | 8 | 14 | 14
Cochrane Library | Artemisinin-based combination therapy for treating uncomplicated malaria | 6 | 49 | 49
Cochrane Library | Brief interventions for heavy alcohol users admitted to general hospital wards | 5 | 11 | 11
Cochrane Library | Combined DTP-HBV-HIB vaccine versus separately administered DTP-HBV and HIB vaccines for primary prevention of diphtheria, tetanus, pertussis, hepatitis B and Haemophilus influenzae B (HIB) | 3 | 18 | 18
Cochrane Library | Erythropoietin or Darbepoetin for patients with cancer--meta-analysis based on individual patient data | 3 | 39 | 39
Cochrane Library | Green tea (Camellia sinensis) for the prevention of cancer | 7 | 51 | 51
Cochrane Library | Incentive spirometry for prevention of postoperative pulmonary complications in upper abdominal surgery | 5 | 11 | 11
Cochrane Library | Interventions to prevent occupational noise induced hearing loss | 10 | 20 | 20
Cochrane Library | Non-pharmacological interventions for assisting the induction of anaesthesia in children | 7 | 17 | 17
Cochrane Library | Oral iron supplementation for preventing or treating anaemia among children in malaria-endemic areas | 5 | 68 | 68
Cochrane Library | Pharmacotherapy for anxiety disorders in children and adolescents | 4 | 25 | 25
Cochrane Library | Single dose oral flurbiprofen for acute postoperative pain in adults | 4 | 11 | 11
Cochrane Library | The effects of antimicrobial therapy on bacterial vaginosis in non-pregnant women | 5 | 24 | 24
Cochrane Library | Therapeutic interventions for symptomatic treatment in Huntington’s disease | 4 | 20 | 20
JAMA | Acute-onset floaters and flashes: is this patient at risk for retinal detachment? | 2 | 17 | 17
JAMA | Adiponectin levels and risk of type 2 diabetes: a systematic review and meta-analysis | 3 | 14 | 14
JAMA | Allogeneic stem cell transplantation for acute myeloid leukemia in first complete remission: systematic review and meta-analysis of prospective clinical trials | 3 | 17 | 17
JAMA | Aspirin for the prevention of cardiovascular events in patients with peripheral artery disease: a meta-analysis of randomized trials | 4 | 15 | 15
JAMA | Bed bugs (Cimex lectularius) and clinical consequences of their bites | 2 | 49 | 49
JAMA | Cancer survivors and unemployment: a meta-analysis and meta-regression | 5 | 24 | 24
JAMA | Cardiorespiratory fitness as a quantitative predictor of all-cause mortality and cardiovascular events in healthy men and women: a meta-analysis | 2 | 32 | 32
JAMA | Combined corticosteroid and antiviral treatment for Bell palsy: a systematic review and meta-analysis | 6 | 17 | 17
JAMA | Corticosteroids in the treatment of severe sepsis and septic shock in adults: a systematic review | 4 | 19 | 19
JAMA | Diagnostic performance of computed tomography angiography in peripheral arterial disease: a systematic review and meta-analysis | 3 | 20 | 20
JAMA | Interaction between the serotonin transporter gene (5-HTTLPR), stressful life events, and risk of depression: a meta-analysis | 3 | 14 | 14
JAMA | Lipoprotein(a) concentration and the risk of coronary heart disease, stroke, and nonvascular mortality | 2 | 36 | 36
JAMA | Predictive value of factor V Leiden and Prothrombin G20210A in adults with venous thromboembolism and in family members of those with a mutation. A systematic review | 5 | 46 | 46
JAMA | Sexual abuse and lifetime diagnosis of somatic disorders: a systematic review and meta-analysis | 9 | 22 | 22
JAMA | Treatment of fibromyalgia syndrome with antidepressants: a meta-analysis | 6 | 18 | 18
Total | | | 738 | 738 (100%)
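
As a cross-check of Table 1 against the summary figures given in the Results, the per-review counts can be re-aggregated with a short script. This is a sketch for the reader, not part of the original analysis; the variable names are hypothetical, and each tuple holds (databases searched, studies included) in table order.

```python
from statistics import mean

# (databases searched, studies included) for each review, in Table 1 order.
cochrane = [(8, 14), (6, 49), (5, 11), (3, 18), (3, 39), (7, 51), (5, 11),
            (10, 20), (7, 17), (5, 68), (4, 25), (4, 11), (5, 24), (4, 20)]
jama = [(2, 17), (3, 14), (3, 17), (4, 15), (2, 49), (5, 24), (2, 32), (6, 17),
        (4, 19), (3, 20), (3, 14), (2, 36), (5, 46), (9, 22), (6, 18)]

# Size of the gold standard database: all included studies, both sources.
total_studies = sum(included for _, included in cochrane + jama)

# Mean number of databases searched per review, by source.
mean_db_cochrane = mean(databases for databases, _ in cochrane)
mean_db_jama = mean(databases for databases, _ in jama)

print(len(cochrane), len(jama), total_studies)
print(f"{mean_db_cochrane:.1f} {mean_db_jama:.1f}")
```

The totals reproduce the 14 and 15 reviews, the 738 studies, and means of about 5.4 and 3.9 databases per review, the latter rounding to the value of 4 quoted in the Results.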


Article Categories:
  • Research Article

Keywords: Bibliometrics, Google scholar, Information retrieval methods, Systematic reviews.
