Laboratory-guided detection of disease outbreaks: three generations of surveillance systems.
Abstract: Context.--Traditional biothreat surveillance systems are vulnerable to incomplete and delayed reporting of public health threats.

Objective.--To review current and emerging approaches to detection and monitoring of biothreats enabled by laboratory methods of diagnosis and to identify trends in the biosurveillance research.

Data Sources.--PubMed (1995 to December 2007) was searched with the combined search terms "surveillance" and "infectious diseases." Additional articles were identified by hand searching the bibliographies of selected papers. Additional search terms were "public health," "disease monitoring," "cluster," "outbreak," "laboratory notification," "molecular," "detection," "evaluation," "genomics," "communicable diseases," "geographic information systems," "bioterrorism," "genotyping," and "informatics." Publication language was restricted to English. The bibliographies of key references were later hand searched to identify articles missing in the database search. Three approaches to infectious disease surveillance that involve clinical laboratories are contrasted: (1) laboratory-initiated infectious disease notifications, (2) syndromic surveillance based on health indicators, and (3) genotyping-based surveillance of biothreats. Advances in molecular diagnostics enable rapid genotyping of biothreats and investigations of genes that were not previously identifiable by traditional methods. There is a need for coordination between syndromic and laboratory-based surveillance. Insufficient and delayed decision support and inadequate integration of surveillance signals into action plans remain the 2 main barriers to efficient public health monitoring and response. Decision support for public health users of biosurveillance alerts is often lacking.

Conclusions.--The merger of the 3 scientific fields of surveillance, genomics, and informatics offers an opportunity for the development of effective and rapid biosurveillance methods and tools.

(Arch Pathol Lab Med. 2009;133:916-925)
Article Type: Report
Subject: Epidemics (Canada)
Epidemics (United States)
Epidemics (Surveys)
Communicable diseases (Surveys)
Communicable diseases (Diagnosis)
Bioterrorism (Surveys)
Molecular biology (Innovations)
Molecular biology (Usage)
Authors: Sintchenko, Vitali
Gallego, Blanca
Pub Date: 06/01/2009
Publication: Name: Archives of Pathology & Laboratory Medicine Publisher: College of American Pathologists Audience: Academic; Professional Format: Magazine/Journal Subject: Health Copyright: COPYRIGHT 2009 College of American Pathologists ISSN: 1543-2165
Issue: Date: June, 2009 Source Volume: 133 Source Issue: 6
Product: Product Code: 8521213 Molecular Biology NAICS Code: 54171 Research and Development in the Physical, Engineering, and Life Sciences
Geographic: Geographic Scope: Canada; United States Geographic Code: 1CANA Canada; 1USA United States
Accession Number: 230152020
Full Text: Biosurveillance has been made a health care priority because of rising concerns over emerging infectious diseases and possible bioterrorism. (1,2) The number of microbial threats--in the form of newly identified pathogens, infections crossing the species barrier to people, diseases and vectors adapting to new environments, and microbial agents appearing in more virulent forms--has multiplied to an unprecedented degree. (2,3) New and newly recognized infectious diseases are now being reported at the rate of about 1 per year. At least 33 completely new pathogens, including HIV and severe acute respiratory syndrome, have emerged during the past 3 decades. (3) In addition, the epidemiology of well-known infectious diseases has been changing because of the globalization of trade and in response to immunization campaigns. This changing epidemiology presents new challenges to countries, both in terms of the understanding and monitoring of determinants of infections and in terms of service provision and the implementation of appropriate prevention measures.

Traditional biothreat surveillance systems are vulnerable to the incomplete and delayed reporting of public health threats. (4) Recent outbreaks of reemerging and new communicable diseases have highlighted inefficiencies in public health monitoring and control systems. Specifically, many outbreaks have been characterized by delayed recognition and/or public health response. (5,6) For example, analysis of 51 outbreaks reported in the United States between 1999 and 2000 demonstrated that only 42% were detected within 1 week of the first case, and 29% were identified after a month or more. (7) Such delays diminish the window of opportunity to mount effective response measures and are likely to be costly to society. (8) It was estimated that in Canada, a 1-week delay in the implementation of control measures for severe acute respiratory syndrome resulted in a 2.6-fold increase in the mean epidemic size and a 4-week extension of the mean epidemic duration. (9) Deaths from anthrax would be expected to double if the detection delay for an attack increased from 2 to 4.8 days. (10) On the other hand, the rapid identification of outbreaks and implementation of control measures have been crucial in limiting the impact of epidemics, both in preventing more casualties and in shortening the period during which the stringent control measures were needed. (10-13)


The aim of this paper was to review current and emerging approaches to the detection and monitoring of biothreats through laboratory methods of diagnosis and to identify trends in biosurveillance research.


Three approaches to infectious disease surveillance that involve clinical laboratories can be contrasted.

Laboratory-Initiated Infectious Disease Notifications

Clinical laboratories have been making most notifications about infectious diseases of public health importance. (14) The conventional microbiological diagnosis of infectious diseases has historically enabled a profound change in their management, mainly by means of phenotypic approaches to pathogen identification and antibiotic susceptibility testing (Figure 1). However, phenotypic assays, such as microbial culture of fastidious microorganisms and, especially, convalescent serology, require a significant amount of time before the final results are reported. Nevertheless, timeliness has often been better for laboratory than for clinician case notifications, primarily as a result of the more commonly automated, electronic nature of laboratory reporting. (14,15) For example, electronic laboratory reporting systems monitor coded organism names or procedures in each HL7 message and extract the relevant ones with at least 80% completeness. (16)
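
The filtering step described above can be sketched as follows. This is a minimal illustration only, assuming pipe-delimited HL7 version 2 messages in which the OBX segment carries a coded result in its fifth field. The watch list, the codes, and the sample message are hypothetical, and a production system would rely on a full HL7 parser and a standard vocabulary rather than string splitting.

```python
# Hypothetical watch list of coded organism identifiers (illustrative values only).
NOTIFIABLE_CODES = {
    "ORG-001": "Neisseria meningitidis",
    "ORG-002": "Salmonella species",
}


def notifiable_findings(hl7_message):
    """Return (code, name) pairs for OBX segments whose coded value is on the watch list."""
    findings = []
    # HL7 v2 segments are separated by carriage returns; fields by "|".
    for segment in hl7_message.replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        if fields[0] != "OBX" or len(fields) < 6:
            continue
        # OBX-5 holds the observation value; its components are separated by "^".
        code = fields[5].split("^")[0]
        if code in NOTIFIABLE_CODES:
            findings.append((code, NOTIFIABLE_CODES[code]))
    return findings


# Invented sample message: one result on the watch list, one not.
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|LAB|HOSP|PHU|STATE|200812161200||ORU^R01|MSG0001|P|2.3",
    "PID|1||PATIENT-ID-REMOVED",
    "OBX|1|CE|CULT^Blood culture||ORG-001^Neisseria meningitidis^LOCAL|||A",
    "OBX|2|CE|CULT^Blood culture||ORG-999^Skin commensal^LOCAL|||N",
])

print(notifiable_findings(SAMPLE_MESSAGE))
```

Only the first OBX segment is extracted in this example, because its code appears on the assumed watch list.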

Only a small fraction of human infections is sampled and confirmed by laboratory means. Differences in case definitions, notably the requirement for laboratory confirmation for reporting, affect the relative specificity of notification systems. Variations in the accessibility of health care affect the representativeness of reporting systems, with some countries having significant geographic differences in coverage, as well as different reporting rates from different sites where infections are diagnosed and treated. However, the coverage of universal reporting systems can vary, from 10% to 25% of diagnosed cases in countries like the Netherlands to close to 100% in Sweden. (14) The sensitivity of surveillance systems (that is, the proportion of all cases occurring that are detected) depends not only on the coverage of the system (that is, the proportion of all cases diagnosed that are reported), but also on the proportion of cases occurring that are diagnosed. This is particularly so for frequently asymptomatic infections. Therefore, the numbers of laboratory notifications reflect a complex mix of infection diagnosis and screening practices, contact tracing practices, performance characteristics of diagnostic tests used, and the coverage of laboratory-based surveillance. The specificity of electronic reporting is particularly high for diseases diagnosed by laboratory tests with a low rate of false-positives. (16) Electronic reporting significantly increases the total number of notifications and improves their timeliness and completeness, thus facilitating the more rapid and comprehensive institution of disease control measures. (16,17)
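
The relationship between sensitivity, coverage, and the proportion of cases diagnosed can be illustrated with simple arithmetic. The sketch below assumes that a case is detected only if it is first diagnosed and then reported, so that sensitivity is the product of the two proportions. The 50% diagnosed proportion is an invented value for illustration; the coverage values echo the range quoted above.

```python
def surveillance_sensitivity(proportion_diagnosed, coverage):
    """Proportion of all occurring cases that end up notified.

    proportion_diagnosed: share of occurring cases that are diagnosed (0..1)
    coverage: share of diagnosed cases that are reported (0..1)
    """
    for value in (proportion_diagnosed, coverage):
        if not 0.0 <= value <= 1.0:
            raise ValueError("proportions must lie between 0 and 1")
    return proportion_diagnosed * coverage


# Assumed for illustration: half of all infections are diagnosed.
for coverage in (0.10, 0.25, 1.00):
    print(coverage, surveillance_sensitivity(0.5, coverage))
```

Under these assumptions, even complete reporting of diagnosed cases cannot raise sensitivity above the proportion of infections that are diagnosed, which is why frequently asymptomatic infections are poorly captured.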

Syndromic Surveillance

New health indicator surveillance systems, such as those termed syndromic surveillance systems (SSS), are potentially more rapid and sensitive than traditional methods for the detection of outbreaks or bioterrorism-related events. (6,18-21) They were introduced to follow the temporal and spatial distribution of outbreaks, to monitor seasonal trends in disease incidence, and to provide reassurance that an outbreak has not occurred. (22) SSSs monitor health care utilization patterns in real time and rely on the use of data collected electronically for other purposes. Current SSSs model the average pattern of patients reporting to primary care physicians or emergency departments and signal an alarm whenever the observed pattern of patients diverges significantly from the normal one. Reporting sources include emergency departments, intensive care units, hospital admission and discharge systems, and laboratories. The number of laboratory requests, for example, for cerebrospinal fluid microbiology may be used to monitor potential outbreaks of encephalitis and meningitis. However, such data types are only surrogate markers and may introduce confounding factors and nonspecific information noise into the outbreak "signal." (6,21) Because syndromic data are gathered before laboratory results are reported, health departments may be able to recognize increases in disease incidence before formal diagnoses are made and to respond to outbreaks early in their course. The accuracy of aberration detection signals generated by SSSs is dependent on the amount of baseline data. The characterization of a baseline of "normal" visits and the identification of an appropriate threshold above which the system should signal an alarm have been the main challenges in SSS implementation. (23,24)
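
The baseline-and-threshold logic described above can be sketched with a simple moving-window rule. This is one illustrative choice, not the algorithm of any particular SSS: the 28-day baseline, the threshold of 3 standard deviations, and the daily counts are all assumptions.

```python
from statistics import mean, stdev


def flag_aberrations(daily_counts, baseline_days=28, threshold_sd=3.0):
    """Return indices of days whose count exceeds baseline mean + threshold_sd * SD."""
    alarms = []
    for day in range(baseline_days, len(daily_counts)):
        # The baseline is the window of days immediately preceding the current day.
        baseline = daily_counts[day - baseline_days:day]
        limit = mean(baseline) + threshold_sd * stdev(baseline)
        if daily_counts[day] > limit:
            alarms.append(day)
    return alarms


# Invented counts: four quiet weeks, one ordinary day, then a sharp rise.
counts = [10, 12, 11, 9, 10, 13, 11] * 4 + [12, 30]
print(flag_aberrations(counts))
```

A shorter baseline makes the limit less stable, and a lower threshold raises more false alarms, which illustrates the trade-off noted above.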

The sensitivity and specificity of alarms generated by SSSs range from 55% to 96%. (21,25,26) These systems reliably identify large outbreaks, but the false-positive rate can reach 60% when the number of reported patients falls below 30% of the average normal pattern. (22,23) Evidence indicates that the integration of multiple data sources can significantly improve detection accuracy. (24,27) It has been suggested that SSSs are best suited for detecting diseases that have a narrow incubation period, a steep epidemic curve, a long prodromal phase, and are tested for routinely. (23) Importantly, there are variations in the definitions of communicable disease clusters and in investigative methods because of differences in local epidemiology and the availability of public health resources. It is therefore critical to have outbreak definitions (in the absence of epidemiologic information) that optimize the limited resources of public health practitioners while preventing further spread.

Evidence suggests that SSSs can potentially be used in coordination with laboratory-based surveillance. (28) The high sensitivity and low specificity of syndromic surveillance can complement the high specificity but low sensitivity of diagnostic microbiology. For example, syndromic surveillance alerts can act as triggers for different laboratory testing algorithms (Figure 2). In one study, self-sampling of callers to the UK National Health Service call line provided the earliest reports of the influenza season and augmented community-based surveillance programs. (29)

Genomics-Based Biothreat Monitoring

Advances in molecular biology have resulted in the development of rapid diagnostic tests for the detection of microbes and their markers of virulence and antimicrobial resistance. These approaches combine gene amplification and genotyping and involve investigations of genes that were not previously identifiable by traditional methods. For the past decade, the routine use of molecular subtyping has increased the sensitivity of outbreak detection and the specificity of outbreak investigations at local, state, and international levels. (30,31) Novel typing methods, such as oligonucleotide motif frequency and epidemiologic monitoring of gene cluster evolution on antigenic shift and drift, have revolutionized laboratory capacity for the detection of outbreaks and have often shifted the initiative in the early warning from public health units to clinical laboratories. (32-34) High-throughput pathology testing techniques generate real-time data that enable the development of tools for understanding and responding to an outbreak as it unfolds. (35,36)

There is increasing evidence of the value of rapid molecular profiling as a means to assist outbreak detection and infection control. (36-39) This can be of particular importance, because the monitoring of mobile genetic elements spreading antibiotic-resistance genes in hospital settings has become essential for successful infection control. (15,37,38) In one prospective study, automated clonal alerts based on the real-time subtyping of hospital methicillin-resistant Staphylococcus aureus isolates and temporal-scan test statistics were 100% and 95.2% sensitive and specific, respectively, in identifying outbreaks and more sensitive and timely than infection control nurses. (39)

More bacterial genomes are being sequenced by a growing number of laboratories, and genomic data are being stored and annotated online. (36) Public databases, such as MLSTNet (accessed December 16, 2008), PulseNet (accessed December 16, 2008), and the BioPortal (www.; accessed December 16, 2008), among others, allow access and matching of bacterial or viral isolates. (40) Linking systematically annotated profiles with clinical and research databases can identify previously unrecognized associations between phenotype, genotype, environment, and host responses, and it can potentially identify the specific genes that govern them. Networks, which are created by relationships between phenotype, disease expression, environment, experimental context, and associated genes with differential expression, could provide new insights into microbial interactions and pathogenesis. (15) This approach has been fruitful in metagenomics, and information management systems capable of assisting with genotyping or functional genomics are being developed. (41-43) The differences between 3 approaches to surveillance or 3 "generations" of biosurveillance are contrasted in the Table.

Portable molecular diagnostics have been used in different biosensors and environmental monitoring systems in order to detect the release of a biologic agent before the onset of symptoms in exposed subjects. (44) Such systems have been developed for the detection of agent release in indoor or outdoor settings and employ real-time nucleic acid amplification or antibody assay methodologies. (45) Programs such as BioWatch and Biological Aerosol Sentry and Information System (BASIS) use sampling stations to periodically pull air samples through a filter, which is then analyzed. These systems perform with minimal false alarms, and a semiautomated version of a Biological Aerosol Sentry and Information System laboratory allows 2 technicians to complete 10000 polymerase chain reaction assays on 1000 samples in an 8-hour period. (1) Similarly, the Autonomous Pathogen Detection System (APDS) is a fully automated system that can perform 100 simultaneous measurements every 30 minutes for more than a week and has the capacity to detect about 30 different pathogens. (1,44) Positive alarms from such systems must still undergo confirmation by traditional laboratory methods until more information is gained regarding the performance of these types of systems in the field. There is a pressing need to improve the scaling of the algorithms and protocols underlying sensor networks, as well as the architectures of such systems. (45)


Figure 3 summarizes existent and proposed data flows of 3 approaches to surveillance from data acquisition to application; namely, laboratory-initiated infectious disease notifications (Figure 3, A), syndromic surveillance (Figure 3, B), and genomics-based biothreat monitoring (Figure 3, C). They are based on event-driven processing and highlight the importance of adequate public health resources at the receiving end of the data flow. Genomics-based monitoring shifts the power to generate surveillance alerts from public health units to a laboratory. The wealth of electronic data collected by syndromic and genomic surveillance requires new technologies for data warehousing and the temporospatial analysis of outbreaks that are not currently supported by laboratory information systems. (15)


Standards for Microbial Data Representation

The medical and cost benefits of highly integrated, comprehensive disease control programs that include routine microbial genotyping have been demonstrated, yet incorporating multiple data sources remains a challenge. (46,47) The need for models that define discrete data elements in communicable disease informatics, and the relationships between them, has been identified. The ability to capture and share profiling data depends on their vocabulary (the words or individual components), syntax (the "sentence" structure), and messaging protocols. The most developed health care vocabularies complementing each other are the Unified Medical Language System (UMLS; National Library of Medicine), Logical Observation Identifier Names and Codes (LOINC; Regenstrief Institute), and Systematized Nomenclature of Medicine (SNOMED; College of American Pathologists). (47-49) Synergistically, these vocabularies can support the integration of high-level terms used in decision rules (eg, "sepsis") with the relatively low-level terms used in the clinical records (eg, "bacterial blood culture").

Data models employed in large-scale relational databases often do not satisfy the potential complexity of genomic data. Object-oriented modeling has been suggested as an alternative and has been extensively applied to biomedical data. Specifically, the Extensible Markup Language (XML) has come to prominence as the preferred syntax for structured data transmission on the Web, thereby providing a consistent data format based on a hierarchical structure with custom tag creation. (48,50) The Pathogen Information Markup Language has been developed to enhance the interoperability of microbiology data sets for pathogens with epidemic potential. (51) It offers a means to capture the data elements essential for describing many of the determinants of pathogen profiles. This technique is important to communicable disease reporting systems and provides close to real-time aggregated and geographically mapped data.
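
The idea of a hierarchical record with custom tags can be sketched as follows. The tag and attribute names are invented for illustration and are not taken from the Pathogen Information Markup Language schema; the isolate values are likewise hypothetical.

```python
import xml.etree.ElementTree as ET


def build_isolate_record(isolate_id, organism, genotype_method, genotype, postcode, collected):
    """Serialize one isolate profile as XML using illustrative (non-standard) tags."""
    root = ET.Element("isolate", attrib={"id": isolate_id})
    ET.SubElement(root, "organism").text = organism
    typing = ET.SubElement(root, "genotype", attrib={"method": genotype_method})
    typing.text = genotype
    ET.SubElement(root, "location", attrib={"postcode": postcode})
    ET.SubElement(root, "collected").text = collected
    return ET.tostring(root, encoding="unicode")


xml_text = build_isolate_record(
    "ISO-0001", "Staphylococcus aureus", "spa", "t008", "2000", "2008-12-16"
)
print(xml_text)

# Round trip: a receiving system can recover the fields by tag name.
parsed = ET.fromstring(xml_text)
print(parsed.find("genotype").get("method"), parsed.find("genotype").text)
```

Because each element is named, a receiving system can aggregate records by genotype or map them by location without knowing the sender's internal database layout.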

The role of ontologies has been increasing as the "semantic web" and grid technologies enable the building of distributed knowledge resources, similar to the World Wide Web but precise enough for automated reasoning. (52) An ontology is a formal machine-readable representation of knowledge providing a middle layer to map references to respective concepts. (53,54) Ontologies are important for biosurveillance because they allow data integration and improve the capacity of laboratory information systems to measure multiparameter responses over time. (2) Efforts are underway to develop and implement generic ontologies for genomics. For example, the Gene Ontology (GO) (53) has been organized as a directed acyclic graph whose nodes represent the strict vocabulary that describes key biologic functions and processes. It has 3 independent levels of terms, which are biological processes, molecular function, and cellular component. The GO graphs are stored as XML files in order to allow their fast reconstruction for the web service. Gene Ontology has been designed for description and querying, but it also is being used for text mining. The hierarchical structure of GO prevents linking concepts from one level to another. An ontology itself can be expressed as topic maps or in the Resource Description Framework. (55) XML topic maps have been developed to support a machine-understandable Internet, referred to as the "semantic web." (47,56) A topic map explicitly defines objects (ie, microbial pathogens in our case), their attributes (ie, virulence markers, genotypes, genotyping methods, etc), and relationships between them. The Resource Description Framework is an advanced data model developed specifically for data integration through the semantic web by capturing the object, its attributes, and its values. The Resource Description Framework data model can be a powerful tool capable of describing pathogen profiles and linking them to outbreak analysis tools.


Laboratory Database Networks

Information relevant to microbial profiling exists in a variety of sources and formats. A compilation of microbial reference sequences (RefSeq) specifying gene name and DNA sequences can be found at http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi (bacterial RefSeq; accessed December 16, 2008), FUNGI/funtab.html (fungal RefSeq; accessed December 16, 2008), and cgi?taxid = 10239&type=5&name=Viruses (viral RefSeq; accessed December 16, 2008). Public electronic bacterial typing databases, such as MLSTNet ( databases/default.asp; accessed December 16, 2008), PulseNet (accessed December 16, 2008), the SPOTCLUST ( ~vitoli/InfoWeb/Info/Info.html; accessed December 16, 2008), and others, employ a Web-based format, allowing universal access and matching bacterial or viral isolates to each other and to those represented in databases. More recently, structured polymorphism databases have been built, but data sharing and integration remain difficult because of the lack of common structures. (52,57) Several hundred public-domain molecular biology databases are currently online, but only a few contain raw data. Most represent the efforts of individuals to organize, annotate, and interpret data from other sources. These databases are highly valued and increasingly expected to replace paper publication as the medium of communication. A number of these are classification databases (eg, spa typing of S aureus (www.spaServer.; accessed December 16, 2008), SPOTCLUST of Mycobacterium tuberculosis, etc). (37,38) In each case, a similarity measure, defined as a set of threshold criteria, is used to group types.

MLSTNet and PulseNet can be considered examples of more advanced databases. At the core of the MLSTNet concept is the provision of freely accessible nucleotide sequence databases, which act as a common dictionary to enable the direct comparison of bacterial isolates without requiring the physical exchange of cultures. In this sense, they provide the basis of a common language for bacterial typing. (40,44) In contrast to archival databases, such as GenBank, MLSTNet databases are curated for accuracy. Furthermore, to overcome the limitations of the first MLSTNet stand-alone Web sites, a new network-based database (MLSTdB-Net) has been implemented, with more than 30 MLST schemes for different bacterial species hosted at 33 Web sites. (44) Some of the MLST Web sites allow researchers to run and curate their own schemes remotely. The PulseNet system, based on PFGE patterns, represents the most developed system for the characterization of bacterial isolates with a fingerprinting approach. It is one of the few networks that integrate epidemiologic and typing data over wide geographic regions. (30,44)
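
The "common dictionary" idea can be sketched as a lookup from an allelic profile to a sequence type, so that two laboratories can compare isolates by exchanging numbers rather than cultures. The locus names, allele numbers, and sequence type labels below are invented for illustration and do not correspond to any published MLST scheme.

```python
# Hypothetical scheme: locus names and allele numbers are invented for illustration.
LOCI = ("locusA", "locusB", "locusC")
PROFILE_TO_ST = {
    (1, 4, 2): "ST-1",
    (1, 4, 3): "ST-2",
    (2, 1, 1): "ST-3",
}


def assign_sequence_type(allele_numbers):
    """Look up the sequence type for a dict of locus -> allele number; None if unknown."""
    profile = tuple(allele_numbers[locus] for locus in LOCI)
    return PROFILE_TO_ST.get(profile)


def same_type(isolate_a, isolate_b):
    """Two isolates match when both resolve to the same known sequence type."""
    st_a = assign_sequence_type(isolate_a)
    return st_a is not None and st_a == assign_sequence_type(isolate_b)


# Isolates typed in two different (hypothetical) laboratories.
lab_one = {"locusA": 1, "locusB": 4, "locusC": 2}
lab_two = {"locusA": 1, "locusB": 4, "locusC": 2}
novel = {"locusA": 9, "locusB": 9, "locusC": 9}
print(assign_sequence_type(lab_one), same_type(lab_one, lab_two), assign_sequence_type(novel))
```

A profile that is absent from the dictionary returns no type, which in a curated database would prompt submission of the new profile for review.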

Pathogen-specific data collections have been developed to type relatively simple viral genomes using partial sequences, pair-wise sequence alignment, multiple-sequence alignment, and phylogenetic inference. (37) Databases and software tools developed for the analysis of molecular sequences and microarrays are helpful, but they are limited by the unique attributes of clinicogenomic profiles and by differing application goals. (15,37,38)

Global Laboratories

A particularly interesting prospect of this framework is the integration of molecular typing with epidemiologic information, potentially achieving the global real-time epidemiologic surveillance of pathogens with epidemic potential. Surveillance typing networks have improved the speed of outbreak detection and made possible the exchange of data between laboratories. (46,47,58,59) In such "online" surveillance systems, novel and previously characterized strains can be compared and grouped by cluster analysis. Spatial surveillance using emerging geographic information systems enhances the likelihood that even localized events will be detected and their extent and variables measured in space and time. The output from these systems ultimately needs to be integrated into clinical and diagnostic processes. Layne and Beugelsdijk (59) initiated the development of global laboratories with the capacity for rapid and consistent monitoring of circulating strains for vaccine development, serving as a pandemic sentinel, and expanding surveillance to animals. Such a lab could generate a petabit-sized (about 10^15 bits) database on influenza viruses worldwide for shared use. (59)

New World Health Organization (WHO) infrastructure was formalized in 2000, when the WHO launched the Global Outbreak Alert and Response Network. (3) The network interlinks, in real time, a large number of existing networks which together possess much of the data, expertise, and skills needed to keep the international community constantly alert and ready to respond. The WHO conducts a number of activities aimed at helping countries strengthen their capacity and take advantage of new tools. For example, the HealthMap (accessed December 16, 2008), a news surveillance system, lets users search for mentions of specific diseases and then displays the search results on a map to reveal where disease patterns are concentrated and how the disease might be spreading. There are other advanced interactive information and mapping systems and remote data-sensing capabilities that can recognize environmental conditions favorable for an outbreak. The International System for Total Early Disease Detection (InStedd) has been developing a pilot syndromic surveillance system in Southeast Asia through its Mekong Collaboration Program (http://; accessed December 16, 2008). Of the informal sources, one of the most important is a semi-automated electronic system that continuously scours world communications for rumors of unusual disease events. This is the Global Public Health Intelligence Network (GPHIN) electronic surveillance system, developed for the WHO in 1997 in partnership with Health Canada. From July 1998 to August 2001, the WHO verified 578 outbreaks, of which 56% were initially picked up by the Global Public Health Intelligence Network. The system searches for telltale keywords in 9 languages from thousands of news sources. (3)


Rapid assessment of public health options can be critical. However, the efficiency and speed with which resource managers can combine models with highly dispersed data sets depend on access to appropriate computational tools. Insufficient and delayed decision support and inadequate integration of surveillance signals into action plans remain the 2 main barriers to efficient public health monitoring and response. (5-7) There is an apparent lack of electronic and near-real-time decision support for public health users of biosurveillance alerts with different time frames and priorities. (6) Not surprisingly, decision making, resource utilization, and management were identified as major problem areas by professionals faced with critical decisions about possible biothreats. (13,26) Rapid confirmation of the specific infection and assessment of the epidemic curve before and after a public health intervention are the key elements of effective outbreak detection and control strategies. (10,36) Consequently, there is a need to refine analytic methods to improve pattern recognition and integration of multiple streams of data so that significant events can be detected and optimal decisions made.

Laboratory notification patterns and pathogen profiles can be characterized as nonlinear models of different complexity. Learning from data with large numbers of noisy and interdependent variables and limited domain models has remained a challenge to both biomedical scientists and informaticians. Modular approaches (separate learning algorithms and principal component analysis for dimensionality reduction and the identification and classification of markers) appear to provide better discriminatory power than single statistic algorithms. (60-62) Classification and regression trees, multivariate regression, tree-structured classifiers, support vector machines, and Bayesian probabilistic and neural networks have been validated for biosurveillance. (13,63) The most validated techniques include univariate detection algorithms based on time series models or regression models, which monitor a single variable. The need for cross-validation for these post hoc analyses has to be acknowledged to ensure the sensitivity and specificity of alerts. (64)

Although the geographic information systems have been in development for more than 20 years, recent releases of more advanced software have made geographic information systems substantially easier to use for nonspecialists. (65,66) The systems offer new opportunities for epidemiology because they give an informed user an important spatial perspective of communicable diseases. Used to their optimal level, as tools for analysis and decision making, they are new information management vehicles with a rich potential for public health. (64,67) Spatial scan statistic algorithms can detect regions where the probability of an incident case occurring inside the scanning window is higher than outside, (64,68) and they have been fitted to time series analysis methods, (69) extended to include the time dimension into spatiotemporal scan statistic methods, (70) and combined with empirical Bayesian models. (71,72) The development of geographic information systems-enabled decision-support tools as described earlier can improve the capability for rapid responses in a spatial context to quarantine, vaccinate, etc. The availability of Internet-based cartographic representation software and customizable georeferencing tools allows clinicians and public health practitioners to discover the causes for geographic variations that may potentially have been previously unidentified risk factors. Spatial and temporal mapping has been useful for defining areas of infection, identifying social networks, and interrupting transmission during outbreaks. (64,73) The goals of mapping have been to identify features that influence the spatial distribution of pathogens and/or infection using a combination of point (occurrence of the notification event) and areal (ratios of positive samples in each cell of a map) events. Maps of disease distribution or antibiotic resistance enable the implementation of targeted prevention and education campaigns (74) and antibiotic-prescribing choices. (75-77)
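
The core of a spatial scan statistic can be sketched as follows, assuming a Poisson model in which expected cases are proportional to population. Each candidate zone is scored by a log likelihood ratio comparing the case rate inside the zone with the rate outside, and the highest-scoring zone is reported. The zones, case counts, and populations are invented, and the Monte Carlo step normally used to assess statistical significance is omitted.

```python
from math import log


def poisson_llr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for one candidate zone under a Poisson model."""
    if cases_in <= expected_in:
        return 0.0  # only excess risk inside the zone is of interest
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * log(cases_out / expected_out)
    return llr


def most_likely_cluster(zones, cases, population):
    """zones: dict name -> list of area ids; cases and population: dict area id -> count."""
    total_cases = sum(cases.values())
    total_pop = sum(population.values())
    best = (None, 0.0)
    for name, areas in zones.items():
        observed = sum(cases[a] for a in areas)
        # Expected cases if risk were uniform across the whole population.
        expected = total_cases * sum(population[a] for a in areas) / total_pop
        score = poisson_llr(observed, expected, total_cases)
        if score > best[1]:
            best = (name, score)
    return best


# Invented data: four areas of equal population, with an excess of cases in area C.
cases = {"A": 2, "B": 3, "C": 15, "D": 4}
population = {"A": 1000, "B": 1000, "C": 1000, "D": 1000}
zones = {"north": ["A", "B"], "south": ["C", "D"], "C only": ["C"]}
print(most_likely_cluster(zones, cases, population))
```

In this example the single area C scores higher than the larger southern zone that contains it, because adding area D dilutes the excess.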

Many methods for exploratory analysis in geographic information systems are not appropriate for the field of infectious diseases, because the methods are essentially static and assume independence. (62,64) For infectious disease, cases clearly are not independent, and the disease moves through time and space. Thus, spatial autocorrelation methods and space-time correlograms are more suitable for the exploration of spatial and temporal patterns. Crude incident rate estimates, empirical Bayes standardization, hierarchical regression, and Bayesian maximum entropy models are the most commonly used approaches for the analysis of temporospatial dependencies and patterns of diseases. (61) They successfully address the high variance of estimates in small geographic areas, posterior predictive densities, and multilevel hierarchies, and they also enable risk maps. Epidemic compartmental models and, more recently, small-world networks have been applied to optimize outbreak detection with a range of pattern-matching, weighting, and competitive-learning algorithms. (44,67) New, improved methods are urgently needed to present spatial epidemiologic data and to determine probable pathogen exposure sites that will yield reliable results while taking into account the economic and time constraints of the public health systems and attending physicians. (78)
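
How empirical Bayes methods address the high variance of estimates in small areas can be sketched with a method-of-moments shrinkage estimator. This is one common variant chosen for illustration, not necessarily the one used in the cited studies. Each crude rate is pulled toward the overall rate, and areas with small populations are pulled further. The case counts and populations are invented.

```python
def empirical_bayes_rates(cases, population):
    """Shrink crude area rates toward the overall rate (method-of-moments estimator)."""
    total_pop = sum(population)
    overall = sum(cases) / total_pop
    crude = [c / n for c, n in zip(cases, population)]
    mean_pop = total_pop / len(population)
    # Population-weighted variance of the crude rates around the overall rate.
    weighted_var = sum(n * (r - overall) ** 2 for r, n in zip(crude, population)) / total_pop
    # Estimated between-area variance, truncated at zero.
    between = max(0.0, weighted_var - overall / mean_pop)
    smoothed = []
    for r, n in zip(crude, population):
        # The weight approaches 1 for large populations and 0 for small ones.
        weight = between / (between + overall / n) if between > 0 else 0.0
        smoothed.append(overall + weight * (r - overall))
    return smoothed


# Invented data: a small area and a large area share the same crude rate of 1%.
cases = [1, 50, 100]
population = [100, 10000, 10000]
print(empirical_bayes_rates(cases, population))
```

The small area, whose rate rests on a single case, is moved almost all the way to the overall rate, whereas the large area with the same crude rate changes much less.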

In conclusion, the merger of the 3 scientific fields of surveillance, genomics, and informatics offers a novel and revolutionary approach to the rapid development of effective biosurveillance methods and tools. Microbial genotyping improves the efficiency of outbreak investigations because it confirms or refutes epidemiologic links among cases and between cases and potential environmental sources, thus triggering public health investigations. However, the utility of pathogen profiling goes beyond specific questions related to the investigation of possible outbreaks. It can also be used for disease monitoring by identifying transmission events and the associations between microbial types and clinical outcomes. Molecular profiling can assist in the assessment of the reproductive rates of infection during epidemics and in making infection control policies more pathogen specific. Molecular typing also facilitates the detection of chains and patterns of infection transmission and the construction of epidemic trees. In essence, it enables preemptive medicine. It makes medicine preemptive at the individual patient level, in the sense that various diseases will be diagnosed and cured early in their development, and at the population health level, in the sense that disease outbreaks will be detected and acted upon with greater speed and efficiency.


(1.) Fitch JP, Raber E, Imbro DR. Technology challenges in responding to biological and chemical attacks in the civilian sector. Science. 2003;302:1350-1354.

(2.) Crubezy M, O'Connor M, Pincus Z, Musen MA, Buckeridge DL. Ontology-centered syndromic surveillance for bioterrorism. IEEE Intell Syst. 2005;20(5):26-35.

(3.) Heymann DL, Rodier GR; WHO Operational Support Team to the Global Outbreak Alert and Response Network. Hot spots in a wired world: WHO surveillance of emerging and re-emerging infectious diseases. Lancet Infect Dis. 2001;1:345-353.

(4.) Sosin DM. Syndromic surveillance: the case for skilful investment. Biosecur Bioterr. 2003;1:247-253.

(5.) Arnold JL. Disaster medicine in the 21st century: future hazards, vulnerabilities, and risk. Prehosp Disast Med. 2002;17:3-11.

(6.) Bravata DM, McDonald KM, Smith WM, et al. Systematic review: surveillance systems for early detection of bioterrorism-related diseases. Ann Intern Med. 2004;140:910-922.

(7.) Dato V, Wagner MM, Fapohunda A. How outbreaks of infectious disease are detected: a review of surveillance systems and outbreaks. Public Health Rep. 2004;119:464-471.

(8.) Kaufmann AF, Meltzer MI, Schmid GP, et al. The economic impact of a bioterrorist attack: are prevention and post-attack intervention programs justifiable? Emerg Infect Dis. 1997;3:83-94.

(9.) Wallinga J, Teunis P. Different epidemic curves for Severe Acute Respiratory Syndrome reveal similar impacts of control measures. Am J Epidemiol. 2004;160:509-516.

(10.) Wein LM, Craft DL, Kaplan EH. Emergency response to an anthrax attack. Proc Natl Acad Sci U S A. 2003;100:4346-4351.

(11.) Svoboda T, Henry B, Schulman L, et al. Public health measures to control the spread of the Severe Acute Respiratory Syndrome during the outbreak in Toronto. N Engl J Med. 2004;350:2352-2361.

(12.) Buckeridge DL, Burkom H, Campbell M, Hogan WR, Moore AW. Algorithms for rapid outbreak detection: a research synthesis. J Biomed Inform. 2005;38:99-113.

(13.) Buckeridge DL. Outbreak detection through automated surveillance: A review of the determinants of detection. J Biomed Inform. 2007;40:370-379.

(14.) Lowndes CM, Fenton KA; the ESSTI (European Surveillance of STIs) Network. Surveillance systems for STIs in the European Union: facing a changing epidemiology. Sex Trans Infect. 2004;80:264-271.

(15.) Sintchenko V, Iredell JR, Gilbert GL. Genomic profiling of pathogens for disease management and surveillance. Nat Rev Microbiol. 2007;5:464-470.

(16.) Panackal AA, M'ikanatha NM, Tsui FC, et al. Automatic electronic laboratory-based reporting of notifiable infectious diseases at a large health system. Emerg Infect Dis. 2002;8:685-691.

(17.) Effler P, Ching-Lee M, Bogard A, Ieong MC, Nekomoto T, Jernigan D. Statewide system of electronic notifiable disease reporting from clinical laboratories. JAMA. 1999;282:1845-1850.

(18.) Lewis MD, Pavlin JA, Mansfield JL, et al. Disease outbreak detection system using syndromic data in the greater Washington, DC area. Am J Prev Med. 2002;23:180-186.

(19.) Tsui FC, Espino JU, Dato VM, Gesteland PH, Hutman J, Wagner MM. Technical description of RODS: a real-time public health surveillance system. J Am Med Inform Assoc. 2003;10:399-408.

(20.) Widdowson MA, Bosman A, van Straten E, et al. Automated, laboratory-based system using the Internet for disease outbreak detection, the Netherlands. Emerg Infect Dis. 2003;9:1046-1052.

(21.) Lombardo J, Burkom H, Elbert E, et al. A systems overview of the Electronic Surveillance System for the Early Notification of Community-Based Epidemics (ESSENCE II). J Urban Health. 2003;80(suppl 1):i32-i42.

(22.) Muscatello DJ, Churches T, Kaldor J, et al. An automated, broad-based, near real-time public health surveillance system using presentations to hospital Emergency Departments in New South Wales, Australia. BMC Public Health. 2005;5:141.

(23.) Wang L, Ramoni MF, Mandl KD, Sebastiani P. Factors affecting automated syndromic surveillance. Artif Intell Med. 2005;34:269-278.

(24.) Berger M, Shiau R, Weintraub JM. Review of syndromic surveillance: implications for waterborne disease detection. J Epidemiol Community Health. 2006;60:543-550.

(25.) Weber SG, Pitrak D. Accuracy of a local surveillance system for early detection of emerging infectious disease. JAMA. 2003;290:596-598.

(26.) Wagner MM, Dato V, Dowling JN, Allswede M. Representative threats for research in public health surveillance. J Biomed Inform. 2003;36:177-188.

(27.) Mandl KD, Overhage JM, Wagner MM, et al. Implementing syndromic surveillance: a practical guide informed by the early experience. J Am Med Inform Assoc. 2004;11:141-150.

(28.) Buehler JW, Hopkins RS, Overhage JM, Sosin DM, Tong V; CDC Working Group. Framework for evaluating public health surveillance systems for early detection of outbreaks: recommendations from the CDC Working Group. MMWR Recomm Rep. 2004;53(RR-5):1-11.

(29.) Cooper DL, Smith GE, Chinemana F, et al. Linking syndromic surveillance with virological self-sampling. Epidemiol Infect. 2008;136:222-224.

(30.) Hedberg CW, Besser JM. Cluster evaluation, PulseNet, and public health practice. Foodborne Pathog Dis. 2006;3:32-35.

(31.) Monecke S, Ehricht R. Rapid genotyping of methicillin-resistant Staphylococcus aureus (MRSA) isolates using miniaturised oligonucleotide arrays. Clin Microbiol Infect. 2005;11:825-833.

(32.) Campbell CJ, Ghazal P. Molecular signatures for diagnosis of infection: application of microarray technology. J Appl Microbiol. 2004;96:18-23.

(33.) Honisch C, Chen Y, Mortimer C, et al. Automated comparative sequence analysis by base-specific cleavage and mass spectrometry for nucleic acid-based microbial typing. Proc Natl Acad Sci U S A. 2007;104:10649-10654.

(34.) Garaizar J, Rementeria A, Porwollik S. DNA microarray technology: a new tool for the epidemiological typing of bacterial pathogens? FEMS Immunol Med Microbiol. 2006;47:178-189.

(35.) Casman EA. The potential of next-generation microbiological diagnostics to improve bioterrorism detection speed. Risk Anal. 2004;24:521-536.

(36.) Fournier PE, Drancourt M, Raoult D. Bacterial genome sequencing and its use in infectious diseases. Lancet Infect Dis. 2007;7:711-723.

(37.) Liew AWC, Yan H, Yang M. Pattern recognition techniques for the emerging field of bioinformatics: a review. Pattern Recogn. 2005;38:2055-2073.

(38.) Harmsen D, Claus H, Witte W, et al. Typing of methicillin-resistant Staphylococcus aureus in a university hospital setting by using novel software for spa repeat determination and database management. J Clin Microbiol. 2003;41:5442-5448.

(39.) Mellmann A, Friedrich AW, Rosenkotter N, et al. Automated DNA sequence-based early warning system for the detection of methicillin-resistant Staphylococcus aureus outbreaks. PLoS Med. 2006;3(3):e3.

(40.) Urwin R, Maiden MCJ. Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol. 2003;11:479-487.

(41.) Reis BY, Kirby C, Hadden LE, et al. AEGIS: a robust and scalable real-time public health surveillance system. J Am Med Inform Assoc. 2007;14:581-588.

(42.) O'Connor MJ, Buckeridge DL, Choy M, Crubezy M, Musen MA. BioSTORM: a system for automated surveillance of diverse data sources. AMIA Annu Symp Proc. 2003:1071.

(43.) Swaminathan B, Gerner-Smidt P, Ng LK, et al. Building PulseNet International: an interconnected system of laboratory networks to facilitate timely public health recognition and response to foodborne disease outbreaks and emerging foodborne diseases. Foodborne Pathog Dis. 2006;3(1):36-50.

(44.) Maiden MC. Multilocus sequence typing of bacteria. Annu Rev Microbiol. 2006;60:561-588.

(45.) McBride MT, Gammon S, Pitesky M, et al. Multiplexed liquid arrays for simultaneous detection of simulants of biological warfare agents. Anal Chem. 2003;75:1924-1930.

(46.) Gosselin P, Lebel G, Rivest S, Douville-Fradet M. The Integrated System for Public Health Monitoring of West Nile Virus (ISPHM-WNV): a real-time GIS for surveillance and decision-making. Int J Health Geogr. 2005;4:21.

(47.) Berman JJ. Pathology data integration with Extensible Markup Language. Hum Pathol. 2005;36:139-145.

(48.) Wurtz R, Cameron BJ. Electronic laboratory reporting for the infectious diseases physician and clinical microbiologist. Clin Infect Dis. 2005;40:1638-1643.

(49.) McDonald CJ, Huff SM, Suico JG, et al. LOINC, a Universal Standard for Identifying Laboratory Observations: a 5-year update. Clin Chem. 2003;49:624-633.

(50.) Achard F, Vaysseix G, Barillot E. XML, bioinformatics and data integration. Bioinformatics. 2001;17:115-125.

(51.) He Y, Vines RR, Wattam AR, et al. PIML: the Pathogen Information Markup Language. Bioinformatics. 2005;21:116-121.

(52.) Gardner SP. Ontologies and semantic data integration. Drug Discov Today. 2005;10:1001-1007.

(53.) Ashburner M, Ball CA, Blake JA, et al. Gene Ontology: tool for the unification of biology: the Gene Ontology Consortium. Nat Genet. 2000;25:25-29.

(54.) Louie B, Mork P, Martin F, Halevy A, Tarczy-Hornoch P. Data integration and genomic medicine. J Biomed Inform. 2007;40:5-16.

(55.) Wang X, Gorlitsky R, Almeida JS. From XML to RDF: how semantic web technologies will change the design of 'omic' standards. Nat Biotechnol. 2005;23:1099-1103.

(56.) Schweiger R, Hoelzer S, Rudolf D, Rieger J, Dudeck J. Linking clinical data using XML topic maps. Artif Intell Med. 2003;28:105-115.

(57.) Brazma A, Krestyaninova M, Sarkans U. Standards for systems biology. Nat Rev Genet. 2006;7:593-605.

(58.) Amadoz A, Gonzalez-Candelas F. EpiPath: an information system for the storage and management of molecular epidemiology data from infectious pathogens. BMC Infect Dis. 2007;7:32.

(59.) Lindstedt BA, Torpdahl M, Nielsen EM, Vardund T, Aas L, Kapperud G. Harmonization of the multiple-locus variable-number tandem repeat analysis method between Denmark and Norway for typing Salmonella typhimurium isolates and closer examination of the VNTR loci. J Appl Microbiol. 2007;102(3):728-735.

(60.) Layne SP, Beugelsdijk TJ. Laboratory firepower for infectious disease research. Nat Biotechnol. 1998;16:825-829.

(61.) Khan AS, Mujer CV, Alefantis TG, et al. Proteomics and bioinformatics strategies to design countermeasures against infectious threat agents. J Chem Inf Model. 2006;46:111-115.

(62.) Rolfhamre P, Ekdahl K. An evaluation and comparison of three commonly used statistical models for automatic detection of outbreaks in epidemiological data of communicable diseases. Epidemiol Infect. 2006;134:863-871.

(63.) Flouris AD, Duffy J. Application of artificial intelligence systems in the analysis of epidemiological data. Eur J Epidemiol. 2006;21:167-170.

(64.) Tang J, Tao J, Urakawa H, Corander J. T-BAPS: a Bayesian statistical tool for comparison of microbial communities using terminal-restriction fragment length polymorphism (T-RFLP) data. Stat Appl Genet Mol Biol. 2007;6:30.

(65.) Revesz P, Wu S. Spatiotemporal reasoning about epidemiological data. Artif Intell Med. 2006;38:157-170.

(66.) Kulldorff M, Heffernan R, Hartman J, Assuncao R, Mostashari F. A space-time permutation scan statistic for disease outbreak detection. PLoS Med. 2005;2(3):e59.

(67.) Gierl L, Schmidt R. Geomedical warning system against epidemics. Int J Hyg Environ Health. 2005;208(4):287-297.

(68.) Kulldorff M, Nagarwalla N. Spatial disease clusters: detection and inference. Stat Med. 1995;14(8):799-810.

(69.) Sonesson C. A CUSUM framework for detection of space-time disease clusters using scan statistics. Stat Med. 2007;26:4770-4789.

(70.) Kulldorff M. Prospective time periodic geographical disease surveillance using a scan statistic. J R Stat Soc [Ser A]. 2001;164:61-72.

(71.) Gangnon RE, Clayton MK. A hierarchical model for spatial clustering of disease. Stat Med. 2003;22:3213-3228.

(72.) Neill D, Moore A, Cooper G. A Bayesian scan statistic for spatial cluster detection. In: Weiss Y, Scholkopf B, Platt J, eds. Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press; 2006:1003-1010.

(73.) Clarke KC, McLafferty SL, Tempalski BJ. On epidemiology and geographic information systems: a review and discussion of future directions. Emerg Infect Dis. 1996;2:85-92.

(74.) Meijer A, Brown C, Hungnes O, et al. Programme of the community network of reference laboratories for human influenza to improve influenza surveillance in Europe. Vaccine. 2006;24:6717-6723.

(75.) Trooskin SB, Hadler J, St Louis T, Navarro VJ. Geospatial analysis of hepatitis C in Connecticut: a novel application of a public health tool. Public Health. 2005;119:1042-1047.

(76.) Tirabassi MV, Wadie G, Moriarty KP, et al. Geographic information system localization of community-acquired MRSA soft tissue abscesses. J Pediatr Surg. 2005;40:962-966.

(77.) Diekema DJ, Edmond MB. Look before you leap: active surveillance for multidrug-resistant organisms. Clin Infect Dis. 2007;44:1101-1107.

(78.) Eisen L, Eisen RJ. Need for improved methods to collect and present spatial epidemiologic data for vectorborne diseases. Emerg Infect Dis. 2007;13(12):1816-1820.

Vitali Sintchenko, MD, PhD; Blanca Gallego, PhD

Accepted for publication January 6, 2009.

From the Centre for Infectious Diseases and Microbiology, Western Clinical School, The University of Sydney, Westmead Hospital (Dr Sintchenko), and the Centre for Health Informatics, University of New South Wales (Drs Sintchenko and Gallego), Sydney, Australia.

Based on a presentation delivered at the First World Congress On Pathology Informatics in Brisbane, Australia, August 16-17, 2007.

The authors have no relevant financial interest in the products or companies described in this article.

Reprints: Vitali Sintchenko, MD, PhD, Centre for Infectious Diseases and Microbiology, Level 3 ICPMR, Westmead Hospital, Westmead, New South Wales, Australia 2145 (e-mail: vitali.sintchenko@swahs. or

Comparison of 3 Generations of Surveillance Systems

                    Traditional Laboratory      Syndromic Surveillance      Genomics-Based Surveillance
Characteristics     Notifications                                           of Biothreats

Sources of data     Laboratory reports based    "Syndromes" based on        Molecular markers of infection
                    on phenotypic methods of    secondary use of health     derived from microbial genome
                    pathogen detection and      data or nontraditional      sequencing, whole-genome DNA
                    serologic responses         data sources generated      microarray, functional
                                                electronically for other    proteomics data
                                                purposes

Sensitivity and     High specificity, often     High sensitivity, low       More evaluation is required
specificity of      low sensitivity             specificity
outbreak detection

Capacity to detect  Often low discrimination    Frequent "false alarms"     High discrimination power of
and monitor         power for clustering. This  require laboratory          molecular subtyping enables
outbreaks           depends on the supply of    confirmatory testing.       tracing of transmission events
                    viable pathogens in         Demonstrated capacity to    and monitoring of new
                    specimens.                  identify large outbreaks    individual genes or genotypes
                                                of communicable diseases
                                                or "unusual" events.

Timeliness of       Often significant time      Improved timeliness of      Rapid molecular testing and
detection           delays for fastidious       detection of outbreaks      biosensor technology offer
                    microorganisms or when                                  opportunities for significant
                    confirmatory testing is                                 improvement in timeliness
                    performed in different                                  compared with traditional
                    laboratories                                            methods

Ability to ...      Usually not possible        No                          Yes
subtypes of ...

Maturity of data    Well-developed              Efforts for development of  Data-rich results that would
analysis tools      epidemiologic methods and   algorithms for temporal     require new computational
                    models                      and spatial data analysis   analysis tools and multilevel
                                                and signal interpretation   reasoning
                                                are well underway

Examples            National and international  Automated Epidemiological   BioPortal, (40) PulseNet, (43)
                    surveillance for sexually   Geotemporal Integrated      MLSTNet, (44) EpiPath (58)
                    transmitted infections,     Surveillance (AEGIS),
                    such as syphilis,           Massachusetts Department
                    chlamydia, and gonorrhea;   of Public Health (41);
                    influenza surveillance      Biological Outbreak
                                                Reasoning Module
                                                (BioSTORM), Stanford
                                                Center of Informatics
                                                (42); Real-Time Outbreak
                                                and Disease Surveillance
                                                (RODS), Department of ...,
                                                University of Pittsburgh
                                                (19)

Gale Copyright: Copyright 2009 Gale, Cengage Learning. All rights reserved.