**DNA, STATISTICS AND THE LAW: A CROSS-DISCIPLINARY APPROACH TO FORENSIC INFERENCE**

**Topic Editors Alex Biedermann, Joëlle Vuille and Franco Taroni**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

Cover image provided by Ibbl sarl, Lausanne CH

**ISSN** 1664-8714 **ISBN** 978-2-88919-250-2 **DOI** 10.3389/978-2-88919-250-2

# **DNA, STATISTICS AND THE LAW: A CROSS-DISCIPLINARY APPROACH TO FORENSIC INFERENCE**

Topic Editors:

**Alex Biedermann,** University of Lausanne, Switzerland **Joëlle Vuille,** University of California, Irvine, USA **Franco Taroni,** University of Lausanne, Switzerland

From ABO typing during the first half of the 20th century, to the use of enzymes and protein contained in blood serums and finally direct DNA typing, biology has been serving forensic purposes for many decades. Statistics, in turn, has been constantly underpinning the discussions of the probative value of results of biological analyses, in particular when defendants could not be considered as excluded as potential sources because of different genetic traits. The marriage between genetics and statistics has never been an easy one, though, as is illustrated by fierce arguments that peaked in the so-called "DNA wars" in some American courtrooms in the mid-1990s. This controversy has contributed to a lively production of research and publications on various interpretative topics, such as the collection of relevant data, foundations in population genetics as well as theoretical and practical considerations in probability and statistics.

Both DNA profiling as a technique and the associated statistical considerations are now widely accepted as robust, but this does not yet guarantee or imply a neat transition to their application in court. Indeed, statistical principles applied to results of forensic DNA profiling analyses are a necessary, yet not a sufficient preliminary requirement for the contextually meaningful use of DNA in the law. Ultimately, the appropriate use of DNA in the forensic context relies on inference, i.e. reasoning reasonably in the face of uncertainty. This is all the more challenging because such thought processes need to be adopted by stakeholders from various backgrounds and holding diverse interests.

Although several topics of the DNA controversy have been settled over time, some others are still debated (such as the question of how to deal with the probability of error), while yet others - purportedly settled topics - saw some recent revivals (e.g., the question of how to deal with database searches). In addition, new challenging topics have emerged over the last decade, such as the analysis and interpretation of traces containing only low quantities of DNA where artefacts of varying nature may affect results. Both technical and interpretative research involving statistics thus represent areas where ongoing research is necessary, and where scholars from the natural sciences and the law should collaborate.

The articles in this Research Topic thus aim to investigate, from an interdisciplinary perspective, the current understanding of the strengths and limitations of DNA profiling results in legal applications. This Research Topic accepted contributions in all Frontiers article type categories and placed an emphasis on topics with a multidisciplinary perspective that explore (while not being limited to) statistical genetics for forensic scientists, case studies and reports, evaluation and interpretation of forensic findings, communication of expert findings to laypersons, quantitative legal reasoning and fact-finding using probability.

# Table of Contents

- *DNA, Statistics and the Law: A Cross-Disciplinary Approach to Forensic Inference* Alex Biedermann, Joëlle Vuille and Franco Taroni
- *DNA and the Law in Italy: The Experience of "The Perugia Case"* Carla Vecchiotti and Silvia Zoppis
- *Is Human DNA Enough?—Potential for Bacterial DNA* Sarah L. Leake
- *The Role of Prior Probability in Forensic Assessments* William C. Thompson, Joëlle Vuille, Alex Biedermann and Franco Taroni
- *The Impact of Commercialization on the Evaluation of DNA Evidence* Graham Jackson
- *Understanding DNA Results Within the Case Context: Importance of the Alternative Proposition* Louise McKenna
- *DNA Transfer: Informed Judgment or Mere Guesswork?* Christophe Champod
- *The Nucleic Acid Revolution Continues – Will Forensic Biology Become Forensic Molecular Biology?* Peter Gunn, Simon J. Walsh and Claude Roux
- *The National DNA Data Bank of Canada: A Quebecer Perspective* Emmanuel Milot, Marie Lecomte, Hugo Germain and Frank Crispino
- *Your Uncertainty, Your Probability, Your Decision* Alex Biedermann
- *The Evidential Foundations of Probabilistic Reasoning: Toward a Better Understanding of Evidence and its Usage* Patrick O. Juchli

# DNA, statistics and the law: a cross-disciplinary approach to forensic inference

#### *Alex Biedermann<sup>1</sup>\*, Joëlle Vuille<sup>2</sup> and Franco Taroni<sup>1</sup>*

*<sup>1</sup> Faculty of Law, Criminal Justice and Public Administration, School of Criminal Justice, Institute of Forensic Science, University of Lausanne, Lausanne, Switzerland*

*<sup>2</sup> Department of Criminology, Law and Society, School of Social Ecology, University of California, Irvine, CA, USA*

*\*Correspondence: alex.biedermann@unil.ch*

#### *Edited and reviewed by:*

*Hemant K. Tiwari, University of Alabama at Birmingham, USA*

**Keywords: forensic DNA profiling, interpretation, probability theory, commercialization, DNA transfer, low-template DNA analysis, forensic molecular biology, bacterial DNA**

The use of results of DNA analyses in the legal process is a highly ambivalent topic. On the one hand, scientists have never been in a better position to analyse biological matter of various natures, even in limited quantities and degraded conditions. On the other hand, the increasing amounts of scientific data that can be generated through modern analytical processes do not necessarily imply that evaluative questions that arise in the legal context are given more satisfactory answers. A fundamental question that has accompanied DNA analyses since the early days of their use in the legal process thus remains: how do we handle the challenges presented to us by the use of contemporary scientific and technological developments in the field of law? Under the general theme "DNA, statistics and the law," the collection of articles in this Frontiers Research Topic pursues the goal of investigating this question from an interdisciplinary perspective, and with an emphasis on both current and future challenges.

As pointed out by Gunn et al. (2014) and Leake (2013), the forensic interest in DNA goes well beyond the standard approaches to DNA profiling that represent the current state of the art in many contemporary legal systems, and this raises questions as to how new forms of data ought to be dealt with from an operational perspective (Milot et al., 2013). Although these frontier topics show how much room there is for exciting future research in this area, they should not distract us from the fact that even in the current state of forensic practice, there are hurdles and pressing topics that call for efficient answers. Controversies over legal cases, such as the Perugia case (Vecchiotti and Zoppis, 2013), reveal that the field is still facing difficulties in setting the meaning of DNA profiling results appropriately into context (Champod, 2013; McKenna, 2013). One might be tempted to conclude that this is an issue that is confined to (and could thus be resolved within) the intersection between forensic science and the law. This perspective might, however, overlook further dimensions, such as commercialization (Jackson, 2013). The publication of opinion pieces on these issues helps raise awareness and address some of this deficit.

From a methodological standpoint, the field of statistics is often invoked as a remedy for dealing with evaluative questions, and many discussants tend to emphasize its traditional facet concerned with data processing. The case for statistics is more general, however, because it is a discipline with an additional characterizing feature: reasoning coherently in the face of uncertainty (known in this context as *forensic inference*), using probability theory. Indeed, the existing literature abounds in rigorous and coherent approaches to intricate evaluative questions (Biedermann, 2013; Juchli, 2013) of the kind that are also encountered in connection with forensic DNA. It is with some frustration, however, that we note that discussions surrounding evaluative questions, using probability, are still fraught with problems that have been debated for a very long time. Prior probabilities are one example of this (Thompson et al., 2013).

In summary, the contributions in this Research Topic convince us that the extension of technical frontiers should also be accompanied by conceptual developments and understandings. Indeed, during personal discussions with the Topic Editors, one reviewer (Sheila Willis, Eolaíocht Fhóiréinseach Éireann, Forensic Science Laboratory, Ireland) raised cultural understandings as a further relevant factor: "I think the problem is much deeper. The use of matching DNA as a heuristic for a definite link between person and place is embedded in the minds of scientists as well as jurors in spite of the scholarship to the contrary. The discriminating power of DNA has had a paradoxical effect in the development of forensic science. On one hand it prompted forensic science to be valued and used in a very widespread manner but on the other hand it promoted the commodisation of forensic science with the belief that the test result is all-important and the context irrelevant. This latter view prompts the approach that the test can be produced anywhere and loses sight for the need of the very evaluation (*...*). It is vital that we address this. It is mixed with the commercialization issues but to focus too much on that aspect is to ignore the wider issues that also need to be addressed by: the publication of high profile cases where this approach has unfortunate consequences; increased education; critical mass of scientific opinion in favor of the approach argued for (*...*)."

We cannot but agree and hope that the collected papers in this Research Topic will be of interest to both scientists and other participants in the legal process. We thank all contributors and distinguished reviewers for their efforts to make this original collection timely and highly useful.

# **ACKNOWLEDGMENTS**

Alex Biedermann was supported by the Swiss National Science Foundation (grant IZ32Z0\_143013). Joëlle Vuille was supported by the Swiss National Science Foundation (grants PBLAP1-136958, PBLAP1-145850).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 17 March 2014; accepted: 25 April 2014; published online: 15 May 2014. Citation: Biedermann A, Vuille J and Taroni F (2014) DNA, statistics and the law: a cross-disciplinary approach to forensic inference. Front. Genet. 5:136. doi: 10.3389/fgene.2014.00136*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Biedermann, Vuille and Taroni. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# DNA and the law in Italy: the experience of "the Perugia case"

# *Carla Vecchiotti\* and Silvia Zoppis*

*Laboratory of Forensic Genetics, Section of Legal Medicine, Department of Anatomy, Histology, Forensic Medicine and Orthopaedics, University of Rome "Sapienza", Rome, Italy*

*\*Correspondence: carla.vecchiotti@uniroma1.it*

#### *Edited by:*

*Alex Biedermann, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Joelle Vuille, University of California, Irvine, USA David Balding, University College London, UK*

**Keywords: genetic analyses, DNA testing, forensic casework, reliability, interpretation**

Today DNA analyses represent a method of exceptional importance for the resolution of judicial cases. On the one hand, they allow courts to secure criminal convictions, while on the other hand they can help exonerate innocent suspects. Unfortunately, DNA analyses are often considered an unbeatable and infallible method to discover the truth, with the consequence that judges feel forced either to "bow to science" or to reject the genetic evidence entirely when it is considered too complex. In reality, genetic investigations have limits that must always be considered and properly explained to the fact-finder by the forensic geneticist. Courts need to know what results were observed and how likely it is to observe such results under both the prosecution and defense hypotheses. This may be particularly challenging for low quantity, degraded or mixed genetic material, and is further complicated by the need to take into account the potential for (laboratory) error. Despite such circumstances, the evidence can still be informative although its probative value may be reduced.
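The idea of comparing how likely the results are under the prosecution and defense hypotheses, while allowing for laboratory error, can be illustrated with a small sketch. It is not taken from the article: the function name, the parameters `rmp` (random match probability) and `fpp` (false-positive probability), and the simplifying assumptions stated in the docstring are all illustrative choices.

```python
def likelihood_ratio(rmp: float, fpp: float = 0.0) -> float:
    """Likelihood ratio for a reported correspondence between trace and person.

    Illustrative assumptions (not from the article):
      - under the prosecution hypothesis (the defendant is the source),
        a correspondence is always reported;
      - under the defense hypothesis (an unrelated person is the source),
        a correspondence is reported either by coincidence (probability rmp)
        or, failing that, through a false-positive error (probability fpp).
    """
    if not (0.0 <= rmp <= 1.0 and 0.0 <= fpp <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    p_report_given_prosecution = 1.0
    p_report_given_defense = rmp + (1.0 - rmp) * fpp
    if p_report_given_defense == 0.0:
        raise ValueError("the result is impossible under the defense hypothesis")
    return p_report_given_prosecution / p_report_given_defense


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the effect of the error term.
    print(likelihood_ratio(rmp=1e-9))            # no error considered
    print(likelihood_ratio(rmp=1e-9, fpp=1e-3))  # error term dominates
```

Under these assumptions, even a small false-positive probability caps the likelihood ratio at roughly its reciprocal, which is one way to see why the probative value "may be reduced" without vanishing.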

The murder of British student Meredith Kercher in Perugia (Italy) in 2007 and the case that ensued have highlighted the limits of genetic analyses. Throughout Italy, this case has caused an intense scientific and (through the media) popular debate on the correct application of internationally recommended protocols and procedures as a preliminary quality and reliability guarantee for results presented in court. Particular attention has been drawn to the interpretation of genetic profiles derived from Low Template (LT) or Low Copy Number (LCN) DNA and mixed samples. The two defendants, Amanda Knox and Raffaele Sollecito, were convicted after the first trial but then acquitted on appeal in 2011. The Italian Supreme Court overturned the acquittal in 2013, and a new trial will be held soon.

The Appellate Court experts (author Carla Vecchiotti was one of the two experts who reviewed the case for the Court of Appeal) were asked to repeat, if possible, the genetic analyses carried out during the initial investigation on certain items and whose results led to the conviction of the two defendants: a knife, considered by the prosecution to be the murder weapon, and a bra clasp belonging to the victim. If a repetition of the analyses was impossible due to insufficient biological material, the experts were asked to examine the technical report drawn up by the scientific police in the course of the first trial. According to this document, the scientists had observed DNA profiles corresponding to the victim on the knife blade, to the defendant Amanda Knox on the knife handle and to the defendant Raffaele Sollecito on the bra clasp. The report also concluded that the correspondences between the traces and the various people involved meant that these people were the source of the DNA in question.

As for the knife, collected from the inside of a drawer in Sollecito's kitchen, the Appellate Court experts found neither traces of blood nor the presence of cellular material on the blade. The quantification analysis performed on the material collected from the blade provided a value of 5 pg/μl in just one sample, a result far below the value recommended in the technical protocols of the new generation commercial kits for STR analysis (i.e., 0.25–0.5 ng of template DNA in the PCR reaction in a maximum input volume of 17.5 μl for the PowerPlex® ESI 17 and ESX 17 System; 1 ng of template DNA in the PCR reaction in a maximum input volume of 10 μl for the AmpFlSTR® NGM SElect™ PCR Amplification Kit). Since the amount of extracted DNA would not allow the required repetition of amplification, the Appellate Court experts decided not to proceed with the genetic analyses on the swabs taken from the knife (Butler and Hill, 2010). As for the bra clasp, it was recovered and collected from the crime scene floor 46 days after the murder. It could not be analyzed by the Appellate Court experts as it had been stored by the scientific police in a tube containing extraction buffer, which had left it completely rusted. Consequently, the Court experts proceeded to examine the abovementioned technical report in order to evaluate the results obtained from the analysis of the two items.
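The gap between the measured concentration and the kit recommendations can be made concrete with a short calculation. This is a minimal sketch using only the figures quoted above; it assumes, for illustration, that the whole maximum input volume of a single reaction could be filled with extract at 5 pg/μl, and the names used are hypothetical.

```python
PG_PER_NG = 1000.0


def max_template_ng(conc_pg_per_ul: float, max_volume_ul: float) -> float:
    """Template DNA (ng) in one reaction filled to its maximum input volume."""
    return conc_pg_per_ul * max_volume_ul / PG_PER_NG


# Maximum input volume and lowest recommended template amount, as quoted in the text.
KITS = {
    "PowerPlex ESI 17 / ESX 17": {"max_volume_ul": 17.5, "min_recommended_ng": 0.25},
    "AmpFlSTR NGM SElect": {"max_volume_ul": 10.0, "min_recommended_ng": 1.0},
}

if __name__ == "__main__":
    measured_pg_per_ul = 5.0  # value reported for the single quantifiable sample
    for name, kit in KITS.items():
        available = max_template_ng(measured_pg_per_ul, kit["max_volume_ul"])
        fraction = available / kit["min_recommended_ng"]
        print(f"{name}: {available:.4f} ng available, "
              f"{fraction:.0%} of the recommended minimum")
```

Under this assumption a single reaction would contain at most 87.5 pg (17.5 μl) or 50 pg (10 μl) of template, well short of the recommended 0.25–0.5 ng and 1 ng respectively, before any replicate amplification is even considered.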

The knife was examined first. According to the technical report, the two samples of interest were sample A, taken from the handle, and sample B, taken from the blade. Regarding the nature of the recovered material, there was no scientifically conclusive evidence to support the possible blood nature of the sample taken from the blade (sample B) in that both the generic blood test and the human species test were negative. The conclusion that exfoliated cells were present on the sample taken from the handle (sample A) was equally lacking in scientific basis. No reliable method for quantifying the DNA was employed, and the quantification performed with the Qubit Fluorimeter™ gave the result "too low" for sample B (knife blade), indicating a DNA amount below the sensitivity threshold of the Fluorimeter (200 pg/μl); the sample was therefore presumably an LT-DNA sample. In relation to the same sample B (knife blade), the electrophoretic graph showed peaks far below the 50 RFU threshold and allele imbalance (Hb = ϕa/ϕb > 0.60) for most of the alleles, thus indicating an LT-LCN sample. Yet, none of the recommendations issued by the international scientific community and aimed at obtaining scientifically reliable results when treating this challenging kind of sample were followed. Replicate analyses could have been performed at the time, although experts' views on how to analyze LT-DNA have been evolving since then. The main issue with that type of sample is contamination: consequently, strict protocols must be applied during the inspection, collection, and sampling of such items at the crime scene (Giardina et al., 2011). The procedures recommended to reduce laboratory contamination are equally rigorous as it is well-known that contaminant DNA at low levels may derive from reagents and other laboratory consumables, from the technical staff and from cross-contamination from sample to sample. Indeed, in the context of the Kercher murder case, transfer of a suspect's DNA into a crime scene sample was of particular importance: in fact, it appears that crime scene inspection procedures intended to minimize contamination were not carried out according to international protocols (Fischer, 2003; Laboratory Division of the Federal Bureau of Investigation, 2007; ICPO-Interpol, 2009). Furthermore, it seems that no attempts were made to discover such events.
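The two indicators mentioned above, peak height relative to the 50 RFU threshold and heterozygote balance Hb, can be sketched as follows. This is an illustration only: it assumes one common convention in which Hb is the smaller peak height divided by the larger and values of about 0.60 or more are read as balanced. The class and function names, and the example peak heights, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeterozygotePair:
    """Peak heights (RFU) of the two alleles observed at one heterozygous locus."""
    height_a_rfu: float
    height_b_rfu: float


def heterozygote_balance(pair: HeterozygotePair) -> float:
    """Hb as smaller peak height over larger peak height (assumed convention)."""
    low, high = sorted((pair.height_a_rfu, pair.height_b_rfu))
    if low < 0 or high <= 0:
        raise ValueError("peak heights must be positive")
    return low / high


def low_template_flags(pair: HeterozygotePair,
                       rfu_threshold: float = 50.0,
                       hb_threshold: float = 0.60) -> dict:
    """Flag a locus whose peaks are low and/or imbalanced."""
    return {
        "below_rfu_threshold": min(pair.height_a_rfu, pair.height_b_rfu) < rfu_threshold,
        "imbalanced": heterozygote_balance(pair) < hb_threshold,
    }


if __name__ == "__main__":
    # Made-up peak heights, not data from the case.
    print(low_template_flags(HeterozygotePair(30.0, 90.0)))
    print(low_template_flags(HeterozygotePair(400.0, 500.0)))
```

A locus flagged on both counts is the kind of observation that, according to the text, points to a low-template sample and calls for the more cautious interpretation protocols.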

As for the bra clasp, regarding the nature of the material recovered, there was no scientific evidence supporting the notion that flaking cells were present in the sample. The hypothesis formulated by the scientific police technical consultant about the nature of the material collected from the clasp is thus arbitrary, since it was not supported by any actual findings. After examining the electropherograms obtained from the autosomal STR analyses, the Appellate Court experts were able to assert that, for the markers D8S1179, D21S11, D19S433, D5S818, allelic peaks were interpreted in a manner that did not conform to the recommendations made in current literature/practice. In particular, peaks were treated as stutters even though their heights were above 50 RFU (D19S433), they exceeded the threshold of 15% of the major allele (D8S1179, D21S11, D5S818), or they were not in a stutter position (D5S818); they should thus have been considered to be alleles (Gill et al., 2006). The DNA extracted from the bra clasp thus indicates the presence of several minor contributors, which was not disclosed by the scientific police. The electropherograms obtained from the Y-STR analysis also showed (besides the peaks designated as alleles in the technical report of the scientific police) the presence of additional peaks with heights exceeding the threshold of 50 RFU (**Table 1**). Despite not being in a stutter position, they were not taken into consideration. Instead, the reporting was limited to what was in agreement with the observations on the electropherograms of the autosomal STRs. The genetic profile was thus derived from a mixture of unidentified biological substances, whose larger component corresponded to the profile of the victim and whose smaller components suggest the contribution of several male sources. Defendant Raffaele Sollecito showed a profile that was compatible with the profiling results for the trace found on the bra clasp. However, considering the particular circumstances under which the item was recovered and collected, it could not be ruled out that the results obtained from the analysis of the bra clasp derived from environmental contamination and/or contamination in some phase of the collection and/or handling of the item.
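The criteria invoked above for separating stutter from allelic peaks can be written as a simple rule. The sketch below is a simplified reading of the text, not the experts' actual procedure: it assumes that "stutter position" means one repeat unit shorter than the parent allele, and the function name, parameter names, and example values are hypothetical.

```python
def classify_peak(height_rfu: float,
                  parent_height_rfu: float,
                  repeats_from_parent: int,
                  rfu_threshold: float = 50.0,
                  stutter_ratio_max: float = 0.15) -> str:
    """Classify a minor peak next to a major (parent) allele.

    A peak is set aside as possible stutter only if it sits in a stutter
    position (assumed here: exactly one repeat shorter than the parent)
    AND its height does not exceed 15% of the parent peak. Any other peak
    at or above the RFU threshold is treated as an allele.
    """
    if height_rfu < rfu_threshold:
        return "below threshold"
    in_stutter_position = repeats_from_parent == -1
    if in_stutter_position and height_rfu <= stutter_ratio_max * parent_height_rfu:
        return "possible stutter"
    return "allele"


if __name__ == "__main__":
    # Made-up peaks next to a 1000 RFU parent allele.
    print(classify_peak(100.0, 1000.0, -1))  # in position, 10% of parent
    print(classify_peak(200.0, 1000.0, -1))  # in position, 20% of parent
    print(classify_peak(100.0, 1000.0, -3))  # not in a stutter position
```

Writing the rule down this way makes the experts' objection easy to state: a peak that fails either condition cannot be dismissed as stutter and points to an additional contributor.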

In conclusion, it is important to highlight some relevant issues concerning the interpretation of genetic profiles obtained from LT-LCN DNA and mixed samples. First of all, interpretation of a profile obtained for a particular item, that is, deciding which electrophoretic peaks are allelic and which are stutter or other artifacts, must be done without reference to the suspect's profile: it is the only way to minimize the risks of bias in the interpretation of the profile derived from the evidentiary sample. Interpreting a profile derived from a sample with the suspect's reference profile at hand conflicts with the principles of scientific integrity, balance, and coherence that should underlie the practice of forensic science (Budowle et al., 2009; Thompson, 2009). It is also clear that the weight of the evidence is a fundamental issue (Gill and Buckleton, 2010), as widespread public opinion holds that if DNA found at the crime scene matches the suspect, then he must be guilty of the crime. This logically wrong understanding unfortunately also extends to a considerable number of scientists, judges, and lawyers. In fact, there is a perception that failure to convict implies a failure of science. Such a view is extremely dangerous and it is therefore important to defend the idea that whether or not a suspect is convicted is an irrelevant question for the scientist, whose responsibility must only be to correctly explain the evidence in the context of the specific case. The question of how DNA corresponding to the suspect was transferred onto an item must therefore be assessed by the judge and not by the scientist, whose role is limited to presenting the various ways in which transfer can happen and the strength of support for each of the various scenarios (Gill and Buckleton, 2010).

In Italy, the Kercher case has defined a new way of conceiving of and addressing the scientific evidence in the context of a criminal trial (Montagna, 2012): the scientific and, subsequently, legal quality of the investigations performed at the crime scene depends on compliance with internationally standardized procedures. There is now a better awareness of the importance of following correct crime scene procedures in order to minimize the risk of contamination and, subsequently, the loss of reliability of any results obtained. Another element that has emerged during this debate is the increased awareness, in the international scientific community, of the need to develop structured reasoning models. These should assist in the evaluation of propositions according to which the suspect is or is not one of the persons who contributed to a particular mixed biological trace, in particular in the context of LT-LCN (including additional phenomena such as drop-in, drop-out, etc.). Finally, it is worth recalling a key principle of the Italian criminal justice system, the presumption of innocence: a defendant can only be declared guilty if the prosecution proves beyond any reasonable doubt that he committed the crimes for which he is being prosecuted. If a single doubt remains, even the slightest, the defendant must be acquitted. Judges who convict in the absence of strong, unambiguous and consistent evidence violate the law (Grosso, 2011).

**Table 1 | Summary of the similarities and differences between the conclusions drawn by the technical consultant of the prosecution (column two) and the Appellate Court experts (column three) regarding the electropherograms obtained from the Y-STR analysis performed on the bra clasp by the scientific police.**

*In the brackets in column four, the symbol* ↑ *and the following number indicate the peak height in Relative Fluorescence Units (RFU). As for the markers DYS393, DYS437, and DYS438, the height ratio (percent) of the observed alleles is also reported.*

# **REFERENCES**


Fischer, B. A. J. (2003). *Techniques of Crime Scene Investigation.* Boca Raton, FL: CRC Press.


*Received: 12 July 2013; accepted: 23 August 2013; published online: 12 September 2013.*

*Citation: Vecchiotti C and Zoppis S (2013) DNA and the law in Italy: the experience of "the Perugia case". Front. Genet. 4:177. doi: 10.3389/fgene.2013.00177*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Vecchiotti and Zoppis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Is human DNA enough?—potential for bacterial DNA

# *Sarah L. Leake\**

*School of Criminal Justice, Institut de Police Scientifique, University of Lausanne, Lausanne, Switzerland*

*\*Correspondence: sarahlouise.leake@unil.ch*

#### *Edited by:*

*Joelle Vuille, University of California, Irvine, USA*

#### *Reviewed by:*

*Lipi Acharya, Dow AgroSciences, USA Joelle Vuille, University of California, Irvine, USA*

**Keywords: forensic, saliva, microbiome, interpretation, human identification**

Human identification has played an important role in forensic science for the past two decades and it will continue to do so. However, there are certain types of traces, for example those with low quality and low quantity of DNA, often associated with violent crimes, which cannot always be satisfactorily exploited by current techniques. So what is next? Do we try to push these techniques beyond their limit or do we look to something else? I propose turning to a new source of information: bacterial DNA. I do not suggest bacterial DNA analysis will replace standard DNA typing, but it would be a complementary technique for when the latter provides only limited information (Leake, 2012).

Since the 1980s, there has been a considerable increase in the capacity of human DNA analyses to contribute to the process of individualization. With advances in technology, two breakthroughs in the late 1980s to early 1990s changed the techniques used for DNA analysis. The first was a new marker for DNA analysis, the microsatellite or Short Tandem Repeat (Jeffreys et al., 1985). The second was a new method of visualization based on fluorescent labeling which, when combined with PCR, increased the sensitivity of the technique, enabling low quantities of DNA to be analyzed (Frégeau and Fourney, 1993). Improved sensitivity extends the methods to traces with low template level material and traces containing degraded DNA. A number of techniques exist to help exploit such traces: Y-STRs, mini-STRs, and mitochondrial DNA (mtDNA). The first two exploit nuclear DNA and so are subject to the same constraints as standard DNA typing. mtDNA is found in much higher quantities than nuclear DNA and is thus well adapted for analyzing degraded DNA. However, it differs from nuclear DNA in that results are less informative for a particular person; instead, they typically characterize a maternal lineage. Therefore, it is of continuing interest to think about novel ways to exploit forensic samples to complement current methods. I propose the analysis of parts of the human microbiome, in particular that of saliva. This will be accompanied by challenges in interpretation, such as the combination of evidence (i.e., standard DNA typing with results of microbiome analyses), which thus represents a field that should receive further attention (Juchli et al., 2012).

What is the so-called human microbiome? In brief, the human microbiome describes all the microbiomes found within and across the human body (Turnbaugh et al., 2007). Each distinct area of the human body (for example, the oral cavity, forearm, hand, and gut) has its own microbiome. Each microbiome consists of different combinations of bacteria with, in theory, each person having a slightly different ratio or combination of bacteria at each site. Fierer et al. (2010) investigated the use of bacteria for human identification, concentrating on the potential of analyzing skin bacterial communities. They suggested that the bacteria left behind after touching a surface could be traced back to their source. The analysis of the whole salivary microbiome has not yet been applied to forensic science. However, Kennedy et al. (2012) investigated the microbial analysis of bite marks, specifically streptococcal DNA, in order to compare bacteria in the bite mark to those of a potential source. They concluded that this was a feasible comparative analysis and that the results could also provide valuable information when the perpetrator's DNA cannot be recovered. Saliva is commonly found at crime scenes and is often transferred from the perpetrator to the victim, especially in sexual assault cases. Owing to a number of factors, including environmental conditions, poor DNA transfer, and the masking of a minor contributor by the major contributor, human DNA analysis does not always work, which demonstrates the need for an alternative technique. One of the major advantages of bacteria is that they are more resistant to environmental factors than human DNA and so could persist longer on a surface. Another potential advantage concerns mixtures. Human DNA is the same regardless of where it comes from, i.e., skin or saliva, and this can cause problems when analyzing mixtures. By contrast, the bacteria found in saliva differ from the bacteria found on skin (Costello et al., 2009). Thus, it is reasonable to think that it could be possible to extract the salivary microbiome profile of one person from the skin microbiome profile of another. However, if a mixture was formed from the same trace type, then mixture analysis will clearly increase the complexity of the evaluative task.

A combination of PCR and high-throughput sequencing is used to analyze these types of traces. Specifically, a target sequence is chosen which can, after analysis, be used to distinguish as many bacterial taxa as possible. The most commonly used target is 16S rRNA; however, a combination of targets may produce more detail and hence a more accurate picture of the microbiome. After the sequences have been quality filtered and then clustered together, the final dataset takes the form of a table containing the taxon name and the abundance of each bacterial species for each trace or target (if more than one target is analyzed). This table can then be used for all downstream analysis and interpretation. One drawback of high-throughput sequencing is the number of errors. Unlike standard DNA typing, which uses one round of PCR followed by capillary electrophoresis, high-throughput sequencing uses two rounds of PCR, one to amplify specific targets and one during the sequencing process. To try to overcome this, a certain number of sequences are removed according to a chosen threshold when the data are quality filtered. The questions then posed are: what threshold should be chosen to remove as many erroneous sequences as possible without impeding downstream analysis, and how should this be incorporated into data interpretation?
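The thresholding step described above can be illustrated with a minimal sketch. It assumes the abundance table is held as a nested dictionary (trace name, then taxon name, then sequence count) and that the threshold is a simple minimum count per taxon; the taxon labels, counts, and function names are invented for illustration and do not come from any particular sequencing pipeline.

```python
from typing import Dict

# Hypothetical layout of the abundance table: trace -> {taxon: sequence count}.
AbundanceTable = Dict[str, Dict[str, int]]


def filter_rare_taxa(table: AbundanceTable, min_count: int) -> AbundanceTable:
    """Remove, within each trace, every taxon represented by fewer than
    min_count sequences (the 'rare microbiome')."""
    return {
        trace: {taxon: n for taxon, n in counts.items() if n >= min_count}
        for trace, counts in table.items()
    }


def retained_fraction(before: AbundanceTable, after: AbundanceTable) -> float:
    """Fraction of all sequences that survive the filter."""
    total_before = sum(sum(counts.values()) for counts in before.values())
    total_after = sum(sum(counts.values()) for counts in after.values())
    return total_after / total_before if total_before else 0.0


# Made-up counts, for illustration only.
table = {
    "trace_X": {"Streptococcus": 950, "Veillonella": 45, "taxon_a": 3, "taxon_b": 2},
    "reference_A": {"Streptococcus": 900, "Veillonella": 98, "taxon_c": 2},
}

filtered = filter_rare_taxa(table, min_count=5)
kept = retained_fraction(table, filtered)
```

Raising `min_count` removes more potential errors but also more genuine rare taxa, which is the trade-off the questions above refer to; the sketch does not suggest what value would be appropriate.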

The interpretation of microbiome data for the purpose of forensic science has not yet been addressed. Forensic science differs from most other sciences in that the final results have to be presentable to a court and therefore understandable to lay people. This is where inference and statistics come into play. For standard DNA typing, current practice focuses on a likelihood ratio (LR) assignment based mainly on allele proportions for the relevant population. This is used when the court is interested in discriminating between hypotheses relating to the source of the recovered stain. The allele proportions are calculated by analyzing a certain number of people from the relevant population. These population-specific data enable an acceptably accurate measure of the rarity of a DNA profile. Behind these allele proportions is a well-understood model of inheritance, which forms the backbone of all calculations. Furthermore, to make this measure as independent as possible, all the STRs used are either on different chromosomes or so far apart that linkage is very unlikely. With microbiome data this is more difficult to achieve. Over 700 bacterial species have been found in the mouth (Parahitiyawa et al., 2010) and it is inevitable that some of these species will be co-dependent (Lamont and Jenkinson, 2010). The question then becomes: how is one to account for such data to determine a probabilistic measure to discriminate between the hypotheses of interest? If it is possible to characterize the rarity of a microbiome profile using, for example, the presence/absence of species, then a method similar to that used for standard DNA typing could be employed. However, this would involve analyzing a large number of samples to get accurate figures for the proportions of bacteria in the relevant populations. With the current costs of high-throughput sequencing this is not a feasible option. As the cost of analysis decreases, more samples can be analyzed for less and this technique may become more viable. There has been an increased interest in microbiome analysis in dentistry (Aas et al., 2005), so in the future it might be possible that everybody will have their oral microbiome analyzed for such a purpose and hence more accurate population proportions for species could be obtained.
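To make the role of independence concrete, the allele-proportion approach used for standard DNA typing can be sketched as follows. This is a simplified illustration, not a casework procedure: it assumes Hardy-Weinberg genotype proportions, independence between loci (the product rule), and a probability of 1 for the findings if the person of interest is the source. The allele proportions are invented.

```python
from math import prod
from typing import List, Tuple

# One locus: (proportion of allele 1, proportion of allele 2, homozygous?).
# All values below are made up for illustration.
Locus = Tuple[float, float, bool]


def genotype_proportion(p1: float, p2: float, homozygous: bool) -> float:
    """Hardy-Weinberg genotype proportion: p^2 for a homozygote, 2pq otherwise."""
    return p1 * p1 if homozygous else 2.0 * p1 * p2


def profile_proportion(loci: List[Locus]) -> float:
    """Product rule across loci; valid only if the loci are independent."""
    return prod(genotype_proportion(*locus) for locus in loci)


def likelihood_ratio(loci: List[Locus]) -> float:
    """LR for a matching profile, assuming the match is certain if the
    person of interest is the source."""
    return 1.0 / profile_proportion(loci)


profile = [(0.10, 0.20, False), (0.05, 0.05, True), (0.25, 0.10, False)]
rarity = profile_proportion(profile)   # 0.04 * 0.0025 * 0.05
lr = likelihood_ratio(profile)
```

The multiplication in `profile_proportion` is the step that co-dependent bacterial species would undermine: if the presence of one species changes the probability of another, their proportions cannot simply be multiplied.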

A second approach could focus on data from populations and their use for classification to support relatedness to a given cluster. In this context, a question of interest is whether a given trace, say X, fits into either cluster A or cluster B, for example. It then becomes an issue for the scientist to give a value to such an association. The intra- and inter-variability of microbiomes (i.e., the variation for a given person and the variation between different people) play a fundamental role in this task. Previous studies have shown that for both skin and saliva bacterial communities intra-variability is smaller than inter-variability (Fierer et al., 2010; Lazarevic et al., 2010). Therefore, it should be possible to cluster together samples taken from the same person, and to support a distinction with respect to samples taken from a different person. However, it also appears relevant to extend research to additional factors, such as diet, antibiotic use, and smoking habit, because these factors can affect microbiome composition.
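As a rough sketch of the clustering idea, the following compares a trace X with reference samples from two people using the Bray-Curtis dissimilarity. The choice of Bray-Curtis is an assumption made here for illustration (the text does not name a distance measure), and the taxon labels and counts are invented.

```python
from typing import Dict, List

Sample = Dict[str, int]  # taxon -> sequence count (made-up data below)


def bray_curtis(a: Sample, b: Sample) -> float:
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i).
    0 means identical composition, 1 means no shared taxa."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    denominator = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return numerator / denominator if denominator else 0.0


def mean_distance(sample: Sample, group: List[Sample]) -> float:
    """Average dissimilarity between one sample and a group of samples."""
    return sum(bray_curtis(sample, g) for g in group) / len(group)


# Two reference samples per person; person A's samples resemble each other
# more than they resemble person B's (intra- smaller than inter-variability).
person_a = [{"s1": 60, "s2": 30, "s3": 10}, {"s1": 55, "s2": 35, "s3": 10}]
person_b = [{"s1": 20, "s2": 20, "s4": 60}, {"s1": 25, "s2": 15, "s4": 60}]
trace_x = {"s1": 58, "s2": 32, "s3": 10}

d_a = mean_distance(trace_x, person_a)
d_b = mean_distance(trace_x, person_b)
closest = "A" if d_a < d_b else "B"
```

A distance of this kind only says which cluster the trace is nearer to. Turning it into a probabilistic measure of support would still require data on how such distances vary within and between people, which is the open question raised above.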

The challenges associated with this technique are 2-fold: the first relates to the stability of the saliva microbiome and the second to the sequencing method used. The saliva microbiome has been shown to be relatively stable over time (Costello et al., 2009); however, is relative stability good enough for forensic use? More research needs to be carried out to investigate the effect that additional factors have on both short-term and long-term microbiome stability. One could suppose that the effect of smoking, for example, would be continuous as long as the person smoked regularly and therefore would not affect the overall stability. However, for someone with a sporadic smoking habit the effect could be more pronounced. A recent study has shown that people who live together share certain bacteria with each other and their pet dogs (Song et al., 2013). Therefore, knowledge of a person's lifestyle would be very useful when interpreting data. However, these additional factors could also help to discriminate between two people with different lifestyles. For example, if a number of canine bacteria were found, this would indicate that the person has a pet dog, providing additional information to law enforcement agencies when searching for a suspect. As mentioned above, there are errors associated with the sequencing method used, mainly due to the two rounds of PCR. These errors principally affect the rare microbiome, i.e., rare bacteria that are represented by only a few sequences. Consequently, how can one differentiate rare bacteria from sequencing errors? For forensic purposes I think the best option is to be conservative and remove most of the rare microbiome, helping to ensure that as many errors as possible have been removed.

To implement this technique in real casework, the additional factors mentioned above need to be investigated and an evaluative framework developed. At the equipment level, with the advances in sequencing technologies and their rising popularity, bench-top high-throughput sequencing machines have been developed, making this technique more affordable and accessible. The development of a standard operating protocol would enable the exchange of data between laboratories and consequently a database could be built. Once the evaluative framework has been developed, this technique could start to be used for cases where all other options have been exhausted, potentially helping with human identification and/or lifestyle indicators.

In conclusion, microbial analysis of body sites could provide additional information where conventional human DNA analysis has failed. However, an appropriate evaluative framework needs to be established to interpret the resulting data. Due to the nature of the experiments, and the questions to be asked, it seems reasonable to suggest that current statistical inferential methods could provide the necessary frameworks of thinking to streamline the analysis route.

# **REFERENCES**


combination of items of evidence. *Law Prob. Risk* 11, 51–84. doi: 10.1093/lpr/mgr023


Cohabiting family members share microbiota with one another and with their dogs. *Elife* 2:e00458. doi: 10.7554/eLife.00458

Turnbaugh, P. J., Ley, R. E., Hamady, M., Fraser-Liggett, C. M., Knight, R., and Gordon, J. I. (2007). The human microbiome project. *Nature* 449, 804–810. doi: 10.1038/nature06244

*Received: 19 July 2013; accepted: 25 November 2013; published online: 13 December 2013.*

*Citation: Leake SL (2013) Is human DNA enough? potential for bacterial DNA. Front. Genet. 4:282. doi: 10.3389/fgene.2013.00282*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Leake. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

**OPINION ARTICLE** published: 28 October 2013 doi: 10.3389/fgene.2013.00220

# The role of prior probability in forensic assessments

#### *William C. Thompson<sup>1</sup>, Joëlle Vuille<sup>2</sup>\*, Alex Biedermann<sup>3,4</sup> and Franco Taroni<sup>3</sup>*

*<sup>1</sup> Department of Criminology, Law and Society and School of Law, University of California, Irvine, Irvine, CA, USA*

*<sup>2</sup> Department of Criminology, Law and Society, University of California, Irvine, Irvine, CA, USA*

*<sup>3</sup> Faculty of Law and Criminal Justice, School of Criminal Justice, Institute of Forensic Science, University of Lausanne, Lausanne, Switzerland*

*<sup>4</sup> Department of Economics, Università Ca' Foscari Venezia, Venice, Italy*

*\*Correspondence: jvuille@uci.edu*

#### *Edited by:*

*Qizhai Li, Chinese Academy of Sciences, China*

#### *Reviewed by:*

*Wenjun Xiong, Chinese Academy of Sciences, China*

**Keywords: DNA, Bayes Theorem, prior probability, expert testimony, forensic science**

As the importance of forensic science in the legal system has grown, debate has arisen about the way forensic scientists should characterize their findings in order to communicate most effectively with legal fact-finders. This article will focus on one aspect of that debate: the framing of conclusions involving elements of probability. In particular, we will examine the contentious issue of whether forensic scientists, when asked to provide evidence that will be used to evaluate various competing propositions about physical evidence, should consider the prior probabilities that those propositions are true. Disputes about this issue have arisen in a number of contexts and recent examples suggest that opinions still diverge (e.g., Budowle et al., 2011; Biedermann et al., 2012). In this comment, we will argue that a reasoned approach to this issue depends on the role that forensic scientists are expected to play in the legal system.

To illustrate the underlying issues, let us begin with a generic example. A forensic scientist is asked to perform DNA profiling analyses of blood found at a crime scene and to compare the result to the DNA profile of a defendant who is charged with the crime. The defendant's guilt or innocence will be determined by a jury. The jurors' decision will depend in part on *their* assessment of two propositions of interest—H1: that the defendant was the source of the blood; and H2: that someone else was the source of the blood. What should the forensic scientist tell the jurors about the results of the DNA analysis?

The jurors might want the expert to tell them definitively which hypothesis is true, or to give them particular values for the so-called source probabilities—saying, for example, that there is a 0.998 probability the defendant is the source of the blood and only a probability of 0.002 that someone else was the source. But there is no way for the forensic scientist to reach such conclusions based on the forensic findings alone. To assess source probabilities, the forensic scientist must also consider other evidence in the case.

Suppose, for example, that the expert found that the defendant and the blood from the crime scene share a set of genetic markers found in one person in 1 million in the relevant population. Without considering other evidence in the case, the expert might make statements about the conditional probability of finding these results under the two hypotheses of interest. For example, the expert might conclude that the shared genetic markers were virtually certain to be found under H1 (defendant was the source), but had only 1 chance in 1 million of being found under H2 (someone else was the source). Based on this assessment the expert might also provide to the jury a so-called likelihood ratio—saying, for example, that the DNA profiling results are 1 million times more probable if the defendant rather than some other person was the source of the blood. But a likelihood ratio is not the same thing as a source probability. The likelihood ratio reflects the relative probability of the findings under the relevant propositions, not the probability that the propositions are true.

The only coherent way to draw conclusions about source probabilities on the basis of forensic evidence is to apply Bayes' rule, which requires that one begins with an assignment of prior probabilities to the propositions of interest (e.g., Robertson and Vignaux, 1995; Finkelstein and Fairley, 1970). Bayes' rule specifies how one ought to combine prior probabilities with the results of a DNA profiling analysis in order to find the so-called posterior probabilities that the defendant is the source of the blood. But the Bayesian approach will only work if the expert can begin with a prior probability.
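The dependence of the source probability on the prior can be made explicit with a short sketch of Bayes' rule in odds form (posterior odds = prior odds × likelihood ratio). The likelihood ratio of 1 million is taken from the example above; the prior probabilities are arbitrary illustrative values, not recommendations, and the function name is hypothetical.

```python
def posterior_probability(prior_probability: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form for two mutually exclusive, exhaustive
    propositions H1 and H2: posterior odds = prior odds * likelihood ratio."""
    if not 0.0 < prior_probability < 1.0:
        raise ValueError("prior probability must lie strictly between 0 and 1")
    prior_odds = prior_probability / (1.0 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


LR = 1_000_000  # DNA findings 1 million times more probable under H1 than H2

# Same findings, different (purely illustrative) prior probabilities for H1.
posteriors = {
    prior: posterior_probability(prior, LR)
    for prior in (0.5, 1e-3, 1e-6, 1e-9)
}
```

With the same likelihood ratio, the posterior probability ranges from close to 1 (prior of 0.5) to about 0.001 (prior of one in a billion), which is why a likelihood ratio alone cannot be restated as a source probability.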

This brings us to the crux of the debate: whether forensic scientists should even try to specify prior probabilities and, if so, how. It is occasionally suggested that forensic scientists should *assume* equal prior probabilities. This is sometimes described as a position of neutrality and is often justified with references to vague accessory "principles," such as the "Principle of Indifference" or the "Principle of Maximum Entropy," borrowed from other disciplines and contexts (Biedermann et al., 2007).

A prominent illustration can be found in paternity cases. When DNA analysts are asked to assist in the assessment of whether a particular man is the father of a child, they usually analyze the profiles of the mother, child, and the accused man, and assign conditional probabilities that the genetic characteristics found in the child (Ec) would be observed under two relevant hypotheses, namely that the accused is the father (H1) and that some other man (from a particular reference population) is the father (H2), conditioned on the alleged parents' DNA profiles (Em and Eam, for the mother and the accused man, respectively). In some cases, the analysts limit themselves to reporting the ratio of these conditional probabilities, i.e., Pr(Ec|Em,Eam,H1)/Pr(Ec|Em,H2), which is a likelihood ratio (although it is also referred to as the paternity index). But quite often, analysts go further. They assume that the prior odds of H1 and H2 are equal and then, in accordance with Bayes' rule, they multiply the prior odds by the likelihood ratio (paternity index) to determine the posterior odds of paternity. Recall that odds are defined as a ratio between two probabilities; in this particular scenario, it is the ratio between Pr(H1) and Pr(H2). The posterior odds are typically restated as a probability. For example, if the DNA evidence supports paternity with a likelihood ratio of 1 million, some analysts would report a probability of 0.999999 that the accused is the father.
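The arithmetic behind the reported figure can be written out as follows, using the notation of the paragraph above. The symbol $\pi$ for the prior probability $\Pr(H_1)$ is introduced here only for illustration, and it is assumed that $\Pr(H_2) = 1 - \pi$.

$$
\frac{\Pr(H_1 \mid E_c, E_m, E_{am})}{\Pr(H_2 \mid E_c, E_m, E_{am})}
= \frac{\Pr(E_c \mid E_m, E_{am}, H_1)}{\Pr(E_c \mid E_m, H_2)} \times \frac{\Pr(H_1)}{\Pr(H_2)}
= 10^{6} \times \frac{0.5}{0.5} = 10^{6}
$$

$$
\Pr(H_1 \mid E_c, E_m, E_{am}) = \frac{10^{6}}{10^{6} + 1} \approx 0.999999
$$

More generally, for a likelihood ratio $LR$ and prior probability $\pi$:

$$
\Pr(H_1 \mid E_c, E_m, E_{am}) = \frac{LR \cdot \pi}{LR \cdot \pi + (1 - \pi)}
$$

The value 0.999999 therefore follows only when $\pi = 0.5$; any other prior yields a different "probability of paternity" from the same DNA findings.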

While this approach is commonly used in civil paternity cases, courts in the United States have generally not allowed analysts to characterize their findings in this manner when paternity tests are offered as evidence in criminal cases, e.g., to prove the defendant committed rape or incest by showing he fathered a particular child. The assumption of equal prior odds appears to conflict with the presumption of innocence to which defendants in criminal trials have traditionally been entitled. In the view of most commentators, assuming that the accused starts with a probability of guilt of 0.5 falls far short of presuming him innocent. More fundamentally, making *any* default assumption about the prior probability is seen as violating the obligation of the legal system to deliver individualized justice based on the facts of each case (the attentive reader might have noted that circumstantial information *I* was omitted from the above mathematical notation). Consider that an assumption of equal priors is applied regardless of any other evidence in the case: an accused man who offers proof that he is infertile due to azoospermia and was not on the same planet as the mother at the time of conception (i.e., an azoospermic cosmonaut) is treated the same as any other man. While the jury can take the other evidence into account, they may have difficulty integrating it with the "probability of paternity" delivered by the forensic expert, or they may mistakenly assume that the "probability of paternity" is all they need consider.

Another suggested approach is that forensic scientists take upon themselves the responsibility for assessing the prior probability of the relevant hypotheses before updating them based on the scientific findings in accordance with Bayes' rule. For example, in the context of missing person identification, commentators declared that "[t]he forensic DNA community needs to develop guidelines for objectively computing prior odds" (Budowle et al., 2011, p. 15). The major objection to this approach, in the context of a criminal trial, is that it may result in forensic scientists going beyond their scientific expertise and usurping the role of the fact-finder. In order to assign contextually meaningful prior probabilities, the expert would need to take into account all of the evidence in the case. But experts are rarely in a good position to evaluate the non-scientific evidence and have no business doing so. The legal system places the responsibility for evaluating the evidence in a case on the fact-finder, whether judge or jury, not the expert witness. Jurors are carefully chosen for the task, are often shielded by evidentiary rules from information that the legal system determines that they should not consider, and are carefully instructed on the presumptions to make and standards to apply in reaching a verdict; experts are not. Allowing expert witnesses to take into account prior odds when considering the probative value of a scientific observation also raises the danger of double-counting certain pieces of evidence (Thompson, 2011).

Consequently, many commentators have suggested that forensic experts have no role in assessing prior probabilities. Because posterior probabilities can only be arrived at by assessing prior probabilities, they argue that experts cannot legitimately make statements about posterior probabilities either. As Redmayne explains (2001, p. 46): *"(...) the expert should not testify in terms such as (...) 'the blood probably came from the defendant', because one can only reach conclusions of this sort by making assumptions about the strength of other evidence against the defendant."*

There may, however, be circumstances in which a forensic scientist could appropriately assign prior probabilities and use them as a basis for reaching other conclusions. One such circumstance arises when the expert is given the responsibility of making an overall evaluation of a case. For example, coroners are sometimes given full responsibility for determining the cause and manner of a death for legal purposes. (In jurisdictions of the Anglo-Saxon tradition, a coroner is a government official who investigates human deaths and makes independent determinations as to their time, manner, and cause. He should not be confused with the medical examiner, who merely provides information to a court in the course of criminal prosecution or civil litigation but has no judicial authority of his own.) In such cases, the expert should certainly take account of all relevant evidence, including both scientific and non-scientific factors. There is no danger of the expert usurping the fact-finder when the expert *is* the fact-finder. The matter becomes more complicated, however, when an expert who has made a determination in the role of fact-finder is subsequently asked to present evidence to another fact-finder, as when a coroner who has determined in an inquest that a death was due to homicide rather than suicide is asked to testify in a subsequent criminal trial. In such cases, the dangers of usurpation and double-counting of evidence discussed above may still loom large.

Whether forensic scientists should take account of the prior probability of the hypotheses they are asked to help evaluate is a complicated question. The answer depends on the role the forensic scientist will be playing in the legal system. If forensic scientists will make the ultimate determination, for legal purposes, with regard to a particular proposition of interest, then they should, and indeed must, consider their prior probabilities that the hypotheses are true. If, however, the truth of the hypotheses will be addressed by someone else (e.g., a judge or jury) and the forensic scientists' role is limited to providing expert assistance, then forensic scientists should generally confine themselves to assigning the conditional probability of the scientific findings under the given hypotheses of interest, and should leave to the legal decision maker the task of assessing prior and posterior probabilities.

# **ACKNOWLEDGMENTS**

William C. Thompson was supported by the UC Lab Fees Research Program. Joëlle Vuille was supported by the Swiss National Science Foundation (grants PBLAP1-136958, PBLAP1-145850). Alex Biedermann was supported by a Research Mobility Grant of the Société Académique Vaudoise.

# **REFERENCES**


use of prior odds for missing persons identifications. *Investig. Genet.* 3, 1–7. doi: 10.1186/2041-2223-3-2


*Received: 28 August 2013; accepted: 08 October 2013; published online: 28 October 2013.*

*Citation: Thompson WC, Vuille J, Biedermann A and Taroni F (2013) The role of prior probability in forensic assessments. Front. Genet. 4:220. doi: 10.3389/fgene.2013.00220*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Thompson, Vuille, Biedermann and Taroni. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The impact of commercialization on the evaluation of DNA evidence

#### *Graham Jackson<sup>1,2</sup>\**

*<sup>1</sup> Advance Forensic Science, St. Andrews, UK*

*<sup>2</sup> School of Science, Engineering and Technology, University of Abertay Dundee, Dundee, UK*

*\*Correspondence: graham@advanceforensicscience.com*

#### *Edited by:*

*Alex Biedermann, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Joëlle Vuille, University of California, Irvine, USA*

*Sheila Willis, Eolaíocht Fhóiréinseach Éireann, Forensic Science Laboratory, Ireland*

**Keywords: likelihood ratio, hierarchy of issues, evaluation, sub-source, source, activity**

# **BACKGROUND**

Guidance on designing cost-effective examinations and on interpretation of expert observations has been available since the late 1990s in the form of a model framework called Case Assessment and Interpretation (CAI) (Cook et al., 1998a,b; Jackson and Jones, 2009). The underlying principles of the guidance were subsequently encoded in a published standard written by the Association of Forensic Science Providers (AFSP, 2009) and have been incorporated in draft guidance from the European Network of Forensic Science Institutes (ENFSI). The guidance is predicated on a logical approach to the evaluation of evidence, requiring examiners to have an understanding and acceptance of the laws of probability. A key element in such an evaluation is the assessment of a likelihood ratio (LR) as the basis of providing logical, balanced, robust, expert opinion.

In addition to the use of an LR approach, a second notion, that of the hierarchy of issues, is a vital element of evaluation. The hierarchy of issues is a scheme that helps identify the case issue that the expert evidence is addressing and thereby clarifies the contribution that evidence is making to the judicial process (Cook et al., 1998b; Evett et al., 2000; Jackson et al., 2006). Using the hierarchy, case issues are classified as belonging to one of four levels—"offence," "activity," "source," or "sub-source" (Jackson, 2009).

This article discusses how commercialization of forensic services, whilst not posing any immediate threat to an evaluation of an LR for scientific evidence, does pose a risk of misleading evidence being adduced if the issue being addressed by the scientist is at too low a level in the hierarchy of issues.

# **THE HIERARCHY AND DNA EVIDENCE**

Most DNA cases are reported at, and therefore help the fact-finder (not the scientist) to address issues at, a source or sub-source level. If a DNA-profile from a questioned sample can be attributed with a high degree of confidence to a particular body-fluid stain or material, then the results of DNA-profiling help address the issue of the *source* of the *body-fluid*. However, if the DNA-profile cannot be attributed with confidence to a particular body-fluid, then the DNA results help address only the origin of the DNA, i.e., a *sub-source* issue. In a way analogous to Bayesian networks, consideration of sub-source and source level issues then feeds into, and informs, consideration of activity level issues and, ultimately, offence level issues.

In some cases, the probative force of matching DNA-profiles for sub-source and source level issues transfers directly, and largely unchanged, to the probative force at activity level, and possibly also to offence level. As an example, consider a case in which a defendant was being tried on a charge of rape. A DNA-profile had been obtained from semen-bearing vaginal swabs taken from the complainant within a few hours of the incident. The profile was found to match that of the defendant. He denied the allegation and declared that he did not know, and had never met, the complainant. Let us assume that the defense are not challenging the prosecution's contentions that:


In these circumstances, the probative force, in terms of an LR of the order of 1 billion provided by the matching DNA-profiles at sub-source level, translates unchanged to a probative force of 1 billion at offence level. Whilst the scientist should preferably be focusing on activity level, she could report the LR at sub-source, source or activity level and there would be little, if any, risk that the court would be misled about the probative force that the matching DNA-profiles provide at offence level, i.e., 1 billion.

Compare that case with one of a burglary in which a scarf was found at the scene. The occupants of the attacked property say that the scarf was not present when they left the premises, and it must therefore be a reasonable, but not certain, assumption that the scarf was left by the burglar(s). The scarf was in a dirty, well-worn condition and it bore one small bloodstain. No other blood was found at the scene. DNA-profiling of material cut from the bloodstain on the scarf gave a weak DNA-profile, which was subsequently found to match a suspect. He had no fixed address but shared various flats and "squats" with a number of vagrants and known criminals, often sharing items of clothing. He denied the burglary and said that he could not recall wearing a scarf like the one at the scene, but did say that he had occasionally worn scarves in the past, although he was not a habitual wearer. No other DNA-analyses were performed to see what other DNA-profiles were present or, indeed, whether the suspect's profile was also present on other, non-bloodstained areas of the scarf. Therefore, there is significant uncertainty as to whether the profile can be attributed to the bloodstain. Let us assume that the scientist in this case evaluated and reported the matching DNA-profiles at sub-source level, i.e., helping to address the sub-source issue of "from whom has the DNA originated?" Given a full, matching profile, the scientist reported the LR of a billion as providing "extremely strong support for a view that the DNA originated from the suspect rather than from an unknown, unrelated person." Without further explanation by the scientist, or guidance from the prosecution, this "value" at sub-source level could be taken by the court and applied erroneously to the "value" that the matching DNA-profiles provided in addressing the offence level issue of whether the suspect committed the burglary. If the scientist wanted to provide more effective, more balanced and robust help to the court, then she should be evaluating and reporting the matching DNA-profiles at activity level, as required by the AFSP standard and CAI principles.

In this last case, specifying an activity level issue would not be a trivial matter. There were no witnesses to the crime and therefore there are no clear activities that constitute the crime and which relate to the scarf. Perhaps the best that the scientist could offer would be to consider an issue of whether the suspect was a habitual wearer of the scarf. A pair of appropriate propositions based on the prosecution and defense positions, and conditioned on the relevant background circumstances of the case, could be defined along the lines of:

H*P*—The suspect is a habitual wearer of the scarf.

H*D*—The suspect is not a habitual wearer of the scarf; someone else is the habitual wearer.

Of course there are problems with defining what "habitual" means in terms of the length of time and the degree of contact that would be classified as "habitual," but let us assume that these variables had been defined broadly. There is also the issue of whether the scarf had been worn habitually by anyone at all. Again, let us assume it would be accepted that it had been worn in such a way. Given sufficient, reliable knowledge of transfer, persistence and detection of DNA-profiles, and of background levels of DNA-profiles, the scientist may be able to assign probabilities for her observations given the truth of the competing propositions. The observations should include not only the "match" of the profiles but also the quantity and distribution of DNA across the scarf. However, in this case, there is only the observation of a "match"; there is no information on the quantity or distribution of DNA-profiles across the scarf. The scientist is therefore unable to evaluate robustly an LR at this activity level and, in turn, the court does not have the expert help it requires in order to evaluate properly, at offence level, the DNA evidence that has been provided at sub-source level.

Evaluation at activity level of cases in which there is uncertainty on:


will inevitably mean that the LR provided by matching DNA-profiles at sub-source level, typically of the order of 1 billion, will be reduced, sometimes markedly, when that evidence is evaluated at activity and offence levels. Evett et al. (2002) provide examples of two such cases while Biedermann and Taroni (2011) provide a thorough analysis of the relationships and dependencies of the variables involved.

#### **PRACTICE IN ENGLAND AND WALES**

From anecdotal evidence, particularly from experts working on behalf of the defense, there appears to be a large number of cases, if not the majority of cases, reported at sub-source level with very powerful LRs. However, a significant number require more sophisticated appraisal at activity level in order that the court is not misled on the probative force of the matching DNA-profiles.

In the English and Welsh jurisdiction, police forces pay private companies for the provision of forensic science services. Essentially, under the terms of contracts between the police and the providers, an evaluation of an LR for activity level propositions is generally more costly than an evaluation at sub-source level. Even if the police or prosecution realize they need an evaluation at activity level, budgetary considerations may deter a request for such an evaluation. Furthermore, even though the AFSP standard requires the scientist to consider activity level, and to advise the customer of the importance of doing so, there is little evidence that providers are able, or willing, to follow that requirement. This may be because the police have submitted for analysis only a sample, such as a swab or piece of fabric, taken from a larger item, depriving the scientist of vital information on the quantity and distribution of DNA on that larger item that is necessary for evaluation at activity level.

Providing an evaluation of an LR only at sub-source or source level deprives the court of important information that, in some cases, has a direct bearing on the decision of whether the defendant is guilty.

Arguably, many defendants simply plead guilty in the face of expert reports that contain the acronym "DNA" and the figure "1 billion." This may be because they truly are guilty, or it may be because their lawyers advise them to do so. Or it may be because, while the defendant is innocent, neither the defendant nor his/her lawyer realizes that they can challenge this apparently overwhelming figure and that a proper appraisal of the evidence at a more appropriate level would result in much less powerful probative force.

### **DISCLAIMER**

The opinions expressed in this article are the personal views of the author. He does not represent any official organization.

### **REFERENCES**


and propositions. *Sci. Justice* 40, 3–10. doi: 10.1016/S1355-0306(00)71926-5


*Received: 01 September 2013; accepted: 17 October 2013; published online: 06 November 2013.*

*Citation: Jackson G (2013) The impact of commercialization on the evaluation of DNA evidence. Front. Genet. 4:227. doi: 10.3389/fgene.2013.00227*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Jackson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Understanding DNA results within the case context: importance of the alternative proposition

# *Louise McKenna\**

*Justice, Forensic Science Laboratory, Dublin, Ireland \*Correspondence: mckennal1@eircom.net*

#### *Edited by:*

*Alex Biedermann, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Tacha H. Champod, University of Lausanne, Switzerland*

**Keywords: evidence evaluation, alternative proposition, pre-case assessment, DNA, nearby defence**

# **INTRODUCTION**

There is a growing awareness of the dangers of reporting DNA profiling results in criminal investigations without consideration of the implications of the finding within the case context. A recent article in the New York Times (July 24 2013) described a robbery that resulted in the death of Raveesh Kumra. Foreign DNA on the victim's fingernails corresponded with the profile of a local man, Lukis Anderson, who was charged with murder. After he had spent 5 months in prison, it was found that Mr. Anderson could not have committed the crime as he was in hospital at the time of the robbery. This and other cases demonstrate that reporting the DNA profile results alone can be misleading. The investigators and courts may be impressed by the probative value of the DNA result in isolation and not think about other issues such as the possibility of secondary transfer. The Association of Forensic Science Practitioners, UK and Ireland (2009) attempted to address this problem for trace evidence through the introduction of Standards for the Formulation of Evaluative Forensic Science Expert Opinion. These standards require that the scientific finding is considered relative to two mutually exclusive propositions, one from the prosecution and one from the defence [based on the work of Evett et al. (2000)]. Within the case context, the probability of the evidence given the prosecution proposition divided by the probability of the evidence given the defence proposition produces the Likelihood Ratio (LR). The magnitude of the LR indicates the degree of support for one proposition vs. the other. This approach allows the scientist to help the court to understand the implications of the findings for the particular circumstances of each case.
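In symbols, writing $E$ for the scientific findings, $H_p$ and $H_d$ for the prosecution and defence propositions, and $I$ for the case context (these letters are notation chosen here for illustration, not taken from the standard), the ratio described above can be written as:

$$LR = \frac{\Pr(E \mid H_p, I)}{\Pr(E \mid H_d, I)}.$$

A value above 1 supports the prosecution proposition over the defence proposition, a value below 1 supports the defence proposition, and a value close to 1 means that the findings do little to distinguish between the two.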

Since the publication of the AFSP standard, EFE (Forensic Science Laboratory, Dublin, Ireland) has worked at applying the criteria to its casework. The following examples are drawn from EFE casework and illustrate how the alternative proposition can significantly affect the impact of the DNA finding.

# **ASSAULT CASE**

Mr. G was walking down a street in his home town when he was approached by three males. One of the men punched Mr. G, knocked him to the ground and kicked him a number of times in the head and chest area. Mr. G bled as a result of his injuries.

The police identified Mr. T as a suspect for this assault. They arrested Mr. T within 8 h of the incident and took his clothes and shoes. The laboratory found a single blood stain on Mr. T's jeans and the DNA profile corresponded with that of Mr. G. In the past, the EFE would have reported this observed correspondence as well as a probability assignment for the event that another unrelated person has the same DNA profile, in the order of less than one in a thousand million.

This is useful information if Mr. T claims that he had nothing to do with the incident and was not present when it occurred. But it could be misleading information if Mr. T has an explanation that results in a different alternative proposition.

In this case, Mr. T said he was one of the three men who approached Mr. G but he did not assault him. Mr. T says he ran away after the incident and never made contact with Mr. G. Therefore, the issue is whether Mr. T assaulted Mr. G or he was close by when the assault occurred. The appropriate propositions are:

Prosecution proposition: Mr. T punched and kicked Mr. G

Defence proposition: Mr. T was close by during the assault and someone else punched and kicked Mr. G

Prior to the examination of the clothes, the AFSP Standard requires scientists to consider their expectations for observing blood with a corresponding profile given these propositions. This is called the pre-case assessment. If Mr. T assaulted Mr. G, the scientist considers whether blood may or may not have transferred to Mr. T. For example, blood may not have transferred to Mr. T if the bleeding commenced after the assault ceased or if Mr. T's kicks did not make contact with the bloodstained area(s).

The scientist also considers the type of blood staining he/she would expect to observe given both propositions. For example, if Mr. T assaulted Mr. G, wet blood may have transferred to Mr. T's clothes or shoes as a result of contact. If Mr. T did not assault Mr. G but was nearby, then he is very unlikely to have made contact with a bloodstained surface, but airborne blood drops generated during the assault may have landed on his clothes. The trained scientist can usually distinguish contact from airborne stains as long as the stains are not so small that smeared airborne stains could be confused with contact stains.

Using his/her understanding of how blood transfers in assaults, the scientist assigns probabilities for the presence of contact and/or airborne blood stains on the suspect's clothes given the two propositions (**Table 1**). They are not precise but help the scientist to understand that most outcomes (no blood, airborne blood stains, small contact stains) provide little assistance in addressing the issue of whether Mr. T assaulted Mr. G or was close by during the assault. The exception is the presence of a large contact bloodstain or a number of small contact stains, which are unlikely if someone else assaulted Mr. G and Mr. T was close by. Therefore, it is worthwhile examining the clothes to see if this type of staining is present.

#### **Table 1 | Pre-case assessment.**

In this particular case, the single bloodstain found on Mr. T's jeans was an airborne stain. From the assigned probability for airborne blood stains and no contact stains (**Table 1**), the finding provides little assistance on the issue of whether Mr. T assaulted Mr. G or was standing nearby during the assault. However, if a large contact stain was found on Mr. T, then the reported conclusion would be that the finding provided moderately strong support for the proposition that Mr. T punched and kicked Mr. G rather than Mr. T was standing close by and someone else punched and kicked Mr. G.
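The mechanics of a pre-case assessment of this kind can be sketched in a few lines of Python. The probabilities below are hypothetical placeholders chosen only to reproduce the qualitative pattern described in the text (most outcomes uninformative, large or multiple contact stains informative); they are not the values from Table 1, and the outcome labels and function name are likewise assumptions made for illustration.

```python
# Hypothetical pre-case assessment. All probabilities are illustrative
# placeholders, NOT the values assigned in Table 1.
# Each entry: outcome -> (P(outcome | Mr. T assaulted Mr. G),
#                         P(outcome | Mr. T was close by, someone else assaulted))
OUTCOMES = {
    "no blood": (0.30, 0.45),
    "airborne stains only": (0.20, 0.30),
    "small contact stain": (0.20, 0.24),
    "large or multiple contact stains": (0.30, 0.01),
}


def likelihood_ratios(outcomes):
    """Return the LR (prosecution over defence) for every possible outcome."""
    total_hp = sum(p_hp for p_hp, _ in outcomes.values())
    total_hd = sum(p_hd for _, p_hd in outcomes.values())
    # The outcomes are meant to be exhaustive, so each column should sum to 1.
    if abs(total_hp - 1.0) > 1e-9 or abs(total_hd - 1.0) > 1e-9:
        raise ValueError("probabilities under each proposition must sum to 1")
    return {name: p_hp / p_hd for name, (p_hp, p_hd) in outcomes.items()}


lrs = likelihood_ratios(OUTCOMES)
for name, lr in lrs.items():
    print(f"{name:35s} LR = {lr:6.2f}")
```

With these placeholder values, the first three outcomes give ratios close to 1 and only the last gives a ratio well above 1. The point of doing the assessment before the examination is that the significance of each outcome is fixed in advance.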

The application of this type of logical reasoning demonstrates the importance of identifying the appropriate alternative. The previous practice of reporting the DNA result as a Conditional Profile Probability (CPP) without considering the case circumstances was at best unhelpful and could have been misleading. In fact, the CPP is of no value when the defence proposition allows for the presence of a corresponding profile. Another advantage is that the decision on the significance of different outcomes, before doing the examination, avoids the danger of *post-hoc* rationalization or bias on the part of the scientist.

### **FIREARM CASE**

Police frequently submit firearms for DNA analysis and comparison with suspects. Take for example the situation where Mr. H was shot while driving his car through his gateway. A witness observed the shooter running to the get-away car and noted the registration number. A short time later, the police found the car. There had been an unsuccessful attempt to set the car on fire. A gun found in the car was submitted to the laboratory. Following their enquiries, the police identified Mr. M as a suspect. Mr. M says he had nothing to do with the shooting or the gun in question. The police requested that the laboratory examine the gun for DNA and, if any was found, compare it with Mr. M's profile.

The laboratory got a single DNA profile from the gun and found that it did not correspond with Mr. M's profile. If they report this factually, will the police think that Mr. M should be excluded from their enquiries?

The propositions in this case are:

Prosecution proposition: Mr. M fired the gun.

Defence proposition: Mr. M had nothing to do with the gun; someone else fired it.

If Mr. M fired the gun, what is the probability of not finding his DNA ($t_0$) and finding somebody else's DNA as background ($b$)? Background DNA is defined as the interpretable DNA present on the gun that was not deposited by the person who last fired the gun.

If Mr. M did not fire the gun and someone else fired it, what is the probability of finding DNA different to Mr. M's? This is the probability that DNA transferred from the person who fired the gun ($t$) and there is no background present ($b_0$), or the probability that the DNA did not transfer from the person who fired the gun ($t_0$) and there is background DNA present ($b$). (For simplicity, the conditional probability of the non-matching DNA profile is omitted as this cancels out.)

$$LR = \frac{t\,b_{0} + t_{0}\,b}{t_{0}\,b}.$$

Polley et al. (2006) examined the transfer rates of DNA from shooter to gun and observed an association between the shooter and the DNA profile approximately 30% of the time. This is also supported by other studies on transfer of DNA following handling (Phipps and Petricevic, 2007). These studies suggest 0.3 as the probability for transfer of the shooter's DNA to the gun ($t$) and 0.7 for the probability that the shooter's DNA did not transfer to the gun ($t_0$).

Assigning a probability for the occurrence of background DNA is more difficult. DNA results from firearms in EFE show that no profile was obtained for 26% of firearms, mixed profiles in 35% and single profiles in 24% of firearms (and the DNA on the remainder could not be interpreted). It can then be deduced that there is interpretable background DNA on all the guns with mixed profiles and on some of the guns with single profiles. Therefore, the approximate range for background DNA on guns is between 0.35 and 0.6 (b).

We now see that the presence of DNA on the gun that does not correspond with Mr. M's profile is only slightly more likely if he did not fire the gun than if he did, suggesting that it would be unwise for the police to eliminate Mr. M from their enquiries on the basis of the DNA exclusion alone.
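The conclusion above can be checked with a few lines of arithmetic. The following Python sketch evaluates the ratio described in this section: the probability of the finding if someone else fired the gun (transfer with no background, or no transfer with background) divided by the probability of the finding if Mr. M fired it (no transfer with background). Written this way, values above 1 favour the defence proposition. The sketch uses the transfer probability of 0.3 and the background range of 0.35 to 0.6 given in the text. The function name, and the treatment of "no transfer" and "no background" as simple complements (1 - t and 1 - b), are assumptions made for illustration.

```python
def ratio_defence_over_prosecution(t: float, b: float) -> float:
    """P(non-matching profile | someone else fired) / P(same | Mr. M fired).

    t: probability that the shooter's DNA transfers to the gun
    b: probability that interpretable background DNA is present
    """
    t0 = 1.0 - t  # assumed complement: shooter's DNA did not transfer
    b0 = 1.0 - b  # assumed complement: no interpretable background DNA
    numerator = t * b0 + t0 * b   # someone else fired the gun
    denominator = t0 * b          # Mr. M fired the gun
    return numerator / denominator


T_TRANSFER = 0.3  # from the transfer studies cited in the text

for b in (0.35, 0.6):  # approximate range for background DNA on guns
    ratio = ratio_defence_over_prosecution(T_TRANSFER, b)
    print(f"b = {b:.2f}: ratio = {ratio:.2f}")
```

Under these assumptions the ratio lies between roughly 1.3 (b = 0.6) and 1.8 (b = 0.35), which is the sense in which the non-matching profile is only slightly more likely if Mr. M did not fire the gun.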

The example also illustrates that the frequency of background DNA on firearms, rather than the CPP, is the information required to assist the police investigation.

#### **CONCLUSION**

When forensic science laboratories limit their DNA statements to reports of matching or non-matching DNA, the investigator and courts are deprived of the scientist's understanding of body fluid transfer, DNA transfer, DNA persistence and presence as background. The Standards for the Formulation of Evaluative Forensic Science Expert Opinion give the scientist guidance on how to interpret his or her findings in order to better assist the investigator and the court.

### **ACKNOWLEDGMENTS**

Professor Christophe Champod for his assistance with evaluation of the nearby defence.

# **REFERENCES**


*Received: 01 September 2013; accepted: 25 October 2013; published online: 06 December 2013.*

*Citation: McKenna L (2013) Understanding DNA results within the case context: importance of the alternative proposition. Front. Genet. 4:242. doi: 10.3389/fgene.2013.00242*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 McKenna. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# DNA transfer: informed judgment or mere guesswork?

# *Christophe Champod\**

*Faculty of Law and Criminal Sciences, School of Criminal Justice, Forensic Science Institute, University of Lausanne, Lausanne-Dorigny, Switzerland \*Correspondence: christophe.champod@unil.ch*

#### *Edited by:*

*Alex Biedermann, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Sue Pope, Principal Forensic Services, UK James M. Curran, University of Auckland, New Zealand Paul Roberts, University of Nottingham, UK*

**Keywords: contact DNA, evaluation, forensic science, weight of evidence, expert testimony**

With advances in analytical sensitivity, it is now possible to obtain a DNA profile from a minute quantity of DNA. This opens new investigative avenues (in cold cases, for example), but also raises new interpretative challenges. Here, forensic scientists deal with items bearing DNA cellular material from areas showing no visible stain and have limited means to identify the nature of the body fluid involved. Such cellular material can be considered as trace evidence that can be exchanged for reasons connected to the alleged facts under investigation (generally a direct transfer) but also in alternative and varied ways (through secondary or tertiary transfer) that have no connection to the facts under investigation. The trace becomes a ubiquitous material that can be found for unconnected reasons. In addition, and especially with trace quantities of DNA, the debate in court is focused less on the issue of the source of the DNA (often the parties will not dispute it) than on the mechanisms whereby the biological material has been transferred (Taroni et al., 2013). In other words, the well-known territory of source level DNA statistics (see Buckleton et al., 2005 for example) does not help with the interpretation process; instead, the forensic scientist is invited to assess how likely it would be to observe this amount of DNA given various transfer mechanisms. The review by Meakin and Jamieson (2013) led them to conclude that the quantity of DNA or the quality of the profile cannot be used "to reliably infer the mode of transfer by which the DNA came to be on the surface of interest."

This rather complicated new landscape leads to two questions:


Regarding the first, my view is that it is definitely the role of the forensic scientist to provide as much guidance as possible to the trier of fact when the knowledge he/she may bring is outside the general knowledge of the court and relevant to the task at hand. Shying away from this duty on the ground that considerations regarding transfer of trace DNA are less well known than source level DNA statistics is not acceptable. There is a risk in leaving the presence of DNA to be assessed by others, left to advocacy, when the scientist can bring decisive knowledge (not least the papers reviewed by Meakin and Jamieson), including highlighting how complex the task may be. We want to avoid the simplistic line of argument that I have heard at times: "We have found DNA corresponding to the defendant on the trigger of a firearm, hence he manipulated the gun." It is crucial for a fair administration of justice that forensic scientists weigh their expectations of the amount of DNA recovered given both views. Hence scientists' guidance is required when the consideration of transfer mechanisms, persistence and background levels of the material has a significant impact on the understanding of the alleged activities and requires expert knowledge. But to provide guidance, the scientist will need information regarding the alleged case circumstances from both the prosecution's and the defense's perspectives. The duty may also require the scientist to highlight how little is known about transfer mechanisms and to urge a very careful assessment of the evidential contribution of the forensic findings, regardless of their strength with regard to the issue of the source itself. The absence of knowledge should not be an excuse for a guilty silence and for delegating the task to the fact finder without making explicit the complexity surrounding such an assessment.

In relation to the second question, Risinger (2013) warns against the "abuse of the notion of subjective probability," . . . "by simply making their best guess from experience when more should be required." In contrast, courts (I will concentrate here on the jurisdiction of England and Wales) have recently given a lot of freedom or authority to DNA scientists to exercise their professional judgement even when limited or no published data were available. In R v Reed and Reed (2009), the court ruled that, in the context of the analysis of minute quantities of DNA, a reporting scientist is fully entitled to assess and weigh the relative merits of the possible mechanisms whereby cellular material can be exchanged. In that case the forensic scientist testified that, in her experience, it was highly unlikely that the appellants had innocently touched the knives and it was unrealistic that each appellant had passed their DNA to someone else who then transferred it to the pieces of plastic which were found at the victim's address. The court, while recognizing that the scientific knowledge on transferability was incomplete, ruled that enough reliability had been demonstrated when the scientist is asked to consider cases where more than 200 picograms of DNA have been recovered. The court however stressed that "care must be taken to guard against the dangers of that evaluation being tainted with the verisimilitude of scientific certainty." The scientist is then authorized to comment on the probability of the forensic results given various transfer mechanisms as long as he/she makes it clear that we are dealing here with large uncertainty. This judgement led to a few commentaries. Jamieson (2011) highlighted the limited body of evidence represented by the few papers quoted by the court to support their opinion and warned against the view that personal experience might override scientific research. This worry was echoed in an editorial (Nic Daéid, 2010) in *Science and Justice* following the next case, against Weller.

In R v Weller (2010), a case involving the transfer of a reasonable quantity of DNA under the fingernails of the defendant, the defense appealed on the ground that knowledge regarding transfer and persistence mechanisms of DNA was not sufficient for experts to have been able to express an evaluation of the relative merit of the alleged activities. The Court of Appeal confirmed the positions taken in *Reed and Reed*. Given the difficulties of systematically conducting experiments replicating the circumstances in a particular case, the court recognized that a scientist is fully entitled to express a professional opinion on his/her expectation of DNA quantities given each mechanism envisaged by the court if the scientist has sufficient day-to-day casework experience. Jamieson and Meakin (2010) expressed their concerns after *Weller*, seeing courts in England and Wales putting more trust in claimed experience than in published, peer-reviewed publications. Rudin and Inman (2010a) also insisted on the fact that bald experience is not an acceptable substitute for experimental data.

Following these two cases, the Court of Appeal confirmed that view in subsequent rulings, not only in relation to consideration of DNA transfer but also of the sources of complex DNA mixtures. In R v Thomas (2011), a DNA scientist invoked her 12 years of experience (and some unpublished and undisclosed data) to suggest that in a three-person DNA mixture, there was a low expectation of finding components matching all those of the appellant adventitiously. In R v Dlugosz et al. (2013), the court, recognizing their extensive professional experience, allowed two DNA scientists to qualify the occurrence of alleles in a complex mixture corresponding to the defendant as "rare" for one and "somewhat unusual" for the other. The qualitative opinion expressed by the scientists was offered as an acceptable substitute in cases where the mixture is too complex for a quantitative assessment. However, as pointed out by Evett and Pope (2013), "there is no scientific basis for this belief—no scientific literature provides a reliable methodology, scientists are not trained to make such assessments and there is no body of standards to support them. Casework experience is not a substitute." One needs to assess the robustness of such qualitative opinion through a structured program of proficiency tests: it should not be based on casework data, but on DNA mixtures obtained under controlled conditions. Expressing qualitative judgments on the basis (or assumptions) of casework samples, without any calibration mechanism, is dangerous in my view. The expressed opinion could be the expression of nothing more than the *ipse dixit* of the expert.

The Court of Appeal endorsed such a *laissez faire* approach, drawing from a much larger jurisprudence applicable to expertise in general, with some decisions relating to other forensic disciplines. For example, in R v Otway (2011) (gait analysis), and in two other cases, R v Atkins and Atkins (2009) (face recognition) and R v T (2010) (footwear marks), the court recognized that an expert may express a qualitative opinion in the absence of quantitative (or statistical) supporting data as long as the subjective nature of the opinion and its foundation are transparently presented without giving more scientific weight to the judgment than it deserves. Edmond et al. (2010) remain rightly skeptical of the approach of dressing an opinion with all the concessions of limitations, but still allowing it, when the real significance of the forensic findings remains simply unknown.

In my view, what is critical, when it comes to offering expert opinions (in the present discussion regarding DNA transfer), is striking the appropriate balance between structured documented data (published or not) and unfettered personal opinion. Should these opinions be based *in extenso* on experience? My answer is clearly negative. I believe that experience constitutes a poor substitute for a systematic and structured acquisition of data. Any scientist offering views as to his/her expectations for the forensic findings under given case-related circumstances should be able to put forward documented sets of controlled experiments whose relevance to the case under dispute can be argued. A further question is how many controlled experiments should be conducted and how close they should be to the alleged circumstances. In my view that question should be approached on a case-by-case basis using the adversarial mechanisms available to the parties. The major improvement here is that all parties can access and challenge the body of knowledge available to the expert proffering an opinion. As Rudin and Inman (2010a) indicated, the problem with opinions based on experience alone is that they cannot be challenged beyond the sterile opposition between mere opinions. Requiring the disclosure of structured data opens the route to a new type of debate regarding the relative merits of the assessments provided.

We could legitimately ask, as did Rudin and Inman (2010b), whether or not forensic science has gone too far in terms of sensitivity, meaning that the risks associated with the analysis of irrelevant items (that is, items not associated with the criminal activities under investigation) are too high. I believe that the problem lies more in the use made by law enforcement authorities of such sensitive technologies. There are only gains in terms of investigative leads if we take advantage of sensitive techniques, but maybe these methods should be used only in the investigative phase, not as a basis for evidence relied on at trial. Highly sensitive DNA analysis offers extraordinary ways to enhance an investigation through the suggestion of potential named sources (through DNA databases) for the inquiry to consider. I am not calling for limiting such opportunities. However, moving from such investigative information toward evidence to be used in court requires very careful attention. It may well be the case that decisive investigative information will not be brought to court because of the issues discussed above. This is not a failure of forensic science, but simply an appropriate and fair (re-)positioning of the scientific techniques within the criminal justice process.

# **REFERENCES**


online at: http://www.bailii.org/ew/cases/EWCA/Crim/2010/2439.html


**Conflict of Interest Statement:** The author and editor declare that while the author Christophe Champod and the editor Alex Biedermann are currently employed by the Université de Lausanne, Institut de police scientifique, Switzerland, there has been no conflict of interest during the review and handling of this manuscript.

*Received: 06 October 2013; accepted: 08 December 2013; published online: 25 December 2013.*

*Citation: Champod C (2013) DNA transfer: informed judgment or mere guesswork? Front. Genet. 4:300. doi: 10.3389/fgene.2013.00300*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Champod. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The nucleic acid revolution continues – will forensic biology become forensic molecular biology?

# *Peter Gunn<sup>1</sup>\*, Simon Walsh<sup>2</sup> and Claude Roux<sup>1</sup>*

*<sup>1</sup> Centre for Forensic Science, University of Technology Sydney, Sydney, NSW, Australia <sup>2</sup> Forensics, Australian Federal Police, Canberra, ACT, Australia*

#### *Edited by:*

*Joelle Vuille, University of California, Irvine, USA*

#### *Reviewed by:*

*Keith Inman, California State University East Bay, USA Greg Hampikian, Boise State University, USA*

#### *\*Correspondence:*

*Peter Gunn, Centre for Forensic Science, University of Technology Sydney, P.O. Box 123, Broadway, NSW 2007, Australia e-mail: peter.gunn@uts.edu.au*

Molecular biology has evolved far beyond that which could have been predicted at the time DNA identity testing was established. Indeed we should now perhaps be referring to "forensic molecular biology." Aside from DNA's established role in identifying the "*who*" in crime investigations, other developments in medical and developmental molecular biology are now ripe for application to forensic challenges. The impact of DNA methylation and other post-fertilization DNA modifications, plus the emerging role of small RNAs in the control of gene expression, is re-writing our understanding of human biology. It is apparent that these emerging technologies will expand forensic molecular biology to allow for inferences about "*when*" a crime took place and "*what*" took place. However, just as the introduction of DNA identity testing engendered many challenges, so the expansion of molecular biology into these domains will raise again the issues of scientific validity, interpretation, probative value, and infringement of personal liberties. This Commentary ponders some of these emerging issues, and presents some ideas on how they will affect the conduct of forensic molecular biology in the foreseeable future.

**Keywords: forensic molecular biology, epigenetics, RNA, DNA, methylation**

# **INTRODUCTION**

The advent of forensic DNA testing revolutionized not only forensic science but also its contribution to investigations and court proceedings. It was also a key component in the elevation of the field from a niche science to a far-reaching public-good science perceived very favorably by the general public. Notwithstanding this success, it is also necessary to highlight some challenges related to this development: exponential drain on resources, overemphasis on source attribution, related complex mathematical modeling and overemphasis on one single dimension of the informational content of a trace<sup>1</sup>. It is fair to say that, up to now, apart from its established role in identifying the "who" in criminal investigations and in Court, molecular biology has played a relatively timid role in addressing other questions relevant to investigative and legal dimensions. A significant focus of development has been the sensitivity and discriminating power of core techniques, which, in the absence of concomitant research on the trace evidence properties themselves, leads to the situation where DNA is often detected and profiled but no comment at all can be made about the nature of the biological material, or when or how it was deposited. However, recent developments in medical and developmental molecular biology are pushing the current frontiers of forensic applications and novel nucleic acid technologies are now ripe for application to expanded forensic challenges. This commentary presents some of these developments and illustrates how they may assist scientists to answer investigative, legal, or broader security questions in the future.

# **CONTEXT AND PRESENT LIMITATIONS**

The current applications of molecular biology to the investigation of crime have evolved over the past 25 years from the academic investigations of DNA structure and function, in particular that of non-coding DNA sequences (Jeffreys et al., 1985; Wolff et al., 1991). The predominant forensic application of molecular biology has been in human identity testing, using non-phenotypic markers such as restriction fragment length polymorphisms, mini- and micro-satellite sequences, and single nucleotide polymorphisms to identify the source of biological material. The choice of non-phenotypic markers for forensic analysis was driven primarily by their polymorphic diversity, coupled with the ethically acceptable lack of personal or medical information which they convey.

The analysis of these markers has become the cornerstone of forensic DNA testing, and will likely remain so for the foreseeable future, given their power of discrimination and the enormous financial and social investment that has been made worldwide in commissioning and supporting the technology and its analytical tools (databases, search engines, predictive software, etc.). There are many commercially produced DNA systems that are validated for forensic use; all are based on the analysis of panels of locus-specific microsatellite sequences known as short tandem repeats (STRs) by multiplex PCR, followed by capillary electrophoresis. See for example Hill et al. (2011) and the website of the National Institute of Standards and Technology (http://www.cstl.nist.gov/biotech/strbase/multiplx.htm) for an extensive (but by no means complete) catalog of available markers. Furthermore, these relatively stable DNA markers are transmitted by classical Mendelian genetics, which not only makes the analysis of parentage, kinship and population studies comparatively accessible to biologists, but is also in line with accepted biological models of inheritance.

<sup>1</sup>The term "trace" is applied here to any mark, material or remnant of an activity or presence. It is used independently from the actual size of this remnant (Margot, 2011).

When STR analysis is difficult or impossible due to the amount or quality of DNA that is recovered from an item, then examination of mitochondrial DNA (mtDNA) will sometimes be undertaken. Usually this is done by dideoxynucleotide ("Sanger") sequencing of the hypervariable control region of the mitochondrial genome. While nowhere near as informative as STR analysis, mtDNA can provide useful investigative information or confirmation of identity in certain cases. The analysis of mtDNA in forensic biology has recently been reviewed by Holland et al. (2013).

STR markers are sufficiently polymorphic and amenable to multiplexing to allow for almost unambiguous identification of an individual, and so can place that person's DNA at the scene of a crime with a high degree of certainty. However STRs do not convey to the investigator any information about:


# **RECENT DEVELOPMENTS IN MOLECULAR BIOLOGY**

Advances in other areas of molecular biology over the past decade have revealed new levels of information contained within the nucleic acids, at both the organism and cellular level, far beyond that of simple DNA structure and sequence. There are several variants of such expression, which are often grouped together as "epigenetics":

They can be *RNA-mediated*, in which, for example, tissue-specific micro-RNA (miRNA) sequences, usually about 20–25 bases long, influence the expression of genes via their interaction with messenger RNA (mRNA). See for example Lagos-Quintana et al. (2002), wherein it is postulated that the population of expressed miRNAs plays a role in tissue specification or cell lineage decisions.

Specific sequences of *DNA* (predominantly the cytosine bases in runs of CpG sequences) become methylated during the course of an organism's development. The sites of methylation are specific to chromosomal location, and to a subset of cells or tissues, and assert extensive control of the expression of genes in those cells. Methylation of specific sites can be governed by behavioral or other environmental influences (Feinberg, 2007; Goldberg et al., 2007). Intriguingly, methylated patterns of DNA can be passed on to offspring (Zhao et al., 2005), which has profound implications for such established paradigms as Mendelian and Darwinian inheritance.

Specific methylation patterns are associated with particular disease states and other phenotypic traits, and can be detected in the laboratory by variants of current classical and next-generation DNA sequencing technologies (Madi et al., 2012).

Thus these emerging technologies have implications not only for the medical sciences (see for example Bell and Spector, 2011); of particular significance to the authors of this Commentary is the application of these technologies to forensic biology. Methylation epigenetics is well suited to the detection and identification of body fluids, exploiting the differential methylation of specific chromosomal sites between tissues (Madi et al., 2012) and between individuals, even including identical twins (Li et al., 2011). Similarly, the tissue-specific expression of miRNAs can be used for the identification of body fluids in a forensic setting (Zubakov et al., 2010a; Courts and Madea, 2011).

In a forensic context, these advances in molecular biology have already shown potential in the identification of tissue types, and a demonstrated role in various behavioral traits (although *cause and effect* need to be demonstrated). For relevant examples, see references Petronis et al. (2000) and Boulle et al. (2012).

# **THE SUITABILITY OF METHYLATED DNA AND miRNAs IN FORENSIC APPLICATIONS**

As Zubakov et al. (2010a) state: "MicroRNAs (miRNAs) are non-protein coding molecules with important regulatory functions; many have tissue-specific expression patterns. Their very small size in principle makes them less prone to degradation processes, unlike messenger RNAs (mRNAs), which were previously proposed as molecular tools for forensic body fluid identification."

Furthermore, the use of either miRNAs or methylated DNA to identify body fluids has the advantage that extraction and purification methods are compatible with existing DNA purification methods; thus one extraction and purification process can provide templates both for body fluid identification and classical DNA identification (Madi et al., 2012). In contrast, traditional biochemical and serological body fluid identification techniques are often destructive, and not always compatible with downstream DNA processing (Raymond et al., 2011).

### **DETERMINING THE ORIGIN OF BIOLOGICAL SPECIMENS**

Considerable research has been undertaken into trying to establish the source of cells, fluid or tissue by its specific messenger RNA profile. When DNA has been isolated and characterized from unidentified body material, such as "touch" specimens, it can become a matter of courtroom contention as to the nature of that deposit – for example was it from sweat, or saliva or some other fluid? While gross deposits can be identified, there is a need for the forensic scientist to be able to characterize specimens that are increasingly smaller in size, and are from other than the "traditional" sources of DNA, so that the origin of this material can be contextualized. Many practitioners have explored the use of fluid-specific messenger RNA profiles to perform this characterization (Juusola and Ballantyne, 2005; Hanson et al., 2009, 2012; Hanson and Ballantyne, 2013; Roeder and Haas, 2013). However there is an overarching concern about using mRNAs, due to the inherent instability of messenger RNA (Haas et al., 2011).

In contrast miRNAs are more stable than mRNAs, in particular more resistant to degradation, and hence are better able to survive a range of crime scene conditions (Hanson et al., 2009; Madi et al., 2012). Investigations of panels of miRNAs to date have concentrated on the identification of the most forensically common tissues/fluid such as blood, semen, and saliva but references to other tissue-specific miRNAs are available in the literature (Pai et al., 2011).

Likewise the origin of tissues and biofluids can be determined with a high degree of certainty by analysis of specific methylated DNA sequences (Madi et al., 2012).

By applying these emerging technologies in conjunction with "classical" DNA identification techniques (STRs, mtDNA), the forensic scientist may exploit the informational content of a biological trace beyond that of simple *identification* of a donor, to a more holistic exploitation that may also support inferences of *activity*.

### **ESTIMATING THE AGE OF SPECIMEN DONOR**

Individual age is one of the major factors determining human appearance. Establishing the age of an unknown person may provide important leads in police investigations, disaster victim identification, identity fraud cases, or in determining whether to try defendants as adults or juveniles. Currently used methods of age determination rely mostly on odontological or anthropological analysis. These techniques require the availability of human remains such as teeth, bones, or even the whole body. They are also subject to wide tolerances in terms of the conclusions and variation across different population groups. The development of molecular methods for age estimation using specimens that possess no phenotypic information, e.g., bloodstains, is of practical value, as these types of traces commonly occur at the crime scene.

Predicting age by molecular means has been achieved at the research level by several techniques; however, none of these precursor methods has the resolution or accuracy to stand alone as a reliable predictive tool.

These techniques include the analysis of T-cell receptor rearrangements ("signal joint TCR excision circles", sjTRECs), which has shown that the decline in the number of these in body fluids (primarily but not exclusively blood) is a reasonably good predictor of age (Zubakov et al., 2010b; Xue-ling et al., 2012). Standard real-time PCR techniques are readily adaptable to quantify this predictor.

Likewise the telomeres in peripheral leucocytes have been shown to shorten in an age-dependent manner (Ren et al., 2009), again as measured by standard molecular biological techniques such as Southern blotting.

Similarly, the analysis of the methylation patterns in the promoters of the Edar-Associated Death Domain (EDARADD), TOM1L1 (a gene coding for a protein of unknown function), and Neuronal Pentraxin II (NPTX2) genes is linear with age over a range of five decades (Bocklandt et al., 2011).

# **PREDICTING PHENOTYPIC CHARACTERISTICS OF SPECIMEN DONOR**

Considerable progress has been made in predicting phenotypic characteristics of the specimen donor (other than their gender). Single nucleotide polymorphisms (SNPs) are being exploited to predict phenotypic characteristics (also referred to as externally visible characteristics, EVCs) such as skin pigmentation, eye color, and biogeographic origin. These have been extensively reviewed elsewhere (Kayser, 2011). SNP tests that claim to predict eye color (Irisplex®; Walsh et al., 2011), and both eye and hair color (Hirisplex®; Walsh et al., 2013), with considerable certainty have recently been introduced into the forensic community. Implementation of these into casework is still problematic, and there continues to be extensive disquiet in the wider judicial community about the ethical and legal implications of these applications (Koops and Maurice, 2008), in particular the potential for misapplication as a "racial profiling" tool. M'charek et al. (2012) contend that "...*questions about defining populations* ... *and the application of EVCs in criminal investigation – lie at the core of most social, ethical, and legal issues raised by the translation of EVCs into forensic and police practices*." There are also more practical challenges that limit the adoption of these genetic technologies, such as the availability of a stable technological product (akin to the commercially produced equivalents in routine use), the level of understanding within mainstream forensic institutions of deeper scientific and bio-ethical issues, an operational framework where such applications can be effective in delivering technical intelligence to investigators, and the capacity of forensic experts to articulate and advocate issues that impact the effectiveness of the applications.

## **CONCLUDING COMMENTS**

We have presented here just a few examples of the potential for new molecular biology technologies to assist in the investigation of crime. It has been our casework experience that there is frequent need for supporting evidence, beyond the identification of source, particularly where trace DNA deposits are pivotal components of a circumstantial case. Just as the introduction of DNA identity testing prompted a number of scientific, moral and legal challenges, so the expansion of molecular biology into these domains will raise again the issues of scientific validity, interpretation, probative value, and infringement of personal liberties. It is, however, hoped that the forensic science, law enforcement and legal communities now have more experience in dealing with such issues than when forensic DNA profiling originally came to the fore. The development and fostering of forensic science as a distinctive holistic discipline (Margot, 2011; Roux et al., 2012) and the establishment of a stronger research culture in forensic science (Mnookin et al., 2011) will also help to achieve a relatively smooth transition.

Will these or other technologies make their way into the crime lab? Possibly not; they are specialized, and are not likely to be called upon often enough to warrant the financial and logistical commitment that would be required of an operational forensic lab. But where the expertise to undertake these tests exists in other research settings such as universities, then we foresee the day when these academies will be called upon to lend their expertise to forensic investigations.

As forensic DNA profiling technologies such as rapid DNA increase the potential of DNA one day being applied as another modality of biometrics, the role of forensic institutions will focus more extensively on the full exploitation of crime scene material. In this sense, modern forensic molecular biology will only be as good as it allows scientists to answer investigative, legal or broader security questions. Doing this requires new science and new knowledge and this review provides an insight into opportunities to deliver both.

#### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 August 2013; paper pending published: 04 October 2013; accepted: 08 February 2014; published online: 05 March 2014.*

*Citation: Gunn P, Walsh S and Roux C (2014) The nucleic acid revolution continues – will forensic biology become forensic molecular biology? Front. Genet. 5:44. doi: 10.3389/fgene.2014.00044*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2014 Gunn, Walsh and Roux. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The National DNA Data Bank of Canada: a Quebecer perspective

#### *Emmanuel Milot 1, Marie M. J. Lecomte2,3, Hugo Germain4 and Frank Crispino4 \**

*<sup>1</sup> Groupe de Recherche PRIMUS, Faculté de Médecine, Université de Sherbrooke, Sherbrooke, QC, Canada*

*<sup>2</sup> Mt. Albert Science Centre, The Institute of Environmental Science and Research Ltd., Auckland, New Zealand*

*<sup>3</sup> The Department of Forensic Science, School of Chemical Sciences, The University of Auckland, Auckland, New Zealand*

*<sup>4</sup> Département de Chimie, Biochimie et Physique, Université du Québec à Trois-Rivières, QC, Canada*

#### *Edited by:*

*Franco Taroni, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Kesheng Wang, East Tennessee State University, USA Hsin-Chou Yang, Academia Sinica, Taiwan*

#### *\*Correspondence:*

*Frank Crispino, Département de Chimie, Biochimie et Physique, Université du Québec à Trois-Rivières, 3351 Boulevard des Forges. CP 500, Trois-Rivières, QC G9A 5H7, Canada e-mail: frank.crispino@uqtr.ca*

The Canadian National DNA Database was created in 1998 and first used in mid-2000. Managed by the RCMP, the National DNA Data Bank of Canada reports satisfactory statistics on its use and efficiency each year. Built on two indexes (the convicted offenders index and the crime scene index), the database not only provides the nation's police forces with a growing number of matches to offenders or links between traces, but also serves as a memory repository for cold cases. Despite these achievements, the data bank now faces new challenges that will inevitably test the way the database is currently used. These arise from the increasing power of detection of DNA traces, the diversity of demands from police investigators and the growth of the bank itself. New demands on the database include familial searches, low-copy-number analyses and the correct interpretation of mixed samples. This paper describes the original approach adopted in Québec to address some of these challenges. Nevertheless, analytic and technological advances will inevitably lead to the introduction of new technologies in forensic laboratories, such as single cell sequencing, phenotyping, and proteomics. This will require not only a new holistic/global approach to the forensic molecular biology sciences (through academia and a more investigative role in the laboratory), but also new legal developments. Far from being exhaustive, this paper highlights some of the current uses of the database, its potential for the future, and the opportunity to expand it as a result of recent technological developments in molecular biology, including, but not limited to, DNA identification.

**Keywords: DNA database, Canada, Québec, genetic engineering, forensic challenges**

# **THE CANADIAN NATIONAL DNA DATA BANK**

At the time the UK launched its DNA database in 1995, the exonerations of two wrongly accused individuals (Morin case, 1985 and Milgaard case, 1969) and the implementation of the C-104 bill (to amend the Criminal Code and the Young Offenders Act) acknowledged the need for a similar requirement in Canada and initiated the creation of the Canadian National DNA database (NDDB) by the Identification Act (Law C-37 of Dec. 10th, 1998) (Curran, 1997).

Following a nation-wide consultation with various institutional bodies (such as the Privacy Commissioner, the Canadian Bar Association, and the Canadian Police Association) to address ethical, legal, and social implications, issues also tackled by the National Human Genome Research Institute (NHGRI) during the Human Genome Project, the NDDB became operational immediately after the proclamation of the S-10 bill in June 2000. A number of amendments led the NDDB to store genetic traces collected at crime scenes in the Crime Scene Index (CSI) and, under court order, the DNA profiles of offenders serving any sentence of imprisonment, for various categories of offences designated in section 487.04 of the criminal code, in the Convicted Offenders Index (COI).

Under the supervision of the DNA Data Bank Advisory Committee, composed of seven authoritative personalities involved in forensic biology, human rights and laboratory management, the NDDB is operated by the Royal Canadian Mounted Police (RCMP) for the benefit of all law enforcement agencies in the country, be it federal (the RCMP), provincial [the Ontario Police force or Sûreté du Québec (SQ)] or urban (depending on the level of police a town has to deliver in regard to its population), as provided by the RCMP at provincial and urban levels if requested.

The CSI is maintained by the RCMP labs, the Center of Forensic Sciences in Toronto (CFS<sup>1</sup>) and the Laboratoire de sciences judiciaires et de médecine légale du Québec in Montréal (LSJML<sup>2</sup>) (**Figure 1**). As of July 15th, 2013, the COI contained more than 273,000 profiles, while the CSI was nearing 87,000<sup>3</sup>.

Based on 13 DNA markers, DNA profiles are managed and compared using the Combined DNA Index System (CODIS). On a yearly basis, an RCMP report on the management of the NDDB provides statistics, financial cost information, a user guide for the reader, as well as information on the changing legal framework of the database, under the auspices of the Advisory Committee (Police, 2012).

<sup>1</sup>http://www.mcscs.jus.gov.on.ca/english/centre\_forensic/CFS\_intro.html.

<sup>2</sup>http://www.securitepublique.gouv.qc.ca/lsjml.html.

<sup>3</sup>http://www.rcmp-grc.gc.ca/nddb-bndg/stats-eng.htm.

The Advisory Committee oversees the NDDB and actively seeks out and suggests legislative and regulatory changes. This transparency in the management of the NDDB is what leads to the efficiency of the Canadian system, qualified as having "an astonishing degree of consistency in sampling regimes throughout the history of the Canadian DNA database." (Walsh, 2009). When the ratio of hits to the product N × C (N being the number of profiles in the COI and C the number of profiles in the CSI) is used to compare the efficiency of the NDDB with the DNA databases of four western countries (USA, the United Kingdom, the Netherlands, and New Zealand), the performance of the NDDB ranks just below New Zealand and is quite good, given the lower proportion of the population present in the database (0.5% for Canada instead of 2.1% for New Zealand). With regard to the public perception of civil rights, it could easily be deemed highly efficient. At least "Canada had a well-understood and effectively resourced concept of operation in place prior to the initiation of databasing" (Walsh, 2009). On such grounds, Canada seems better prepared than many other countries to tackle new challenges facing forensic DNA identification.
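The efficiency ratio described above (hits divided by the product of the two index sizes) can be written out as a short Python sketch. The class and function names are hypothetical, and the figures in the example are illustrative placeholders, not official NDDB statistics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabankCounts:
    """Counts needed for the hit ratio; field names are illustrative."""
    hits: int               # reported offender-to-scene hits
    offender_profiles: int  # N: profiles in the Convicted Offenders Index
    scene_profiles: int     # C: profiles in the Crime Scene Index


def efficiency_index(counts: DatabankCounts) -> float:
    """Return hits / (N * C), the ratio described in the text."""
    denominator = counts.offender_profiles * counts.scene_profiles
    if denominator <= 0:
        raise ValueError("N and C must both be positive")
    return counts.hits / denominator


# Placeholder figures chosen for easy arithmetic, NOT official statistics.
example = DatabankCounts(hits=1_000, offender_profiles=200_000, scene_profiles=50_000)
index = efficiency_index(example)
```

Dividing by N × C normalizes the hit count by the number of possible offender-to-scene comparisons, which is why databanks of very different sizes can be placed on a common scale.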

### **THE LSJML DATABANKING STRATEGY**

As with other DNA databanks, the NDDB holds key figures to address interpretation issues (Foreman et al., 2003; Dror and Hampikian, 2011) such as low copy numbers (LCN) (Lowe et al., 2002; Phipps and Petricevic, 2007), mixed samples (Bill et al., 2005; Curran, 2008), and familial searches (Bieber et al., 2006; Reid et al., 2008; Miller, 2010; Murphy, 2010; Gershaw et al., 2011; Meyers et al., 2011; Pham-Hoi et al., 2013). While addressing the issue of familial searches is not yet on the agenda, as it would require changes to the Canadian legislation, LCN has become a routine challenge faced by forensic labs nationwide. Indeed, due to technological improvements, the detection of ever-smaller traces of DNA is now possible (Kayser and de Knijff, 2011). However, because of stochastic effects (drop-outs, drop-ins), this comes at the cost of lower repeatability and overall completeness of genetic profiles recovered from small quantities of DNA. This problem is made worse with mixtures owing to competitive amplification. Deconvoluting the information and sorting out the alleles of each contributor in a mixture can become hard to achieve even in simpler cases such as a mixed profile from two contributors. As a consequence of these new challenges, forensic laboratories and the database managers may use various criteria to limit the deposition of mixed or partial profiles into the NDDB. For instance, the NDDB will only accept mixtures with data for *L* STR loci, where 9 ≤ *L* ≤ 13, with a maximum number of loci exhibiting more than two alleles equal to *L*−7, and with no more than five alleles per locus. Although STRs exhibit a very high level of polymorphism enabling high discriminatory power, they are subject, like any amplification-based markers, to the presence of polymorphisms within the primer binding site, which results in lack of amplification or so-called drop-out alleles (or null alleles). The impact of such results has been well documented (Haned et al., 2011) and probabilistic methods can be used to account for drop-in and drop-out alleles (Gill et al., 2012).
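The acceptance rule for mixtures stated above can be expressed as a small check. The following Python sketch encodes only the three conditions given in the text (9 ≤ *L* ≤ 13 typed loci, at most *L*−7 loci with more than two alleles, at most five alleles per locus). The function name, the data layout and the placeholder locus names are assumptions made for illustration, and any further criteria applied in practice are not modeled.

```python
from typing import Mapping, Set


def meets_stated_mixture_criteria(profile: Mapping[str, Set[str]]) -> bool:
    """Check a mixed profile against the three conditions quoted in the text.

    `profile` maps a locus name to the set of alleles observed at that locus.
    Loci with no alleles are treated as untyped (an assumption of this sketch).
    """
    typed = {locus: alleles for locus, alleles in profile.items() if alleles}
    n_loci = len(typed)  # L in the text

    # Condition 1: data for L loci, with 9 <= L <= 13.
    if not 9 <= n_loci <= 13:
        return False

    # Condition 2: no more than five alleles at any locus.
    if any(len(alleles) > 5 for alleles in typed.values()):
        return False

    # Condition 3: at most L - 7 loci show more than two alleles.
    loci_with_more_than_two = sum(1 for alleles in typed.values() if len(alleles) > 2)
    return loci_with_more_than_two <= n_loci - 7


# Hypothetical ten-locus mixture: three loci show three alleles, seven show two.
mixture = {f"locus_{i}": {"10", "11", "12"} for i in range(3)}
mixture.update({f"locus_{i}": {"10", "11"} for i in range(3, 10)})
accepted = meets_stated_mixture_criteria(mixture)
```

With ten typed loci the rule tolerates up to three loci with more than two alleles, so this hypothetical mixture sits exactly at the limit.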

With these managerial constraints, the development of statistical methodologies allowing more formal quantitative comparisons of casework profiles to DNA databanks is required. In the meantime, the LSJML has developed an innovative investigative strategy to increase the use of partial profiles from LCN or complex mixtures in their search for matches in the databanks, relying on two complementary practices.

Detailed interpretation and databanking guidelines at the LSJML allow the specific extraction of the relevant genetic information contained in single-source or mixed profiles for databank searches for intelligence purposes (Noël et al., 2009). For instance, flagging alleles as "obligate" or "non-obligate" in queries sent to the NDDB considerably filters the potential matches, limiting them to a subset of possible matches that is consistent (see section Challenges of the LSJML model and research prospects) with all the information available for the casework. For example, this procedure is used to separate alleles that are likely to come from the putative aggressor in intimate swabs from the victim of a sexual assault (i.e., alleles that must be included in any candidate match returned by the NDDB) from alleles of less certain origin (e.g., alleles of the victim potentially shared with the aggressor) that need not be present in the candidate profile. More generally, this approach is valid for any mixture related to any type of offence where some of the alleles are more likely than others to come from the offender(s). Another option is to eliminate alleles from a person whose DNA profile is known from other traces obtained for the same casework (e.g., victim, witness or single-source unknown), or those that would imply either highly unbalanced peak heights of a contributor to a mixture or drop-outs when this is not a reasonable possibility based on statistical data. It is up to the reporting scientist to check the relevance of the hypothesis against his/her scientific investigation of the case.
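The effect of flagging alleles as obligate or non-obligate can be illustrated with a minimal Python sketch. It is not the CODIS search logic: the function name, the data layout and the rule that a candidate may carry only alleles seen in the query are assumptions made here for illustration. The text itself states only that obligate alleles must be present in a candidate and that non-obligate alleles need not be.

```python
from typing import Dict, Mapping, Set


def is_consistent_candidate(
    query: Mapping[str, Dict[str, Set[str]]],
    candidate: Mapping[str, Set[str]],
) -> bool:
    """Return True if the candidate profile is compatible with the flagged query.

    For each locus, `query` holds two sets: "obligate" and "non_obligate".
    Loci missing from the candidate are skipped (an assumption of this sketch).
    """
    for locus, flags in query.items():
        candidate_alleles = candidate.get(locus, set())
        if not candidate_alleles:
            continue
        obligate = flags["obligate"]
        permitted = obligate | flags["non_obligate"]
        # Every obligate allele must appear in the candidate.
        if not obligate <= candidate_alleles:
            return False
        # Assumed rule: the candidate carries no allele absent from the query.
        if not candidate_alleles <= permitted:
            return False
    return True


# Hypothetical intimate-swab query: allele 15 is foreign to the victim (obligate),
# while 12 and 14 are the victim's alleles and may be shared (non-obligate).
query = {
    "locus_A": {"obligate": {"15"}, "non_obligate": {"12", "14"}},
    "locus_B": {"obligate": {"9"}, "non_obligate": {"7"}},
}
candidates = {
    "P1": {"locus_A": {"12", "15"}, "locus_B": {"9"}},
    "P2": {"locus_A": {"15", "16"}, "locus_B": {"7", "9"}},
    "P3": {"locus_A": {"12", "14"}, "locus_B": {"7", "9"}},
}
retained = sorted(
    name for name, profile in candidates.items()
    if is_consistent_candidate(query, profile)
)
```

In this toy example only P1 is retained: P2 carries an allele (16) that is absent from the query, and P3 lacks the obligate allele 15.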

The second aspect of the LSJML strategy is the maintenance of its own local database (also hosted in the CODIS system) where complex mixtures that do not meet the NDDB criteria can be deposited, namely in the "Forensic High Mixture" index, for comparison with other local casework profiles (**Figures 1**, **2**). In addition, the local database allows searching for matches using more loci, i.e., up to 15 in the LSJML operational setup instead of the 13 CODIS loci in the NDDB. Finally, mixed strategies are authorized whereby a full mixture can be deposited into the local database while a subset of its alleles (a "submixture") is sent to the NDDB. Thus, matches can potentially occur at the local level between the whole mixture kept as a "backup" and pure or mixed profiles from other cases. This can be especially useful when deconvolution is difficult, so that there is much uncertainty around which alleles should be sent to the NDDB.

## **CHALLENGES OF THE LSJML MODEL AND RESEARCH PROSPECTS**

While the LSJML model provides great flexibility in maximizing the number of matches, it also raises legitimate questions about potential biases that may arise from its databanking strategy (Lynch, 2003; Dror et al., 2006; Dror and Hampikian, 2011).

Aware of this, the LSJML has adopted different strategies to assess the importance of such biases and to limit them. These range from operational rules to current and prospective research projects. First, the LSJML does not declare a match valid as soon as it occurs (except when both the target and the candidate are single-sourced and complete). Thus, once a match between a target profile and a candidate profile in the NDDB has occurred, the LSJML scientist must assess its validity. The procedure involves an evaluation of the candidate profile using the original electropherogram from which the target profile was extracted, statistical data on peak height balance and drop-outs, as well as other profiles from the casework. This is the step where consistency with all the information available for the casework is evaluated. In addition, the validity of the match must also be confirmed by one of the two local scientists managing the databank.

Second, because the above procedure may limit but not completely eliminate fortuitous (wrong) matches, the opinion on evidential weight (Providers, 2009) is based on standard statistical approaches such as the probability of exclusion or likelihood ratios performed on the whole mixture, and not on extracted elements, except when a major profile can clearly be extracted using strict deconvolution rules.
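As an illustration of a probability of exclusion computed on a whole mixture, the following Python sketch uses the textbook combined probability of exclusion (CPE). It assumes Hardy-Weinberg proportions, independence between loci and no allelic drop-out, and it is not presented as the LSJML's own procedure. The function name and the allele frequencies are hypothetical.

```python
from math import prod
from typing import Mapping, Sequence


def combined_probability_of_exclusion(
    freqs_by_locus: Mapping[str, Sequence[float]],
) -> float:
    """Textbook CPE for a mixture.

    For each locus, the probability that a random person is NOT excluded is
    (sum of the frequencies of all alleles seen in the mixture) squared.
    The CPE is one minus the product of these values over all loci.
    """
    inclusion_per_locus = []
    for locus, freqs in freqs_by_locus.items():
        total = sum(freqs)
        if total < 0 or total > 1 + 1e-9:
            raise ValueError(f"allele frequencies at {locus} must sum to a value in [0, 1]")
        inclusion_per_locus.append(total ** 2)
    return 1.0 - prod(inclusion_per_locus)


# Hypothetical frequencies of the alleles observed in a two-locus mixture.
observed = {
    "locus_1": [0.1, 0.2, 0.2],  # sums to 0.5, so inclusion is 0.25
    "locus_2": [0.25, 0.25],     # sums to 0.5, so inclusion is 0.25
}
cpe = combined_probability_of_exclusion(observed)  # approximately 0.9375
```

Because the calculation uses every allele in the mixture, it does not depend on which alleles were flagged or extracted for the databank search, which is the point made in the paragraph above. The no-drop-out assumption limits its use for LCN profiles.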

Third, the LSJML has been proactive in challenging the validity of its own strategy with respect to biases or invalid match generation by undertaking a number of quantitative statistical evaluations. It is worth noting that the databanking and match review strategies for targets arising from mixtures of various levels of complexity generate valid candidate matches in comparable proportions to single-source targets (for which match validity is automatic), with similar levels of effort (i.e., working time required to evaluate the matches); see Noël et al. (2009) and Lavergne et al. (2008) for details. For instance, less than 12% of candidate matches produced with the Forensic Unknown and Forensic Mixture indexes (**Figure 2**) are rejected with a "no match" disposition after review. One critical aspect is that mixtures must be of good quality, namely show good peak intensities. Moreover, the LSJML has begun to perform experimental tests by searching two-person mixtures with up to 13 mixed loci against the Florida data bank, which consists of nearly 500,000 convicted offender profiles. Because of the geographical (∼2000 km) and country barriers between Québec and Florida, it is expected that almost any eventual match would be fortuitous. Corroborating the above conclusions, these mixtures did not return more candidate matches than less complex ones. Moreover, all candidate matches were rejected independently (i.e., not in concert) by four reporting scientists. Finally, the lab, in collaboration with others, is presently evaluating an alternative to the current selection procedure for uploading mixtures to the NDDB. The new approach would be based on the number of expected matches accounting for the COI size and is implemented in the CODIS Match Estimator® module.
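The idea of an expected number of matches that accounts for databank size can be sketched in a few lines of Python. This is a generic calculation and not a description of how the CODIS Match Estimator module works. It assumes that each databank profile matches the query independently with the same probability, and the match probability used in the example is a hypothetical value.

```python
def expected_fortuitous_matches(databank_size: int, match_probability: float) -> float:
    """Expected number of chance matches: N * p under the stated assumptions."""
    if databank_size < 0 or not 0.0 <= match_probability <= 1.0:
        raise ValueError("need N >= 0 and 0 <= p <= 1")
    return databank_size * match_probability


def probability_of_any_fortuitous_match(databank_size: int, match_probability: float) -> float:
    """Probability of at least one chance match: 1 - (1 - p) ** N."""
    if databank_size < 0 or not 0.0 <= match_probability <= 1.0:
        raise ValueError("need N >= 0 and 0 <= p <= 1")
    return 1.0 - (1.0 - match_probability) ** databank_size


# N follows the databank size quoted in the text; p is a hypothetical value.
n_profiles = 500_000
p_random_match = 1e-6
expected = expected_fortuitous_matches(n_profiles, p_random_match)
p_any = probability_of_any_fortuitous_match(n_profiles, p_random_match)
```

A query whose expected number of chance matches is well below one is unlikely to burden reviewers with fortuitous candidates, whereas a weakly discriminating submixture searched against a large index could return many.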

Open discussions between the LSJML and academic partners are currently underway to better assess the potential hazards induced by these databanking policies with respect to confirmation bias. Nevertheless, an understanding of this strategy with respect to a possible future goal of forensic intelligence should be kept in mind (Ribaux et al., 2006; Pham-Hoi et al., 2013). On the other hand, limiting decisions about whether an identification was correct or not to pure profiles alone may provide a sense of security. However, this also restricts the use of the information available, with potentially pertinent information discarded when solving everyday crimes. It is currently unclear what consequences refusing to tackle these issues will have on victims, and consequently on the justice system, which also has a validating role to play in this area.

Nevertheless, a fine-tuned approach, specific to the various types of casework (sexual assault, homicide, burglary, high-volume crimes, etc.), definitely needs to be developed to assess better, and more rigorously, the consequences these changes will have on the whole process of identification.

# **BEYOND THE PRESENT DNA PRACTICE**

Although these innovative practices, and the interpretation process still to be developed, are a sign of joint academic-practitioner effort, the development of STR mixture analysis and databank searching will eventually reach a limit that impedes further improvement, owing to the inherent limitations of using small sets of markers (typically < 20 for STR) typed by technologies that do not permit the separation of DNA from different cells found in the same trace (with the exception of differential extraction of semen DNA). Ultimately, a substantial increase in the power of mixture analysis will come from newer technologies such as single nucleotide polymorphisms (SNPs) (Daniel and Walsh, 2006; Kidd et al., 2006; Sanchez et al., 2006; Fang et al., 2009; Pakstis et al., 2010; Voskoboinik and Darvasi, 2011) or single-cell sequencing (Hanson and Ballantyne, 2005). Repositories like the NDDB will need to adapt to these forthcoming innovations in a way that permits forensic labs to benefit from the full power of these new tools for match searching, but without compromising the usefulness of the STR information accumulated since their creation.

Other advances in the biological sciences, not strictly depending on the NDDB itself, could benefit from the advice, input, and review of the Advisory Committee to pave the way for a new forensic dimension (Daniel and Walsh, 2006; Kidd et al., 2006). These include research into fields such as ancestry informative markers (AIMs) (Lao et al., 2008; Kersbergen et al., 2009; Kosoy et al., 2009; Liu et al., 2009), proteomics (Kool et al., 2007; Lecomte et al., 2013), genome/marker-based phenotyping (Sulem et al., 2008; Liu et al., 2009; Zubakov et al., 2010; Walsh et al., 2011), framing the input of DNA to forensic intelligence (Jobling and Gill, 2004; Ribaux et al., 2006; Bond, 2007; Roman et al., 2009; Wilson et al., 2011), and the emerging use of lab-on-a-chip devices at the crime scene (Batt et al., 2009; Bell, 2011). All these fields belong to a still-debated investigative process (Kaye, 2007), as opposed to the claim for a strict separation of laboratories from the law enforcement system (NRC, 2009). **Box 1** presents two examples of techniques that could eventually be used in forensic science on a case-by-case basis. One of them, forensic proteomics, does not directly contribute to the evolution of the NDDB. However, the power of these new tools to address personal characteristics of human beings could lead to an ethical position being taken by the Advisory Committee, which could impact the future development of the data bank.

#### **Box 1 | Beyond present DNA practices**

1) Going further with genomics

While it was initially believed that PCR would be capable of solving the challenges encountered when analysing LCN samples, difficulties associated with interpretation lessen its pragmatic use in forensic casework (Gill et al., 2000; Kloosterman and Kersbergen, 2006; McCartney, 2008; Budowle et al., 2009) (see also LCN DNA Review at http://www.mccannfiles.com/id190.html). In addition, the use of a limited set of markers, as is the case today, restricts the potential discriminative power that could be accessed if using full genome sequences.

Single-cell genome sequencing is a rapidly improving technology, one of whose many applications is the detection of somatic intra-individual variation in cancer patients (Navin et al., 2011). Initially applied to small prokaryotic genomes (Stepanauskas and Sieracki, 2007), the technique has benefited from recent advances in next-generation sequencing that have enabled the coverage of 93% of the much larger human genome from a single human cell (Zong et al., 2012). This not only allows for the identification of SNPs and LCN variation (Zong et al., 2012) but also of genomic structural variation and somatic mutations that give rise to intra-individual genetic variation (O'Huallachain et al., 2012). With as many as 2500 genomic structural variations and three million SNPs (Abecasis et al., 2010) occurring between two unrelated individuals, the discriminative power of this technique could even allow for identical twins to be differentiated.

Full genome sequencing would permit the use of a much wider range of genomic polymorphism to convict or exonerate persons of interest. Although it is currently cost-prohibitive, and runs against the current ideology that the use of anonymous loci is preferable, the dwindling cost associated with genome sequencing may enable its use in the foreseeable future.

2) Adding transcriptomics and proteomics to the forensic toolbox?

With research carried out in the fields of transcriptomics and proteomics, opportunities are emerging to develop and add to the already existing genomic platforms.

Evidence of physical abuse is often left on the skin of victims, with bruising found to be the most common form of injury (Dye et al., 2008; Pierce et al., 2010; Jackson et al., 2012). The ability to accurately and reliably determine the age of a bruise in living individuals is currently lacking. If it were possible, this could provide vital evidence in legal cases of suspected physical abuse. In cases where multiple bruises are present on the body of a victim, providing evidence that the injuries were inflicted on separate occasions could have important medico-legal significance.

Building a human proteome map of protein markers present in unbruised skin, as well as bruised skin, and analysing changes in protein expression levels as a bruise evolves, could help to achieve these goals (Lecomte et al., 2013).

# **CONCLUSION**

As exciting projects make their way in the field of molecular biology, real challenges also arise in the realm of forensic science, creating new impediments for forensic DNA analysis and raising obvious ethical, social, and economic questions. Nevertheless, the inescapable drive toward DNA intelligence and laboratory miniaturization, and its projection onto the crime scene, could underline the need for better scientific support of the crime scene officers present at the start of the forensic process.

As Commissioner Paulson of the NDDB wrote in the last annual report, "the NDDB operates within a diverse environment that must consider scientific advancements, privacy rights, and changing legislation." With regard to the building up of the NDDB and the wisdom of its Advisory Committee, an optimistic future for the scientific support of the Canadian law and justice systems is anticipated.

# **AUTHOR CONTRIBUTION**

Emmanuel Milot is a postdoctoral researcher in biology and a consultant for the LSJML and the RCMP laboratories. His research interests address two domains: the use of genetics to study human and animal population dynamics and the causes of phenotypic variations between individuals.

Marie M. J. Lecomte has recently submitted her PhD thesis, which employs proteomic techniques to determine the age of bruises in living individuals.

Hugo Germain is a professor of biochemistry, head of the chair on vegetal immunity, and in charge of the human DNA identification course in the forensic curriculum at the UQTR.

Frank Crispino is a criminalist with a PhD degree from the University of Lausanne (Switzerland) and a former criminal and counter-terrorist investigation commander in the French Gendarmerie. He is now serving as a professor in forensic science at the UQTR.

Although this article was written jointly, the order of the authors reflects their relative contributions to the paper, the corresponding author also being in charge of coordination.

# **ACKNOWLEDGMENTS**

The authors would like to thank M. Tony Tessarolo, director of the Center of Forensic Sciences, Ontario Province, for his kind and highly appreciated comments on a draft version of this paper and Ron Fourney from the NDDB for insightful discussions. We are also indebted to Josée Noël and Léo Lavergne from the Laboratoire de sciences judiciaires et de médecine légale for contributing important information to this paper and commenting on the draft. We also thank Mireille Courteau who kindly drew the final version of the figures. Finally, we express our gratitude to the two anonymous reviewers who helped us improve the original draft.

# **REFERENCES**


100 pg of DNA. *Forensic Sci. Int.* 112, 17–40. doi: 10.1016/S0379-0738(00)00158-4


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 August 2013; paper pending published: 22 September 2013; accepted: 31 October 2013; published online: 20 November 2013.*

*Citation: Milot E, Lecomte MMJ, Germain H and Crispino F (2013) The National DNA Data Bank of Canada: a Quebecer perspective. Front. Genet. 4:249. doi: 10.3389/fgene.2013.00249*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Milot, Lecomte, Germain and Crispino. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Your uncertainty, your probability, your decision

## *Alex Biedermann\**

*Faculty of Law and Criminal Justice, School of Criminal Justice, University of Lausanne, Lausanne, Switzerland \*Correspondence: alex.biedermann@unil.ch*

#### *Edited by:*

*Franco Taroni, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Ian W. Evett, Evett Forensic Inference Ltd., UK*

# **A book review on Understanding Uncertainty**

by Dennis V. Lindley, John Wiley and Sons, Inc., Hoboken, New Jersey, 2006, xv + 250 pages. ISBN: 978-0-470-04383-7.

In 1973, in his foreword to de Finetti's book "Theory of Probability, A critical introductory treatment", Professor Lindley wrote that "(. . . ) every now and again delightful ideas spring to view; the idea that we shall all be Bayesian by 2020 (. . . ). But, as I said, this is a book about life" (Finetti, 1974, ix). The two strands of thought, that we should all use probability to approach uncertainty, and that research done on this topic in the twentieth century has the potential to "(. . . ) affect the activities of many people and ultimately all of us" (Lindley, 2006, xiv), are also central to his current book "Understanding Uncertainty" (Lindley, 2006). Thirty years ago, the year 2020 may have appeared far ahead in the future, but today we can see 2020 appearing on the horizon. So, the question is, where do we stand with our understanding of probability and uncertainty? This question is one of concern for Professor Lindley, as he writes: "(. . . ) I made a discovery. There were people out there, like politicians, journalists, lawyers, and managers, who were, in my opinion, making mistakes; mistakes that could have been avoided had they known the answers to the questions pondered in my ivory tower" (Lindley, 2006, xiv). These words were not intended to be critical. Rather, they express the view that it is up to academics to communicate and "(. . . ) explain in terms that motivated, lay persons can understand, some of the discoveries made in academe, and why they are of importance and value to them, so that they might use the results in their lives" (Lindley, 2006, xiv).

The author makes every effort to achieve this goal. The book is written with exceptional clarity, and the arguments are presented in a way that directly addresses the reader "(. . . ) conveniently called 'you' (. . . )" (Lindley, 2006, 1, 2). The "you" is a stylistic choice that places the author directly in line with other influential authors who hold the so-called subjectivistic interpretation of probability theory<sup>1</sup>. This reinforces the author's intention to place the readers in the center of the argument: in fact, he notes "[t]his book is for you, whoever you are" (Lindley, 2006, 2). This intention stems directly from one of the book's main messages: that probability is inevitable.

The book is not primarily about the *calculus* of probability (indeed, mathematics is kept to a strict minimum); it is about the very *meaning* of probability, that is, how probability ought to be understood in order to deal with uncertainty. On this latter point, Professor Lindley is uncompromisingly clear and, at the same time, draws yet another parallel to de Finetti's two-volume work on probability (Finetti, 1974, 1975): probability is the measure of strength of belief, but probability *does not exist* in the sense of being a property of the outside world. At first sight, specialized readers of this Frontiers journal may find this proposition all too general, or even inappropriate. But it is not, for several reasons.

Indeed, it is common for forensic science commentators to use the abstract expression *the probability* of some event. Also, scientists may feel, or object, that they cannot ascertain a particular number, only so-called upper and lower probabilities. However, uncertainty about a given proposition may vary between persons, because their extent of knowledge may differ, which is why it is more appropriate to refer to *your probability*, ours or anybody's. Moreover, Professor Lindley presents us with a persuasive argument that probability is given by a single number.

Further examples that point out the relevance of this book for forensic specialists can readily be found. Suffice it to note that probability and likelihood are often used as synonyms. Similarly, probability is often equated with frequency. Here the author emphasizes that these usages are inappropriate. These terms have very distinct and precise meanings, and it goes without saying that these distinctions have the potential to help clarify and improve the rigor of forensic science communications. So, if you think or have always thought that you can pass from frequency to belief in a straightforward way, then you may be confusing a notion that refers to data with one that refers to belief, and this book will show you that passing from one notion to the other is *not* straightforward.

<sup>1</sup> Indeed, in de Finetti's treatise we find the following parallel: "Let us introduce right away the use of 'You,' following Good (Savage uses 'Thou')." (Finetti, 1974, 27).

This book is about its readers' uncertainty, and the book's title "Understanding Uncertainty" essentially "(. . . ) means knowing the three rules of probability" (Lindley, 2006, 66). This topic deals with how one's beliefs should be organized, but there is a further important subject that the book brings to the attention of the reader: the *use* of beliefs in action. How might one decide between different courses of action? This question moves the discussion from uncertainty to possible consequences of actions taken under uncertainty, the expression, called utility, of the desirability of these consequences, and the maximization of expected utility as a basis for action. Currently, forensic and legal writings draw little attention to thoughts on how to extend the view from probability and uncertainty to analyzing how to decide sensibly between possible actions. Yet, questions of decision making abound in forensic science practice (Taroni et al., 2005) (e.g., "should DNA profiling analyses be performed in this case or not?") and, ultimately, in court (Kaye, 1999).
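The principle of maximizing expected utility can be sketched in a few lines. The example below is purely illustrative: the two states of the world, the two actions, your probabilities for the states, and the utilities are all hypothetical numbers chosen for this sketch, not values taken from Lindley (2006) or from casework.

```python
def expected_utility(utilities, probabilities):
    """Weight the utility of each consequence by your probability of its state."""
    return sum(u * p for u, p in zip(utilities, probabilities))


def best_action(utility_table, probabilities):
    """Return the action with the highest expected utility, plus all scores."""
    scores = {
        action: expected_utility(utilities, probabilities)
        for action, utilities in utility_table.items()
    }
    return max(scores, key=scores.get), scores


# States: (trace comes from the suspect, trace comes from someone else).
# Hypothetical probabilities expressing *your* uncertainty about the states.
your_probabilities = (0.3, 0.7)

# Hypothetical utilities (0 = worst, 1 = best) of each action in each state.
utility_table = {
    "perform DNA profiling": (0.9, 0.7),
    "do not perform DNA profiling": (0.2, 0.8),
}

action, scores = best_action(utility_table, your_probabilities)
for name, score in scores.items():
    print(f"{name}: expected utility = {score:.2f}")
print("Chosen action:", action)
```

Changing either the probabilities or the utilities can change the chosen action, which is the sense in which the decision, like the probability, is yours.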

To attempt a review of a book by an eminent academic such as Professor Lindley is both difficult and daring for a generalist. The "review" here is thus not written from a position that claims authority; rather, it is a tribute to a work that has the value of inspiring the practice of forensic science to serve society better, even though the general theme requires much challenging thought. The words with which Professor Lindley described de Finetti's work, "[t]he author has words of wisdom to say about many things and the wisdom often only appears after reflection" (Finetti, 1974, ix), clearly apply also to Lindley's own work "Understanding Uncertainty."

# **REFERENCES**


*Received: 29 June 2013; accepted: 19 July 2013; published online: 07 August 2013.*

*Citation: Biedermann A (2013) Your uncertainty, your probability, your decision. Front. Genet. 4:148. doi: 10.3389/fgene.2013.00148*

*This article was submitted to Frontiers in Statistical Genetics and Methodology, a specialty of Frontiers in Genetics.*

*Copyright © 2013 Biedermann. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The evidential foundations of probabilistic reasoning: toward a better understanding of evidence and its usage

# *Patrick O. Juchli\**

*School of Criminal Justice, University of Lausanne, Lausanne, Switzerland \*Correspondence: patrick.juchli@unil.ch*

#### *Edited by:*

*Franco Taroni, University of Lausanne, Switzerland*

#### *Reviewed by:*

*Mike Redmayne, London School of Economics, UK*

**Keywords: forensic science, evidence interpretation, probability, likelihood ratio, relevance, credibility, inferential force**

# **A book review on The Evidential Foundations of Probabilistic Reasoning**

Edited by David A. Schum, Northwestern University Press, Evanston, 2001, xviii + 545 pages.

"The Evidential Foundations of Probabilistic Reasoning" by David A. Schum ". . . contains a collection of thoughts . . . " (p. 1) on issues related to evidence and to inference tasks based on evidence. The study of such issues is best summarized by an expression introduced in chapter 1: "Science of Evidence." The Science of Evidence tries ". . . to treat the study of evidence as having a life of its own . . . " (p. 8). This perspective of examining evidence and inference with an interdisciplinary, generalist approach, is also reflected by the author David Schum himself: he is a professor of law and information technology and engineering at George Mason University. The fundamental insights he shares in this book are—unfortunately—all too often overlooked and unknown in forensic and judicial practice and research.

An important feature of evidential inference is its involvement with uncertainty, and consequently its probabilistic nature. This view is held also by Schum. He acknowledges that uncertainty is a prevalent feature of reasoning tasks based on evidence, and that it attends situations of daily life and also, most prominently, legal applications: ". . . in any inference task our evidence is always incomplete, rarely conclusive, and often imprecise or vague; it comes from sources having any gradation of credibility. As a result, conclusions reached from evidence [. . . ] can only be probabilistic in nature" (p. xiii). Unfortunately, forensic practice regularly distrusts the notion of probability because people focus on precise numbers (derived from a generous data pool). However, assigning numbers for probabilistic evidence evaluation is neither a prerequisite nor an end for analyses of evidential inference. Schum's work is directly relevant to this aspect by demonstrating that (1) purely structural considerations on evidence and (2) adopting probabilities as numerically variable ingredients of inferences, enable us to approach numerous problems, and to explore evidential subtleties or complexities. Let us first consider (1) and then (2).

1. Every item of evidence fans out into two primary dimensions: *relevance* and *credibility*. A relevance relationship between an event (for the purpose of this review let us say, "DNA matches with suspect's DNA") and a hypothesis ("suspect is the assailant") can involve multistage reasoning (a chain of reasoning). A given linkage pattern between elements of a chain of reasoning is called an "argument." Elements regarding the credibility of evidence (e.g., "how reliable is the expert reporting the DNA typing results?") are located upstream in such a chain of reasoning. Depending on the type of evidence and the desired level of detail, the assessment of credibility may also involve a multistage reasoning process and produce an argument. Thus, a probabilistic assessment of evidence requires an argument structured in terms of relevance and credibility. The argument structure becomes even more complex when multiple items of evidence are involved. In spite of this fact, basic configurations of evidence combination can be identified and analyzed probabilistically. Schum shows in his studies that such basic configurations of evidence combination result in specific inference structures and well-defined inferential mechanisms.

2. Every item of evidence is characterized by an *inferential force*. It expresses whether and to what extent evidence supports a hypothesis. Its magnitude depends on the argument structure we choose for the evidence and on the probabilistic assessment we attach to the argument. The likelihood ratio is commonly used in Bayesian analysis to measure the inferential force of evidence. The study of likelihood ratios under varying probabilities is an important aspect of Schum's work: "[m]y essential research strategy was to perform sensitivity analyses on the likelihood ratios I identified." (Schum, 1999, p. 576). By doing so, Schum shows how certain argument structures give rise to peculiar inferential phenomena such as those in this non-exhaustive enumeration: inferential drag, redundancy, and synergism. Each additional reasoning stage in a chain of reasoning generally weakens the inferential force of an item of evidence: an inferential drag is accumulated. The likelihood ratio analysis of the inferential drag shows how such an accumulation is generated. Redundancy and synergism occur in specific configurations of evidence combination. The presence of the former implies that knowledge of one item of evidence can diminish or even nullify the inferential force of another. Ignoring redundancies can lead to overstatements of the joint inferential force of the items of evidence. Synergy relates to the opposite situation: knowledge of one item of evidence increases the inferential force of another. Ignoring synergies leads to understatements of the joint inferential force.
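Inferential drag can be illustrated numerically. The sketch below assumes a two-stage chain of reasoning in which a report E (e.g., "the expert reports a DNA match") bears on the hypothesis H only through an intermediate event F (e.g., "the DNA truly matches"), so that E is independent of H given F. The function name and all probabilities are hypothetical values chosen for illustration; they are not figures from Schum's book.

```python
def chain_likelihood_ratio(p_f_h, p_f_not_h, p_e_f, p_e_not_f):
    """Likelihood ratio P(E|H) / P(E|not H) for the chain H -> F -> E.

    p_f_h, p_f_not_h : P(F|H) and P(F|not H), the relevance stage
    p_e_f, p_e_not_f : P(E|F) and P(E|not F), the credibility stage
    """
    numerator = p_f_h * p_e_f + (1 - p_f_h) * p_e_not_f
    denominator = p_f_not_h * p_e_f + (1 - p_f_not_h) * p_e_not_f
    return numerator / denominator


# Hypothetical relevance stage: the force of F itself is 0.95 / 0.01.
P_F_H, P_F_NOT_H = 0.95, 0.01

# Hypothetical credibility stages, from a perfect source to a weaker one.
for p_e_f, p_e_not_f in ((1.0, 0.0), (0.99, 0.01), (0.9, 0.1)):
    lr = chain_likelihood_ratio(P_F_H, P_F_NOT_H, p_e_f, p_e_not_f)
    print(f"P(E|F)={p_e_f}, P(E|not F)={p_e_not_f}: LR = {lr:.2f}")
```

With a perfectly credible source the report carries the full force of F; as the credibility stage weakens, the likelihood ratio of the report moves toward 1, which is the drag described above.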

Now, how is such knowledge useful in practice? First, it does not matter which domain the evidence comes from, nor do we need to be familiar with its domain-specific methods and techniques to enhance our reasoning with these insights. Second, by identifying generic inference structures we know which inferential mechanisms we are exposed to and which we are not. Hence, we are less likely to be subjected to flawed reasoning leading to over- and understatements when assessing the inferential force of evidence. Imagine, for example, a DNA trace is analyzed by two laboratories. Now we have two results, but is our evidence also twice as strong? Third, knowledge of basic inference structures creates gateways to contextualized evidence interpretation, and even more so when we deal with masses of evidence [see Kadane and Schum (1996) for the analysis of a judicial case and Juchli et al. (2012) for a forensic case]. This is a particularly strong point since an item of evidence is typically found in conjunction with other evidence.
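The two-laboratory question can be explored with a small numerical sketch. It assumes, hypothetically, that both laboratories report on the same underlying event F (the trace truly matches the suspect) and that their reports are independent of each other and of the hypothesis H once F is known. The function name and all numbers are invented for illustration.

```python
def joint_likelihood_ratio(p_f_h, p_f_not_h, reports):
    """Joint likelihood ratio of several reports on the same event F.

    reports is a list of (P(E_i|F), P(E_i|not F)) pairs, one per laboratory,
    assumed conditionally independent given F.
    """
    all_given_f = 1.0
    all_given_not_f = 1.0
    for p_e_f, p_e_not_f in reports:
        all_given_f *= p_e_f
        all_given_not_f *= p_e_not_f

    def prob_reports(p_f):
        # Average over F and not F, weighted by the probability of F.
        return p_f * all_given_f + (1 - p_f) * all_given_not_f

    return prob_reports(p_f_h) / prob_reports(p_f_not_h)


P_F_H, P_F_NOT_H = 0.95, 0.01   # hypothetical relevance of F for H
lab = (0.999, 0.001)            # hypothetical credibility of each laboratory

single = joint_likelihood_ratio(P_F_H, P_F_NOT_H, [lab])
both = joint_likelihood_ratio(P_F_H, P_F_NOT_H, [lab, lab])
naive = single ** 2             # treats the two reports as independent given H

print(f"one laboratory:   LR = {single:.2f}")
print(f"two laboratories: LR = {both:.2f}")
print(f"naive product:    LR = {naive:.2f}")
```

Under these assumptions the second result mainly confirms that F occurred, so the joint likelihood ratio can never exceed the force of F itself (0.95 / 0.01 here), whereas the naive product overstates it by a wide margin. This is one form of the redundancy discussed above.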

The book discusses a vast array of evidence-related subjects from different standpoints and across different disciplines. It demands time due to its broad scope, and careful reading and mental flexibility due to its interdisciplinary character. Sometimes it may even ask the reader to be patient, as some subjects are developed incrementally, making a few passages appear repetitive. In return, many topics and problems that appeared opaque and difficult before may become clear and intellectually palpable afterwards. For readers who are interested in better understanding the properties of evidence and how to embrace evidence by systematic and logical reasoning, this is a book that deserves serious consideration.

# **REFERENCES**


*Received: 23 July 2013; accepted: 06 August 2013; published online: 23 August 2013.*

*Citation: Juchli PO (2013) The evidential foundations of probabilistic reasoning: toward a better understanding of evidence and its usage. Front. Genet. 4:164. doi: 10.3389/fgene.2013.00164*

*This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics.*

*Copyright © 2013 Juchli. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*