# **ADVANCES IN SYSTEMS IMMUNOLOGY AND CANCER**

**Topic Editors Masaru Tomita, Masa Tsuchiya and Kumar Selvarajoo**

#### *FRONTIERS COPYRIGHT STATEMENT*

© Copyright 2007-2014 Frontiers Media SA. All rights reserved.

**ISSN** 1664-8714 **ISBN** 978-2-88919-313-4 **DOI** 10.3389/978-2-88919-313-4

### Topic Editors:

**Masaru Tomita,** Keio University, Japan **Masa Tsuchiya,** Keio University, Japan **Kumar Selvarajoo,** Keio University, Japan

Fluorescence microscopy image with computational segmentation and annotation of macrophages (adapted from Wenzel et al, in this issue).

Image taken from: Wenzel J, Held C, Palmisano R, Teufel S, David J-P, Wittenberg T and Lang R (2011) Measurement of TLR-induced macrophage spreading by automated image analysis: differential role of Myd88 and MAPK in early and late responses. *Front. Physiol.* 2:71. doi: 10.3389/fphys.2011.00071

### Aims and Scope:

The Research Topic is designed to feature the latest innovative and leading-edge research, reviews and opinions on the study of complex and dynamic processes related to the mammalian immune system and cancer. All papers were meticulously selected to present our readers with a multidisciplinary approach to tackling the existing challenges in these important fields.

From high-throughput experimental methodologies to computational and theoretical approaches, the articles are intended to introduce physicists, chemists, computer scientists, biologists and immunologists to the idea of a systems biology approach to understanding the mammalian immune system and cancer processes.

Attention was given to works that developed more effective approaches to the treatment of proinflammatory disease and cancer. The strong interdisciplinary focus covers biological systems at levels ranging from a few molecules to the entire organism.

Specific focus domains include:

Innate and adaptive immunity, cancer and cancer stem cell, genomic, proteomic and metabolic analysis, imaging, biophysics of immune and cancer response, computational modeling, non-linear analysis, statistical analysis, translational and disease models

Types of articles:

Viewpoint, commentaries, research letters, research articles, review and methodologies

**EDITORIAL** published: 02 July 2014 doi: 10.3389/fphys.2014.00249

## Advances in systems immunology and cancer

### *Kumar Selvarajoo<sup>1,2</sup>\**

*<sup>1</sup> Systems Immunology, Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan*

*<sup>2</sup> Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan*

*\*Correspondence: kumar@ttck.keio.ac.jp*

#### *Edited and reviewed by:*

*Raina Robeva, Sweet Briar College, USA*

**Keywords: systems biology, high dimensional data, immunology, cancer, plasticity, computational biology, statistics, nonparametric**

The last two decades have generated numerous studies that show the close link between immune response and cancer progression in the mammalian system. In parallel, we have also witnessed significant progress in systemic approaches, such as high-throughput, multi-dimensional and dynamical analyses, in tackling biological complexities. We took this opportunity to organize a research topic that encompasses the current advances in immunology and cancer. The intention is to emphasize the importance of a holistic view, and how such an outlook can help shape the future of biological research. In total, our topic consists of 10 articles: five reviews, three research articles, and two perspectives.

Owens and Naylor introduce the current understanding of cancer heterogeneity and stemness (Owens and Naylor, 2013). They survey a breadth of recent literature that points to the presence of breast cancer stem cells (CSCs), which are responsible for mediating metastasis and are resistant to both radiation and chemotherapies. Although the classification of CSCs is currently based on the expression levels of the cell surface markers CD44+CD24− and on aldehyde dehydrogenase (ALDH) enzyme activity, they note that the heterogeneity of single cancer cells makes this classification a nontrivial process. Thus, they call for more mechanistic approaches to elucidate the origins of CSCs, so that more targeted novel therapies can be developed.

In another survey of cancer mechanisms, Catalán et al. discuss the importance of understanding the connection between adipose tissue immunity and cancer (Catalán et al., 2013). They first cite numerous works that showed obesity-related chronic inflammation and, next, mention others that have demonstrated increased levels of immune cells and proinflammatory mediators in the expanded adipose tissue. Finally, they note specific obesity-associated adipokines that can promote tumor growth. Although the mechanistic links between obesity and cancer remain unclear, more systemic analyses could reveal better hints in the future.

Sangdun Choi and colleagues present a detailed update on the different structures of the crucial innate immune pattern recognition receptors, namely the Toll-like receptors (TLRs) (Manavalan et al., 2011). There are 13 known mammalian TLRs to date; however, the details of TLR12 and TLR13 remain largely unclear. Here, the authors cover the details of TLR1-11, especially their structures, to understand the interactions of TLRs with their ligands and activators. They also argue that 3-D molecular simulations can be useful for predicting unknown interactions between TLRs and other possible novel interacting partners.

Remaining on the topic of TLRs, Wenzel et al. developed an intelligent algorithm for automated image analysis to investigate the differential roles of the adaptor molecule MyD88 and MAP kinase activation in early and late immune responses, which influence the spatial movement of macrophages (Wenzel et al., 2011). The novel approach is able to track cell spreading after ligand stimulation more accurately, and with a significant improvement in processing time, compared with the manual techniques that are commonly adopted. Their main findings indicate that MyD88 is key for late spreading of macrophages while MAP kinase p38 is crucial for early spreading.

Apart from innate immunity, another important aspect of our immune response is the orchestration of adaptive immunity. T cells are lymphocytes that are central to the adaptive responses. In order to perform their specialized tasks, T cells need to differentiate into different lineages that execute distinct responses. In the unstimulated naïve form, T cells exist mainly as two subtypes depending on their surface markers: CD4+ and CD8+ T cells. Ganusov and colleagues reviewed the differentiation lineages taken by CD4+ T cells on encountering MHC class II found on the surface of antigen-presenting cells such as macrophages or dendritic cells (Magombedze et al., 2013). Mainly, they emphasize the functional plasticity of CD4+ T cells, and argue that understanding this will help treat diseases such as autoimmune diseases and allergic reactions, where differentiated T cells with elevated activity (e.g., T helper 17 or Th17 cells) may be reprogrammed to reach a different attractor state, or to return to the naïve form, that will not be injurious to the host. They acknowledge that computational or mathematical models can be useful for predicting how one could convert a particular T cell subset into another.

A subsequent manuscript by Blair et al. reviews some of the most common mathematical and statistical approaches used for immune and cancer systems biology at different scales of biological modularization (Blair et al., 2012). Next, Hiroi and colleagues present a method to optimize model parameters where experimental data are either sparse or noisy (Hiroi et al., 2014). They tested their method on well-established data on c-Myc and E2F transcriptional processes. The following article by Oyama and colleagues briefly discusses recent high-throughput phosphoproteomics research (Kozuka-Hata et al., 2012). They describe the basic terminology used and also highlight the importance of such methods for the development of large-scale signal transduction models for systemic interpretation of EGF signaling, TLR signaling or any other pathways of interest.

Fitting with the theme of adopting systemic approaches for understanding immune and cancer responses is the paper by Campbell et al. (2011). Here, they studied the distinct roles of CD8+ T cells in the pathogenesis of cancer. Using *in vivo* derived quantitative data of tumor-promoting Tag-expressing mice cells encountering CD8+ T cells, they developed a computational model to investigate the interaction pathways. Remarkably, using a simple ordinary differential equation model, the responses of CD8+ T cells to different perturbations *in silico* were consistent with matched experiments. However, from the model, it became clear that the proliferation and decay rates of CD8+ T cells were strongly constrained and hence, Tag-expressing mice cells become tolerant to tumors. Knowing such information *a priori* should help researchers understand and possibly avoid poor targets for regulating cancer progression.
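To make the idea of proliferation and decay rates in an ordinary differential equation concrete, the following is a minimal sketch of a generic one-population model, dT/dt = (p − d)·T. It is not the model of Campbell et al. (2011); the function names, rate values and step size are all assumptions chosen for illustration only.

```python
import math


def simulate_cd8(initial_cells, proliferation_rate, decay_rate, days, dt=0.01):
    """Forward-Euler integration of dT/dt = (p - d) * T.

    All parameters are hypothetical illustration values, not fitted rates.
    """
    cells = float(initial_cells)
    steps = int(round(days / dt))
    for _ in range(steps):
        cells += dt * (proliferation_rate - decay_rate) * cells
    return cells


def closed_form(initial_cells, proliferation_rate, decay_rate, days):
    """Exact solution T(t) = T0 * exp((p - d) * t), used to check the integrator."""
    return initial_cells * math.exp((proliferation_rate - decay_rate) * days)


if __name__ == "__main__":
    # Hypothetical rates (per day): decay slightly outpaces proliferation.
    print(simulate_cd8(1000, 0.3, 0.5, days=10))
    print(closed_form(1000, 0.3, 0.5, days=10))
```

In this toy setting only the difference between the two rates determines whether the population expands or contracts, which is one simple way in which a model can tightly constrain the admissible pairs of rates.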

Finally, we conclude our collection with an interview with a prominent Japanese physicist, Kaneko (2011), who has switched his interest from pure theory to understanding complex living systems. In the article, he describes the reason behind his renewed interest, and the challenges facing theoreticians in biology. In summary, we believe the articles in the "Advances in Systems Immunology and Cancer" research topic, or e-book, will stimulate continued interest in the development and use of multidisciplinary approaches to tackle complex diseases.

### **ACKNOWLEDGMENT**

The author thanks co-editors Masaru Tomita and Masa Tsuchiya for jointly hosting the research topic.

### **REFERENCES**


dynamic modeling approach. *Front. Physiol.* 2:32. doi: 10.3389/fphys.2011.00032


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 June 2014; accepted: 15 June 2014; published online: 02 July 2014.*

*Citation: Selvarajoo K (2014) Advances in systems immunology and cancer. Front. Physiol. 5:249. doi: 10.3389/fphys.2014.00249*

*This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology.*

*Copyright © 2014 Selvarajoo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

## Breast cancer stem cells

### *Thomas W. Owens and Matthew J. Naylor\**

*Discipline of Physiology, School of Medical Sciences and Bosch Institute, The University of Sydney, Sydney, NSW, Australia*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Zhiqun Zhang, Banyan Biomarkers Inc, USA Guanglong Jiang, Capital Normal University, China*

#### *\*Correspondence:*

*Matthew J. Naylor, Discipline of Physiology, School of Medical Sciences and Bosch Institute, The University of Sydney, Room E212, Anderson Stuart Building (F13), Camperdown, Sydney, NSW 2006, Australia e-mail: matthew.naylor@sydney.edu.au*

Cancer metastasis, resistance to therapies and disease recurrence are significant hurdles to successful treatment of breast cancer. Identifying the mechanisms by which cancer spreads, survives treatment regimes and regenerates more aggressive tumors is critical to improving patient survival. Substantial evidence gathered over the last 10 years suggests that breast cancer progression and recurrence are supported by cancer stem cells (CSCs). Understanding how CSCs form and how they contribute to the pathology of breast cancer will greatly aid the pursuit of novel therapies targeted at eliminating these cells. This review will summarize what is currently known about the origins of breast CSCs, their role in disease progression and ways in which they may be targeted therapeutically.

**Keywords: breast cancer, cancer stem cells, transcription factors, cell fate, mammary gland**

### **INTRODUCTION**

Breast cancer is the leading cause of cancer death in women, causing extensive morbidity and psychological distress to millions globally. Encouragingly, the combination of better screening and treatment programmes has moderately improved the chances of surviving the disease, but there is still much to be done if the many women who are refractory to current therapies are to have a better chance of survival. Over the last decade breast cancer cells with stem-cell-like properties have been identified and characterized. There is now much interest around the role that these breast cancer stem cells (CSCs) have in the disease and whether they provide the key to unlocking new insight into the mechanisms driving breast cancer progression, drug resistance and recurrence.

Often described as a caricature of normal tissue development, cancer occurs when the regulation of tissue homeostasis is perturbed, resulting in the evolution of cells with increased growth and survival potential. The breast, like many other organs, is a hierarchically-organized tissue maintained by a series of stem and progenitor cells that have decreasing potency as they differentiate toward terminally-committed epithelial cells. Below, we describe briefly the normal breast epithelial hierarchy, but for comprehensive analyses we recommend (Visvader, 2009; Van Keymeulen et al., 2011; Raouf et al., 2012; Šale et al., 2013).

The breast is composed of a bilayered epithelium comprising two main epithelial cell types: luminal and basal (Watson and Khaled, 2008; Gusterson and Stein, 2012). The luminal cells line the ductal structures that will transport milk to the nipple during lactation. The basal cells surround the luminal cells and are in contact with the surrounding basement membrane that separates the parenchyma from the stromal component of the tissue. Mammary stem cells (MaSCs) share cell surface and expression profiles consistent with basal cells and are hence thought to reside within the basal compartment of the gland. Isolated several years ago through the use of cell surface expression markers, cell populations greatly enriched for MaSCs have been shown to be capable of reconstituting an entire mammary gland when transplanted into a mammary fat pad cleared of endogenous epithelium. Furthermore, serial transplants have demonstrated that the MaSCs can self-renew as well as give rise to the other cell types (Shackleton et al., 2006; Stingl et al., 2006).

Initially thought to be restricted to relatively few cell types (luminal, basal, and stem cells), the repertoire of mammary cell types has expanded over the last few years. Development of lineage-specific markers and *in vitro* functional assays has enabled the isolation of discrete sub-populations of epithelial progenitors (Raouf et al., 2012; Sheta et al., 2012). Using an alternative approach, *in vivo* lineage-tracing has recently identified previously undescribed epithelial cell types (Šale et al., 2013). In the future, these techniques will likely unearth additional levels of complexity in the epithelial cell hierarchy that will no doubt aid our understanding of breast cancer and CSCs. However, when discussing CSCs, it is imperative to highlight that they are distinct from normal stem cells.

### **DEFINING CANCER STEM CELLS**

It is important to clarify that although they share functional similarities to normal stem cells, CSCs are not necessarily derived from stem cells. A CSC is functionally defined by the ability to (1) form a tumor in immunocompromised mice, (2) self-renew shown by tumor formation in secondary mice and (3) "differentiate," i.e., produce cells with non-stem cell characteristics (McDermott and Wicha, 2010).
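The three functional criteria above can be restated as a simple conjunction: a population qualifies only if all three hold. The sketch below is purely illustrative; the `TransplantAssay` record and its field names are hypothetical and do not correspond to any published data format.

```python
from dataclasses import dataclass


@dataclass
class TransplantAssay:
    """Hypothetical summary of transplantation results for one cell population."""
    forms_primary_tumor: bool      # (1) tumor in immunocompromised mice
    forms_secondary_tumor: bool    # (2) self-renewal: tumor in secondary mice
    yields_non_stem_progeny: bool  # (3) "differentiation" into non-stem cells


def meets_csc_definition(assay: TransplantAssay) -> bool:
    """Return True only if all three functional criteria are satisfied."""
    return (
        assay.forms_primary_tumor
        and assay.forms_secondary_tumor
        and assay.yields_non_stem_progeny
    )


if __name__ == "__main__":
    # A population that initiates and propagates tumors but never yields
    # non-stem progeny fails the definition despite being tumorigenic.
    partial = TransplantAssay(True, True, False)
    print(meets_csc_definition(partial))
```

Encoding the definition this way emphasizes that marker expression plays no part in it: the inputs are functional outcomes only.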

In certain tissues, new technological advances are enabling CSCs to be studied in their primary setting, without the need for transplantation; however, comparable studies have not yet been described in the breast (Chen et al., 2012; Driessens et al., 2012; Schepers et al., 2012).

We have chosen to use the term CSC but we recognize that cells with defining features of CSCs are also referred to as tumor-initiating cells (TICs) and tumor-propagating cells. In the majority of cases, these terms refer to the same functional entity. However, TICs can also describe the cell from which the cancer originated, and CSCs may form long after the tumor was initiated. The cancer cell of origin is discussed at length elsewhere (Visvader, 2011). This review will focus on breast CSCs, their origins, pathological significance and potential therapeutic strategies to tackle them.

### **DISCOVERY OF BREAST CANCER STEM CELLS**

Historically, the hematopoietic field has led the way in the identification of stem and progenitor cells and their resulting lineages. The same was true in the CSC field, with the CSC-theory in solid tumors validated only relatively recently (Al-Hajj et al., 2003). Using cell surface markers, Al-Hajj and colleagues found that CD44+CD24−/low Lin− cells from breast cancer patients were significantly enriched for tumor forming ability in NOD/SCID mice compared with CD44+CD24+ Lin− cells. Moreover, the tumors formed by CD44+CD24−/low Lin− cells could be serially passaged (self-renew) and also reproduce the tumor cellular heterogeneity observed in the initial tumor (differentiation).

CD44 is a cell surface receptor for the extracellular matrix molecule hyaluronan that influences cell behavior by direct signaling/structural roles or by acting as a co-receptor for receptor tyrosine kinases (Ponta et al., 2003). CD24 is a cell surface glycoprotein whose level of expression has become commonly used to isolate distinct cell populations from the normal mammary gland and breast cancer cells. CD24high expression in normal human mammary gland and breast carcinoma corresponds to a differentiated gene expression signature, whereas CD44+ cells exhibit a more "stem-like" signature of gene expression (Shipitsin et al., 2007). In the mouse mammary gland, CD24−, CD24low, and CD24high expression levels correspond to populations of non-epithelial, basal and luminal epithelial cells, respectively (Sleeman et al., 2006). Functionally, the epithelial cell populations exhibited differential stem potential in mammary fat pad transplantation assays, with CD24low cells being significantly enriched for mammary gland repopulating capacity.

The combination of CD44 and CD24 expression has been used to successfully enrich for CSCs in both cell line and tumor samples, but caution must be exercised. For example, within epithelial populations CD44highCD24− was shown to mark mesenchymal-like cells that formed mammospheres and had an invasive phenotype, but the cells lacked the capacity to produce the heterogeneity of the parental cell line (Sarrio et al., 2012). Therefore, these cells did not meet all the criteria of bona fide CSCs and thus highlight the importance of functionally testing "stemness" rather than assuming that a particular combination of cell surface markers is indicative of a phenotype.

In addition to cell surface markers, other expression-based methods of CSC-enrichment have been developed. Aldehyde dehydrogenase (ALDH) activity has been identified as a method of enriching for normal human breast stem cells and CSCs (Ginestier et al., 2007). Furthermore, by combining ALDH activity with CD44highCD24− expression, the CSC fraction was refined further compared to either method alone. Interestingly, the ALDH−, CD44highCD24− population was not enriched for CSCs, demonstrating that the CD44highCD24− population retains significant heterogeneity.

Separating cell populations based on protein expression profiles of either cell surface markers or ALDH1 requires functional validation of the isolated cells to confirm their capacity as CSCs. Recently, Pece and colleagues developed a novel reciprocal approach of using function to isolate CSCs that were then used to identify new markers. By taking advantage of the ability of stem cells to survive in suspension culture, combined with their slow proliferation rate, they isolated stem cells from normal human mammary gland based on retention of a membrane-labeling dye, PKH26 (Pece et al., 2010). Gene expression analysis of the PKH26+ cells revealed a novel set of stem cell markers (i.e., DNER and DLL1) that the group then used to isolate stem cells from both normal breast and tumor samples.

Due to the intra- and inter-tumor heterogeneity in cancer, it is possible that CSCs from different tumors have distinct expression profiles. Thus, isolating CSCs by function and detailing their expression profiles may prove extremely valuable where traditional markers fail.

### **ORIGINS OF CANCER STEM CELLS**

The stem cell characteristics of CSCs call into question the cell type from which they derive. Two potential models of CSC formation are: (1) the tumor cell of origin had stem cell or progenitor properties, or (2) the tumorigenesis process yields cells distinct from the cells of origin that are capable of reconstituting the tumor (**Figure 1**).

The simple model of hierarchical tissue organization suggests that as cells differentiate along a particular lineage, they lose the potential to give rise to multiple cell types and are therefore less likely to be able to act as CSCs. Normal stem cells already have many of the properties associated with CSCs. Moreover, the long-lived nature of stem cells allows more time for multiple genetic lesions to be acquired. Therefore, it is possible that CSCs originate from tissue stem cells.

Studies demonstrating an increased risk of breast cancer in children exposed to radiation suggest that the cells subject to transformation would be long-lived stem or progenitor cells (Miller et al., 1989; Modan et al., 1989). Much more recently, luminal progenitor cells were identified as the likely cell of origin in *BRCA1* driven tumors (Lim et al., 2009; Molyneux et al., 2010; Proia et al., 2011). Cells displaying the markers of stem cells have also been identified in early DCIS lesions, suggesting that possible CSCs are present at early stages of tumorigenesis (Pece et al., 2010). If the transformed cell has stem/progenitor properties then it is understandable that this could give rise to CSCs, as well as the non-CSCs that make up the majority of the tumor.

The model in which the cancer cell of origin is responsible for the properties of the CSC would be encouraging when it comes to designing therapies to tackle the disease. If the tumor behaves in a rigid linear hierarchy, with relatively few stem cells giving rise to the majority of "differentiated" tumor cells, then therapies that can kill CSCs or drive them to differentiate would remove the ability of the tumor to regenerate following therapy.

However, cancer is a disease that forms over many years, so even if the original transformation event had occurred in a stem-like cell, the tumor that presents at the clinic is likely to be a much more evolved and heterogeneous entity than a linearly hierarchical tissue. A linear hierarchy in cancer would also not explain why recurring tumors are resistant to therapy, as successive rounds of tumor growth may be expected to produce similarly-sensitive progeny. In this sense, it appears that tumors have also evolved mechanisms to be self-sustaining even if their original CSC pool is destroyed, potentially via the generation of CSCs from non-stem cells.

### **FORMATION OF CSCs FROM NON-CSCs**

A range of breast cancer cell lines are now known to be composed of a heterogeneous mixture of cells. A proportion of the cells act as CSCs by being able to give rise to all the cell types within that line, while the other cells show reduced ability to act as CSCs. There is also a suggestion of heterogeneity within the CSC populations themselves (Wong et al., 2012). Significantly, several studies have now demonstrated that cells have the capacity to interconvert between phenotypes.

Breast cancer cell lines SUM159 and SUM149 sorted into stem-like, basal and luminal populations demonstrated the ability to transition between these cell states to maintain the overall heterogeneity of the parental line (Gupta et al., 2011). This stochastic cell state transition enabled purified populations to reconstitute the proportions of the parental cell line within 11 days of sorting (Gupta et al., 2011). Piggott and colleagues used the mammosphere assay to demonstrate that MDA-MB-231, BT474, SKBR3, and MCF7 cells all contained self-renewing mammosphere forming units (MFUs). Interestingly, BT474 cells depleted of MFUs reacquired these progenitor-like cells following 4 weeks in culture (Piggott et al., 2011). *In vitro*, in Ca1a, MCF7, SUM159, and MDA-MB-231 breast cancer lines, sorted non-invasive CD44+CD24+ cells could give rise to invasive CD44+CD24− cells (and vice versa), even when initially plated as single cell clones (Meyer et al., 2009).
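One way to see how stochastic transitions between cell states can restore parental proportions from a purified population is a discrete-time Markov chain over the three states. The sketch below is an illustration under stated assumptions: the transition probabilities in `TRANSITIONS` are hypothetical values chosen for the example, not measurements from any of the studies cited, and the one-day time step is likewise assumed.

```python
STATES = ("stem-like", "basal", "luminal")

# Hypothetical per-day transition probabilities (each row sums to 1).
# Row i gives the probabilities that a cell in STATES[i] is found in each
# state one day later. These are NOT measured values.
TRANSITIONS = [
    [0.60, 0.10, 0.30],
    [0.05, 0.90, 0.05],
    [0.01, 0.01, 0.98],
]


def step(proportions, transitions):
    """Advance the state proportions by one time step (row vector times matrix)."""
    n = len(proportions)
    return [
        sum(proportions[i] * transitions[i][j] for i in range(n))
        for j in range(n)
    ]


def evolve(proportions, transitions, days):
    """Apply `step` repeatedly and return the proportions after `days` steps."""
    current = list(proportions)
    for _ in range(days):
        current = step(current, transitions)
    return current


if __name__ == "__main__":
    # Start from three "sorted" populations, each 100% one state.
    for i, name in enumerate(STATES):
        start = [1.0 if j == i else 0.0 for j in range(len(STATES))]
        print(name, [round(p, 3) for p in evolve(start, TRANSITIONS, 365)])
```

Under these assumed probabilities, every purified starting population drifts toward the same equilibrium mixture, which depends only on the transition matrix and not on the starting composition. How quickly that happens is set by the chosen probabilities, so the sketch makes no claim about the time scale reported in the text.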

The generation of CSCs from non-CSCs has been confirmed *in vivo* using transplantation assays. Clones of noninvasive CD44+CD24+ sorted cells from Ca1a, ZR75.1 and MCF7 breast cancer lines transplanted into immunocompromised mice gave rise to molecularly heterogeneous tumors that exhibited local invasion (Meyer et al., 2009). Moreover, the stem-like-depleted basal and luminal populations of SUM159 cells were also able to transition to stem-like cells during tumor formation in NOD/SCID mice. However, it is interesting that the non-stem-like SUM159 populations required co-injection with irradiated parental SUM159 cells for tumor formation to occur. This co-injection requirement suggests that additional factors to those in the homogeneous luminal or basal populations are required for conversion to stem-like phenotypes (Gupta et al., 2011).

Recent evidence suggests that the ability of the cancer cells to trans-differentiate is related to the transformation process. Using an inducible Src oncogene to drive transformation of MCF10A cells, CSC-like cells were generated during the transformation process within 16–24 h of Src activation (Iliopoulos et al., 2011). Furthermore, once generated, the relative proportion of CSCs was maintained over several weeks in culture. Isolated CSCs readily formed non-CSCs whereas the reciprocal spontaneous conversion did not occur. However, medium from CSCs was found to drive non-CSCs to form CSCs, and this was dependent on IL-6 (Iliopoulos et al., 2011).

Chaffer and colleagues demonstrated that hTERT-immortalized HMECs gave rise to a population of floating cells they term HME-flopcs (Chaffer et al., 2011). CD44low HME-flopcs were able to spontaneously convert to CD44high cells that had stem-like properties. Moreover, transformation of the HME-flopcs with SV40 and H-ras increased the efficiency with which the conversion to CD44high cells occurred.

Despite the growing evidence of the ability of non-CSCs to produce CSCs, it is noteworthy that in the parental populations the proportion of CSCs remains constant over time. Even when sorted into distinct populations, the sorted cells eventually recapitulate the proportions of cells originally present in the parental line. Tumor molecular expression profiles remain constant during disease progression, suggesting a level of stability within a population of tumor cells (Ma et al., 2003; Weigelt et al., 2003). Moreover, similar molecular profiles of primary tumor and metastases suggest ancestors are common rather than genetically distinct (Sorlie, 2004). This supports a hypothesis that perhaps paracrine signals mediate a level of homeostatic control over the proportions of different cell types present within a tumor.

### **CSC AND EPITHELIAL-TO-MESENCHYMAL TRANSITION**

Inter-conversion of CSCs and non-CSCs (spontaneously or otherwise) means that CSCs do not behave like classical stem cells. The question remains of how CSCs could arise from non-CSCs. Epithelial-to-mesenchymal transition (EMT) is a natural process that occurs during development and is a method by which cancer cells metastasize during cancer progression (Thiery and Sleeman, 2006). EMT is also thought to be a mechanism by which CSCs form.

Induction of EMT in normal human mammary epithelial (HMLE) cells by expression of Snail, Twist or treatment with TGFβ1 caused the majority of cells to adopt the CD44+CD24low expression profile consistent with CSCs. There was also a significant increase in the number of mammosphere forming cells following EMT (Mani et al., 2008; Morel et al., 2008). In addition to EMT driving cells to acquire stem cell characteristics, naturally occurring stem cell fractions of normal mouse and human mammary epithelium cells as well as human neoplastic samples expressed significant levels of EMT markers (Mani et al., 2008).

The mechanism by which EMT induces CSC formation may involve the transcription factor FOXC2, which was upregulated in immortalized normal human mammary epithelial (HMLE) cells in response to multiple EMT-inducing stimuli (Mani et al., 2007). The CSC-characteristics acquired through EMT were attenuated by suppression of FOXC2 expression (Hollier et al., 2013). Furthermore, FOXC2 was upregulated in CSC-enriched populations and expression of FOXC2 in V12H-Ras-transformed HMLE cells was sufficient to drive EMT and increase their tumor forming and metastatic potential in transplants (van Vlerken et al., 2013).

The ability of EMT-driving factors to induce CSC formation is likely to be dependent on the cell type in which EMT occurs. Slug is a transcription factor that can drive EMT and its expression is enriched in MaSCs. Exogenous expression of Slug in luminal progenitor cells was sufficient to drive them to a more stem-like phenotype, whereas Slug expression in differentiated luminal cells failed to do so (Guo et al., 2012). Interestingly, co-expression of Sox9 with Slug could induce differentiated luminal cells into a stem-like state by activating distinct gene sets. Moreover, Snail, but not Twist, could substitute for Slug and cooperate with Sox9 in driving differentiated luminal cells into stem-like cells. Therefore, EMT contributes to, but is not sufficient for, the non-stem cell to stem-cell transition, and not all EMT-driving factors elicit the same effect (Guo et al., 2012).

Analysis of non-tumorigenic mammary epithelial cell lines (MCF12A, MCF10-2A, and MCF10A) and immortalized Myo1089 cells using EpCAM and CD49f expression levels identified heterogeneous cell populations. The EpCAM+CD49f+ cells had an epithelial morphology with an expression profile characteristic of luminal progenitors, while EpCAM−CD49fmed/low cells were fibroblastic in appearance and expressed genes associated with EMT (Twist1/2 and Slug) (Sarrio et al., 2012). Interestingly, although the epithelial (EpCAM+) Myo1089 cells gave rise to mesenchymal-like cells that were more invasive and could form mammospheres, it was the epithelial cells that had higher ALDH1 activity and could recapitulate the heterogeneous cell populations seen in the parental line. Therefore, in this instance EMT was associated with a loss of stem-cell capacity, which re-iterates the importance of determining "stemness" functionally (Sarrio et al., 2012).

The reprogramming of cancer cells into CSCs by EMT-associated transcription factors highlights the importance of understanding how transcription factor networks regulate cell fate determination in breast cancer (Kalyuga et al., 2012). The power of transcription factor-mediated cell fate control is most notably demonstrated by the creation of induced pluripotent stem (iPS) cells by the introduction of Oct4, Sox2, c-Myc and Klf4 into differentiated adult cells (Takahashi and Yamanaka, 2006). The same factors that induce pluripotency in normal differentiated cells may also be involved in the formation of CSCs. Non-tumorigenic MCF10A cells transduced with Oct4, Sox2, c-Myc, and Klf4 formed iPS-like cells that upon differentiation adopted a CSC phenotype (Nishi et al., 2013). These induced CSC-like-10A cells were largely CD44+CD24low, expressed ALDH1 and had high tumorigenicity *in vivo*. In metastatic breast cancer cells, Klf4 expression increased the proportions of CD44+CD24low and mammosphere-forming cells (Okuda et al., 2013). Oct4 alone was able to transform primary HMLE cells into cells capable of initiating tumors in xenografts, and Oct4 is also thought to be the downstream effector of IL-6-induced CSC formation (Beltran et al., 2011; Kim et al., 2013).

Transcription factors mediate changes in gene expression, but the action of transcription factors is also influenced through epigenetic genome modification. Epigenetic regulation of gene expression controls cell fate specification by activating or repressing genes associated with lineage commitment. Epigenetic changes are also associated with cancer progression.

In mammary epithelial cells, repressive and activating histone methylation patterns are associated with changes in gene expression during lineage determination (Pal et al., 2013). CSCs isolated from breast cancer cell lines had elevated levels of the polycomb group protein, EZH2, which catalyses histone methylation (van Vlerken et al., 2013). EZH2 knockdown by siRNA moderately reduced the CSC populations in breast and pancreatic cancer cell lines, inducing a more differentiated pattern of gene expression. Moreover, high EZH2 expression correlates with poor prognosis in breast and prostate cancer (Varambally et al., 2002; Pietersen et al., 2008).

Interestingly, the methylation patterns in mammary epithelial cells alter during pregnancy and also in ovariectomized mice, demonstrating that they are subjected to hormonal control. Furthermore, experiments in isolated epithelial cells suggested that EZH2 is induced by progesterone in a paracrine fashion (Pal et al., 2013). Thus, changes in local tumor environment could alter methylation patterns and facilitate CSC formation in relatively few generations, as it does not require further mutations to occur.

### **FACTORS INFLUENCING CSC FORMATION**

Selective pressure in a genetically unstable environment can drive selection for epigenetic or genetic changes that support survival. Factors that influence this tumor environment include infiltrating cells, hypoxia and chemotherapy, all of which have been linked to CSC development.

Co-culture of SUM159 cells with bone marrow-derived mesenchymal cells induced an expansion of the ALDH1-expressing SUM159 population (Liu et al., 2011). This expansion was due to a chemokine signaling loop between cancer-cell derived IL-6 and CXCL7 produced by ALDH+ mesenchymal cells. Moreover, co-injection of ALDH+ mesenchymal cells with SUM159 cells into NOD/SCID mice accelerated tumor growth and increased the capacity of the SUM159 cells to form secondary tumors following serial passage. Intratibial injection of mesenchymal cells demonstrated that they could augment tumor growth and home to the site of breast tumor xenografts (Liu et al., 2011).

The immune response in FVB mice to cells derived from tumors in a Her2/neu transgenic strain caused the outgrowth of Her2-negative tumors. This antigen loss effect was dependent on CD8+ T cells. Her2-negative tumor cells had reduced CD24 levels compared with the parental Her2-positive cells and were more mesenchymal in appearance and expression patterns. Moreover, these CD24−/low cells were much more tumorigenic than controls, suggesting that the CD8+ T cell-dependent immune response was inducing EMT in the cancer cells to generate CSCs (Santisteban et al., 2009).

#### **HYPOXIA**

As tumors develop, the requirement for oxygen increases, leading to regions of hypoxia. Hypoxia causes activation of hypoxia-inducible factors (HIFs), which enable cells to adapt to the low-oxygen environment. Hypoxic culture conditions (1% O2) induced an increase in the ALDH1+ proportion in breast cancer cell lines (Conley et al., 2012). Moreover, CSCs were enriched in hypoxic regions of tumor xenografts compared with normoxic regions (Conley et al., 2012). Using cycles of hypoxia and reoxygenation to model the tumor microenvironment, Louie and colleagues enriched for populations of MDA-MB-231 and BCM2 cells that were significantly more tumorigenic than the parental lines (Louie et al., 2010). The hypoxia-selected populations also had a greater proportion of CD44+CD24−/low cells. The low oxygen levels may influence the progenitor-like state of CSCs, as hypoxia blocked differentiation in MCF10A cells, possibly by maintaining greater levels of histone acetylation (Vaapil et al., 2012).

#### **CHEMOTHERAPY**

In addition to CSCs forming as a part of tumor progression, therapeutic intervention may contribute to CSC genesis. Anti-angiogenic agents sunitinib and bevacizumab, which induce hypoxia in tumors, increased the number of CSCs in breast cancer xenografts (Conley et al., 2012). The release of factors by dying tumor cells may also act to augment the CSC pool. Interleukin-8 (IL-8) levels increased in SUM159 breast cancer cells following treatment with chemotherapeutic docetaxel (Ginestier et al., 2010). Interestingly, IL-8 signaling via its receptor CXCR1 on CSCs can expand CSC numbers in breast cancer cell lines (Charafe-Jauffret et al., 2009).

In addition to dying tumor cells releasing CSC-promoting factors, chemotherapy could alter the cells' intrinsic mechanisms of preventing EMT. ER can directly suppress the EMT-driver SLUG; therefore anti-estrogen therapies may promote CSC formation by inducing EMT (Ye et al., 2008). Clearly the benefits of anti-estrogen therapies, such as tamoxifen, in prolonging patient survival are unarguable, but it is possible that under certain circumstances, initial anti-estrogen treatment may predispose the patient to recurrence of the disease.

### **PATHOLOGICAL SIGNIFICANCE OF BREAST CANCER STEM CELLS**

#### **TUMOR AGGRESSIVENESS**

Since the discovery of breast CSCs, they have been touted as critical targets for the design of future therapeutics. However, it is important to understand how CSCs influence the pathology of breast cancer so that treatments can be targeted appropriately.

Different subtypes of breast cancer are associated with different prognoses; luminal cancers offer the best chance of long-term survival, whereas basal, claudin-low and Her2-positive cancers are associated with a much shorter life expectancy. Gene set enrichment analysis demonstrated similarity between the expression profile of stem cells and basal breast cancers (Pece et al., 2010). The proportion of cells expressing stem-cell markers was approximately 3–4-fold higher in poorly differentiated compared with well-differentiated breast tumors. TAM-resistant ER-positive breast cancers are more basal-like, showing reduced E-Cadherin expression, increased CD44 and NFκB expression along with increased motility (Hiscox et al., 2009).

A CSC gene signature from comparative analysis of CD44+CD24− sorted tumor cells and cancer mammospheres showed that this signature was associated with claudin-low breast cancers, suggesting that claudin-low tumors are enriched for CSCs (Creighton et al., 2009). Moreover, the expression profile of the CSC-regulator, FOXC2 was enriched in claudin-low tumors and cell lines (Hollier et al., 2013). Her2 expression has been shown to correlate with ALDH1 expression in human breast cancer. ALDH1 levels also correlated with poor clinical outcome and proved to be an independent prognostic marker (Ginestier et al., 2007; Morimoto et al., 2009). Together, these studies suggest a link between CSCs and the aggressiveness of the disease.

In inflammatory breast cancer (IBC), ALDH1 expression correlated with histological grade but interestingly not with the CD44highCD24<sup>−</sup> phenotype (Ginestier et al., 2007). This may be due to differences in analyzing CD44 and CD24 expression by immunohistochemistry rather than FACS or that CD44/CD24 may not be suitable markers of CSCs in IBC. A second study using IHC to assess prognostic significance of CD44 and CD24 expression in breast cancer also failed to find a correlation between the CD44highCD24<sup>−</sup> phenotype and tumor progression, although there was suggestion of a correlation with bone metastasis (Abraham et al., 2005). These discrepancies between FACS and IHC studies could be due to the different techniques employed or other factors, such as the source of the tumor cells being analyzed.

There is accumulating evidence that CSCs are involved in the metastatic progression of breast cancer. This is particularly significant given that the majority of cancer deaths are due to secondary lesions that have disseminated from the initial tumor. Immunohistochemistry of breast cancer cells isolated from bone marrow using the CD44highCD24−/low phenotype suggests that there may be a much greater proportion of CSCs in metastatic tumors compared with the primary site (Balic et al., 2006). In IBC models, CSCs isolated by ALDH activity were shown to mediate metastasis in both *in vitro* and xenograft studies (Ginestier et al., 2007). Moreover, detection of ALDH+ cells in tumors from IBC patients correlated with both early onset of metastasis and overall decreased survival (Ginestier et al., 2007). CSCs have also been proposed to alter tissue architecture by driving epithelial remodeling. This disruption of normal tissue structure could be another method by which CSCs contribute to metastasis (Parashurama et al., 2012).

#### **CANCER RECURRENCE FOLLOWING THERAPY**

Resistance of CSCs to chemotherapy/radiotherapy is a possible mechanism to explain breast cancer recurrence. CSCs are enriched following neoadjuvant chemotherapy, suggesting that CSCs are more resistant to therapy than the bulk of the tumor (Yu et al., 2007; Li et al., 2008). Treatment of both SUM159 and SUM149 cells with chemotherapeutics (paclitaxel or 5-fluorouracil) led to enrichment in the proportion of stem-like cells (Gupta et al., 2011). CSC-like MCF7 cells were resistant to several commonly used chemotherapeutics (Adriamycin, Etoposide, 5-Fluorouracil, cis-Platinum, and Methotrexate), although they were more sensitive to Taxol (Creighton et al., 2009; Sajithlal et al., 2010).

The association between EMT and CSCs is also relevant to chemo-resistance, as cells undergoing EMT are more resistant to chemotherapeutics (Li et al., 2009). Cells isolated from Her2 antigen loss tumors that had undergone EMT had upregulated expression of protein pumps associated with drug resistance (BCRP and PGP). Accordingly, these cells were protected from chemotherapeutics mitoxantrone and etoposide. The mesenchymal tumor cells also had increased levels of DNA repair enzymes and were resistant to ionizing radiation (Santisteban et al., 2009).

#### **TUMOR MAINTENANCE**

CSCs are often referred to as being responsible for "maintaining" the tumor. In some respects, this maintenance role is an extrapolation of data showing that CSCs can recapitulate tumors of heterogeneous cell types over several passages in immunocompromised mice. Few studies have examined whether elimination of CSCs actually causes spontaneous regression in the primary setting, which could be expected if the CSCs were maintaining the tumor. Part of the reason for this is the lack of models in which to test the maintenance of tumors by CSCs.

Seminal lineage tracing experiments in both the skin and intestine demonstrated that during early transformation the tissues retain a cellular hierarchy akin to the normal tissue (Driessens et al., 2012; Schepers et al., 2012). Notably, in contrast to benign skin tumors, squamous cell carcinomas had an increased proportion of CSCs, which had reduced propensity to differentiate. These studies demonstrate that CSCs exist early in the tumorigenesis process, but still do not delineate whether these early CSCs are maintaining the tumor. In a mouse model of glioblastoma, Chen and colleagues demonstrated the presence of quiescent CSCs that could expand and re-populate the tumor following chemotherapy with temozolomide (TMZ). Eradication of these CSCs using a thymidine kinase transgene and ganciclovir (GCV) significantly improved survival. Moreover, the tumors in the GCV-treated mice had reduced levels of proliferation and were less invasive, suggesting that the CSCs were indeed maintaining tumor progression (Chen et al., 2012).

### **THERAPEUTIC TARGETS IN CSCs**

The growing evidence that CSCs contribute to cancer progression and recurrence shows that developing anti-CSC therapies will likely improve chances of long-term survival of cancer patients. A proof of principle for targeting CSCs has been demonstrated in AML where the anti-leukemia drug TDZD-8 selectively killed leukemia stem cells while not affecting normal hematopoietic stem and progenitor cells (Guzman et al., 2007).

Many of the pathways currently under investigation as potential therapeutic targets in CSCs have been shown to regulate normal stem and progenitor cells, so finding methods to selectively target the pathways in cancer will be critical. Two developmental pathways that have received much recent attention as cell fate regulators in the breast are Notch and Wnt (Gu et al., 2013; Meier-Abt et al., 2013; Regan Joseph et al., 2013; Šale et al., 2013). It is therefore not surprising that they may be therapeutic targets in CSCs. In a model of Notch1-driven mammary tumorigenesis, inhibition of Notch signaling induced tumor regression and reduced tumorsphere formation *in vitro* (Simmons et al., 2012). Upregulation of the Notch ligand, Jagged2 in breast cancer cells and bone marrow derived cells in response to hypoxia led to an expansion of CSCs (Xing et al., 2011). Notch 4 activity is increased in breast CSCs and Notch and Wnt signaling were found to mediate radio-resistance in breast progenitor and CSCs (Phillips et al., 2006; Woodward et al., 2007; Harrison et al., 2010). The Wnt co-activator Pygo2 augmented mammosphere formation in MDA-MB-231 breast cancer cells (Chen et al., 2010). Conversely, deletion of pygo2 in MMTV-Wnt1 tumor cells reduced both mammosphere and tumor-forming capacity (Watanabe et al., 2013).

The potential therapeutic benefit of targeting Wnt-signaling was demonstrated by the identification of Salinomycin in a screen for CSC-inhibitors. Salinomycin preferentially eliminated CSCs by inhibiting Wnt signaling and inducing apoptosis (Gupta et al., 2009; Fuchs et al., 2009; Lu et al., 2011; Tang et al., 2011). Salinomycin also killed iCSCL-10A cells that were resistant to Taxol and Actinomycin D (Nishi et al., 2013). Another drug that appears efficacious against CSCs is the anti-diabetic drug Metformin. Metformin targets CSCs and can act synergistically with chemotherapy drugs to reduce CSC numbers and tumor growth (Hirsch et al., 2009; Vazquez-Martin et al., 2011). Subsequent work demonstrated that Metformin might act by inhibiting nuclear translocation of NF-κB and phosphorylation of STAT3 in CSCs compared with non-CSCs (Hirsch et al., 2013). Metformin may therefore be a candidate to treat TAM-resistant ER+ cancers that have been shown to upregulate NF-κB (Hiscox et al., 2009). Significantly, metformin treatment overcame Herceptin™ resistance in a Her2-positive xenograft model (Cufi et al., 2012).

Cell surface receptors make attractive targets for therapeutic design, as they are accessible to drugs. The growth factor receptor PDGFR-β was shown to lie downstream of FOXC2 in cells induced to undergo EMT and both proteins were expressed in CSC-enriched populations of SUM159 and HMLER cells (Hollier et al., 2013). The PDGFR-β inhibitor sunitinib reduced tumor growth and metastasis of FOXC2-expressing tumor cells (Hollier et al., 2013). Thus, sunitinib may be effective against CSCs that arise as a result of EMT. FGF-receptor 2 (FGFR2) was enriched in CSCs isolated from an MMTV-PyMT mouse breast cancer model (Kim et al., 2013). Moreover, FGFR2-expressing human tumor cells were more tumorigenic than FGFR2-negative cells in xenograft experiments. Treatment with the FGFR inhibitor, TKI258, reduced the proportion of CSCs in MMTV-PyMT-driven tumors and delayed tumor growth (Kim et al., 2013).

The enrichment of CSCs that occurs under certain conditions suggests that CSCs are capable of increasing their numbers by symmetric division. Blocking this mechanism of CSC expansion may slow tumor progression and allow more successful elimination of the CSC pool. By restoring p53 function in Her2 over-expressing cells, asymmetric cell division in the CSCs was restored, leading to reduced tumor formation (Cicalese et al., 2009). Hedgehog (Hh) signaling via Bmi1 increased the frequency of mammosphere-forming cells and this effect was reversed using the Hh inhibitor cyclopamine (Liu et al., 2006). Suppression of cFLIP eliminated CSCs in response to TRAIL, reducing formation of primary tumors in transplant models and almost completely preventing metastasis (Piggott et al., 2011). cFLIP suppression also reduced MFU-enrichment following passage of mammospheres, suggesting symmetric CSC division was compromised.

The plasticity of tumor cells is another hurdle that needs to be overcome in order to prevent *de novo* CSC formation from non-CSCs. By blocking Activin/Nodal signaling, the ability of CD44+CD24<sup>+</sup> (non-stem) cells to give rise to CD44+CD24low (CSC) progeny was also blocked (Meyer et al., 2009).

Therapeutic ablation of specific cell populations is likely to only provide temporary relief from tumor progression. Moreover, as some therapies appear to support CSC production, it will be necessary to tackle cancer in a multi-pronged approach, targeting both CSCs and non-CSCs. The CXCR1 inhibitor repertaxin killed bulk tumor cells by upregulating Fas expression and also prevented IL-8 signaling through CXCR1 to kill the CSCs (Ginestier et al., 2010). Combining GCV and TMZ to target both CSCs and non-CSCs significantly reduced the tumor burden compared with GCV treatment alone (Chen et al., 2012). Unfortunately, the outgrowth of cells that had suppressed the TK transgene precluded the authors from determining if there was a significant benefit to overall survival.

A problem with current cancer therapies is that they have been tested, selected and approved based on the ability to reduce tumor size without testing the effect on CSCs. Therefore, in addition to developing drugs that target CSCs, it will be necessary to develop new assays that can detect changes in CSC function that alone may not necessarily cause a reduction in tumor size. The efficacy of CSC-targeted therapeutics could also be determined by examining cancer recurrence in patients treated with combined drug regimens.

### **SUMMARY**

There is now little doubt that cancer cells with the properties of stem cells exist within heterogeneous populations and that these CSCs have tumor-forming capacity. However, the role that these cells have in the formation and progression of the tumor in the primary setting is still unclear, and suitable models will need to be developed for this to be delineated. The mechanisms of CSC formation will require particular attention if CSCs are to be successfully eliminated from patients. Finally, new assays that can detect the efficacy of targeting CSCs are essential if CSC-therapies are to make it to the clinic.

### **ACKNOWLEDGMENTS**

Financial support was provided by the Cancer Council NSW, National Breast Cancer Foundation of Australia, National Health and Medical Research Council of Australia and Prostate Cancer Foundation of Australia.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 29 May 2013; accepted: 03 August 2013; published online: 27 August 2013.*

*Citation: Owens TW and Naylor MJ (2013) Breast cancer stem cells. Front. Physiol. 4:225. doi: 10.3389/fphys.2013.00225*

*This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology.*

*Copyright © 2013 Owens and Naylor. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### *Victoria Catalán1,2\*, Javier Gómez-Ambrosi 1,2, Amaia Rodríguez 1,2 and Gema Frühbeck1,2,3*

*<sup>1</sup> Metabolic Research Laboratory, Clínica Universidad de Navarra, Pamplona, Spain*

*<sup>2</sup> CIBER Fisiopatología de la Obesidad y Nutrición, Instituto de Salud Carlos III, Pamplona, Spain*

*<sup>3</sup> Department of Endocrinology and Nutrition, Clínica Universidad de Navarra, Pamplona, Spain*

#### *Edited by:*

*Masa Tsuchiya, Keio University, Japan*

#### *Reviewed by:*

*Sudipto Saha, Bose Institute, India*

*Satyaprakash Nayak, Pfizer Inc., USA*

#### *\*Correspondence:*

*Victoria Catalán, Metabolic Research Laboratory, Clínica Universidad de Navarra, Avda. Pío XII, 36, 31008 Pamplona, Spain e-mail: vcatalan@unav.es*

Inflammation and altered immune response are important components of obesity and contribute greatly to the promotion of obesity-related metabolic complications, especially cancer development. Adipose tissue expansion is associated with increased infiltration of various types of immune cells from both the innate and adaptive immune systems. Thus, adipocytes and infiltrating immune cells secrete pro-inflammatory adipokines and cytokines providing a microenvironment favorable for tumor growth. Accumulation of B and T cells in adipose tissue precedes macrophage infiltration causing a chronic low-grade inflammation. Phenotypic switching toward M1 macrophages and Th1 T cells constitutes an important mechanism described in the obese state correlating with increased tumor growth risk. Other possible synergic mechanisms causing a dysfunctional adipose tissue include fatty acid-induced inflammation, oxidative stress, endoplasmic reticulum stress, and hypoxia. Recent investigations have started to unravel the intricacy of the cross-talk between tumor cell/immune cell/adipocyte. In this sense, future therapies should take into account the combination of anti-inflammatory approaches that target the tumor microenvironment with more sophisticated and selective anti-tumoral drugs.

**Keywords: adipose tissue, inflammation, immune cells, adipokines, angiogenesis, hypoxia, macrophages, tumor growth**

### **INTRODUCTION**

The incidence of obesity and its associated disorders is increasing at an accelerating and alarming rate worldwide (Flegal et al., 2012; Frühbeck et al., 2013). Relative to normal weight, obesity is associated with significantly higher all-cause mortality (Frühbeck, 2010; Flegal et al., 2013). Body mass index (BMI) is the most widely used diagnostic tool in the current classification system of obesity and is frequently used as an indicator of body fat percentage (BF). The controversy among studies (Hughes, 2013) arises in part because a wide variety of BMI cutoffs for normal weight has been applied when correlating BMI with mortality, which can yield quite diverse findings. Furthermore, in spite of its wide use, BMI is only a surrogate measure of body fat and does not provide an accurate measure of body composition (Frühbeck, 2012; Gómez-Ambrosi et al., 2012). Notably, obesity is defined as a surplus of body fat accumulation, and excess adipose tissue is a well-established metabolic risk factor for the development of obesity-related comorbidities such as insulin resistance, type 2 diabetes (T2D), cardiovascular diseases and some common cancers (Bray, 2004; Kahn et al., 2006a; Van Gaal et al., 2006; Renehan et al., 2008; Bardou et al., 2013).
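For reference, the standard definition of BMI uses only body weight and height, which helps explain why it cannot distinguish fat mass from lean mass. The numerical example that follows is hypothetical and is not taken from the cited studies.

$$
\mathrm{BMI} = \frac{\text{body weight (kg)}}{\left[\text{height (m)}\right]^{2}}
$$

For instance, a person weighing 80 kg with a height of 1.75 m has a BMI of $80 / 1.75^{2} \approx 26.1\ \mathrm{kg/m^2}$, whatever the relative contributions of adipose tissue and muscle to that weight.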

Results from epidemiological studies indicate that overweight and obesity contribute to the increased incidence of and/or death from quite diverse types of cancers, including colon, breast (in postmenopausal women), endometrium, kidney (renal cell), esophagus (adenocarcinoma), stomach, pancreas, gallbladder and liver, among others (Calle and Kaaks, 2004). The mechanisms linking excess adiposity and cancer are unclear, but obesity-associated low-grade chronic inflammation is widely accepted as an important factor in cancer pathogenesis (Catalán et al., 2011d; Hursting and Dunlap, 2013). Chronic hyperinsulinaemia as well as the alterations in the production of peptide and steroid hormones associated with obesity are other postulated mechanisms involved in cancer development (Calle and Thun, 2004). Particular attention is placed on the pro-inflammatory microenvironment associated with the obese state (Catalán et al., 2011d; Ribeiro et al., 2012; Hursting and Dunlap, 2013), specifically highlighting the involvement of obesity-associated hormones/growth factors in the cross-talk between macrophages, adipocytes, and epithelial cells in many cancers. Among the various pathophysiological mechanisms postulated to explain the link between obesity and cancer, dysfunctional adipose tissue may be a unifying and underlying factor (van Kruijsdijk et al., 2009). Understanding the contribution of obesity to growth factor signaling and chronic inflammation provides mechanistic targets for disrupting the obesity-cancer link (Harvey et al., 2011).

In this regard, obesity prevention is a major part of several evidence-based cancer prevention guidelines (Kushi et al., 2012). Recent studies exploring the effect of weight loss suggest that severe caloric restriction in humans may confer protection against invasive breast cancer (Michels and Ekbom, 2004). This protective effect includes reductions in the initiation and progression of spontaneous tumors in several tissues (Longo and Fontana, 2010). Moreover, the association between obesity and cancer is consistent with data from animal models showing that caloric restriction decreases spontaneous and carcinogen-induced tumor incidence (Dunn et al., 1997; Yun et al., 2013). Both bariatric surgery and short-term intentional weight loss have been shown to improve insulin sensitivity and inflammatory state, which have been postulated to contribute to the relationship between obesity and cancer (Sjöström et al., 2007; Cummings et al., 2012).

### **THE IMPORTANCE OF OBESITY-INDUCED CHRONIC INFLAMMATION**

Adipocytes, the principal cellular component of adipose tissue, are surrounded by connective tissue comprising macrophages, fibroblasts, preadipocytes, and various cell types included in the stromovascular fraction (Hausman et al., 2001; Nishimura et al., 2007; Cinti, 2012). Although adipocytes have been considered primarily as fat-storage depots, in recent years it has become clear that, together with other metabolically active organs, adipose tissue is a dynamic endocrine system key in the regulation of whole body energy homeostasis (Frühbeck et al., 2001a; Ahima, 2006; Sáinz et al., 2009). Indeed, mature adipocytes are involved in endocrine, paracrine, and autocrine regulatory processes (Ahima and Flier, 2000) through the secretion of a large number of cytokines, hormones and other inflammatory markers, collectively termed adipokines (Lago et al., 2007, 2009; Lancha et al., 2012). In addition to playing key roles in the regulation of lipid and glucose homeostasis, adipokines modify physiological processes, such as hematopoiesis, reproduction, and feeding behavior, and are also involved in the genesis of the multiple pathologies associated with an increased fat mass, including cancer development (Rajala and Scherer, 2003). However, adipose tissue not only secretes adipokines but also functions as a target of these pro-inflammatory mediators, expressing a wide variety of receptors for cytokines, chemokines, complement factors, and growth factors (Frühbeck, 2006a,b; Schäffler and Schölmerich, 2010).

The connection between inflammation and diabetes was suggested more than a century ago (Williamson, 1901), but the evidence that inflammation is an important mediator in the development of insulin resistance emerged only recently. The administration of tumor necrosis factor-α (TNF-α) was described to increase serum glucose concentrations (Feingold et al., 1989). The first study that established the concept of obesity-induced adipose tissue inflammation was conducted by Hotamisligil et al. (1993), demonstrating that the pro-inflammatory cytokine TNF-α mediates insulin resistance in many experimental models of obesity. Importantly, the development of adipose tissue has been associated with increased plasma levels of well-known inflammatory and acute phase proteins such as C-reactive protein, interleukin (IL)-6, IL-8, serum amyloid A (SAA) and monocyte chemotactic protein (MCP)-1 in patients and different animal models of obesity (Frühbeck et al., 1995; Wellen and Hotamisligil, 2003; Frühbeck, 2005; Gómez-Ambrosi et al., 2006; Kahn et al., 2006b; Kim et al., 2006; Catalán et al., 2007, 2008), whereas production of the anti-inflammatory and insulin-sensitizing adipokine adiponectin is reduced with increasing body weight (Kadowaki et al., 2006). In obesity, the activation of the c-Jun N-terminal kinase (JNK) and nuclear factor κB (NF-κB) transduction signals is key in the inflammation process of adipose tissue, and these pathways could interact with insulin signaling via serine/threonine inhibitory phosphorylation of IRS (Bastard et al., 2006; Gil et al., 2007). Genetic or pharmacological manipulations of these different effectors of the inflammatory response modulate insulin sensitivity in different animal models.

Recent data suggest that stromovascular cells also contribute to the secretion of inflammatory adipokines. In this sense, the infiltration of adipose tissue by immune cells is a feature of obesity, with adipose tissue macrophage (ATM) accumulation being directly proportional to measures of adiposity in both mice and humans (Weisberg et al., 2003). This evidences a role of adipose tissue as part of the innate immune system.

### **ADIPOSE TISSUE INFLAMMATION, A MICROENVIRONMENT FOR TUMORIGENESIS**

Analogously to adipose tissue, the tumor microenvironment is composed of multiple cell types including epithelial cells, fibroblasts, mast cells, and cells of the innate and adaptive immune system that favor a pro-inflammatory and pro-tumorigenic environment (Harvey et al., 2011). These inflammatory cells secrete cytokines, growth factors, metalloproteinases, and reactive oxygen species, which can induce DNA damage and chromosomal instability, thereby favoring carcinogenesis (Khasawneh et al., 2009). The abundance of leukocytes in neoplastic tissue was crucial in establishing the link between chronic inflammation and cancer development (Virchow, 1863). Now, inflammation is a well-known hallmark of cancer, and growing evidence continues to indicate that chronic inflammation is associated with increased cancer risk (Aggarwal and Gehlot, 2009).

The expanded adipose tissue constitutes an important initiator of the microenvironment favorable for tumor development (Catalán et al., 2011d) due to its ability to produce and secrete inflammatory cytokines by adipocytes or infiltrating macrophages (Xu et al., 2003). Notably, novel adipokines [lipocalin-2 (LCN-2), osteopontin (OPN) and YKL-40] related to inflammation and insulin resistance, with emerging roles in tumor development, have recently been described to be increased in adipose tissue from patients with colon cancer (Catalán et al., 2011d).

In this line, periprostatic adipose tissue of obese subjects shows a dysregulated expression of genes encoding molecules involved in inflammatory processes including antigen presentation, B cell development, and T helper cell differentiation. Moreover, subjects with prostate cancer display an altered profile of genes with great impact on immunity and inflammation in their periprostatic adipose tissue (Ribeiro et al., 2012). The up-regulation of complement factor H and its receptor in periprostatic adipose tissue from patients with prostate cancer has also been described, suggesting an inhibitory modulation of the complement activity in prostate tumor cells and evasion of attack. Other altered molecules include the B lymphocyte antigen CD20, encoded by the *MS4A1* gene, with a functional role in B-cell activation, and *FFAR2*, which encodes a protein reported to modulate the differentiation and/or activation of leukocytes (Ribeiro et al., 2012). This observation highlights the bi-directional interactions between periprostatic adipose tissue and tumor cells, which influence adipose tissue function and may induce an environment favorable to prostate cancer progression.

Clusters of enlarged adipocytes become distant from the vasculature in expanding adipose tissue leading to local areas of hypoxia and eventually necrosis. The reduction in oxygen pressure associated with adipose tissue hypoxia is considered to underlie the inflammatory response (Trayhurn et al., 2008; Ye, 2009; Trayhurn, 2013). The master regulator of oxygen homeostasis is the hypoxia-inducible factor (HIF)-1α. HIF-1α is increased in the adipose tissue of obese patients and its expression is reduced after surgery-induced weight loss (Cancello et al., 2005). It is well-documented that HIF-1α also influences both the innate and the adaptive immunity regulating functions of myeloid cells, neutrophils, macrophages, mast cells, dendritic cells, natural killer cells and lymphocytes (Eltzschig and Carmeliet, 2011). Similarly to what takes place in tumor tissue, adipose tissue hypoxia is related to the presence of macrophages, which migrate to the hypoxic regions and alter their expression profile increasing inflammatory events (Fujisaka et al., 2013). Hypoxia activation is a critical microenvironmental factor during tumor progression with oxygen concentrations in solid tumors being frequently reduced compared with normal tissues (Semenza, 2003; Jiang et al., 2011). HIF-1α and HIF-2α are overexpressed in certain solid tumors (Zhong et al., 1999; Talks et al., 2000), with these elevated levels being associated with cancer-related death in specific tumoral types of the brain (oligodendroglioma), breast, cervix, oropharynx, ovary, and uterus (endometrial) (Semenza, 2003). HIF-2α is also strongly expressed by subsets of tumor-associated macrophages, sometimes in the absence of expression in any tumor cell (Talks et al., 2000). Overall, hypoxia has effects on the function of adipocytes and appears to be an important factor in adipose tissue dysfunction in obesity increasing the risk of cancer development.

Moreover, hypoxia is a primary signal for angiogenesis (growth of blood vessels) in both physiological and pathological conditions. Angiogenesis is a physiological response that regulates adipogenesis and represents a hallmark of tumor growth (Hanahan and Folkman, 1996; Carmeliet and Jain, 2000; Cao, 2007). Adipocytes seem to regulate angiogenesis both by cell-to-cell contact and by adipokine secretion (Cao, 2007; Lemoine et al., 2013). In this regard, many cytokines produced by adipose tissue, such as leptin, TNF-α, IL-6, IL-8, vascular endothelial growth factor (VEGF) and transforming growth factor β (TGF-β), show angiogenic activities (Ferrara and Kerbel, 2005; Ye, 2009; Gómez-Ambrosi et al., 2010).

The blocking of tumor angiogenesis as an anticancer strategy has shown desirable results across multiple tumor types (Folkman, 1971; Schneider et al., 2012). Standard chemotherapy usually results in partial or total resistance after different cycles of treatment (Kerbel, 1997). Based on the hypothesis that endothelial cells have a normal complement of chromosomes and a relative genetic stability, the use of inhibitors of angiogenesis may avoid acquired drug resistance (Kerbel, 1997). Current pharmacotherapeutic options for treating obesity and related metabolic disorders remain limited and ineffective. Emerging evidence shows that modulation of angiogenesis is a possible therapeutic intervention to impair the development of obesity by regulating the growth and remodeling of the adipose tissue vasculature (Rupnick et al., 2002; Cao, 2010). Adipose tissue growth is angiogenesis-dependent (Rupnick et al., 2002) and the modulation of angiogenesis appears to have the potential to impair the development of obesity (Lijnen, 2008). Studies in mice have shown that the administration of anti-angiogenic agents prevents diet-induced or genetic obesity (Brakenhielm et al., 2004a). Genetically obese mice treated with different angiogenesis inhibitors such as TNP-470, angiostatin, endostatin, Bay-129566 (a matrix metalloproteinase inhibitor), or thalidomide showed reduced body and adipose tissue weights as well as increased apoptosis in the adipose tissue compared with control mice (Rupnick et al., 2002). In this regard, targeting a proapoptotic peptide to prohibitin in the adipose vasculature caused ablation of white fat in both diet-induced and age-related obesity (Kolonin et al., 2004). Recently, antiangiogenic treatment with antibodies blocking VEGFR2, but not VEGFR1, has been described to limit adipose tissue expansion (Tam et al., 2009).
Evaluating the effects of the different antiangiogenic agents characterized in the cancer field in obesity models *in vivo* may be an attractive approach to limit adipose tissue expansion. However, an excessive inhibition of adipose tissue expansion by impairing angiogenesis may lead to ectopic lipid storage, increased inflammation, and further deterioration of systemic insulin sensitivity (Sun et al., 2012; Lemoine et al., 2013). Moreover, adipose tissue development is a multifactorial process and it is unlikely that a single angiogenesis inhibitor will allow reduction of obesity without associated side effects (Lijnen, 2008). Thus, blocking the capacity for angiogenesis may have different outcomes, depending on the stage of obesity.

### **IMMUNE CELL TYPES PRESENT IN EXPANDED ADIPOSE TISSUE**

In cases of severe obesity, adipose tissue can constitute up to 50–60% of the total body mass, with the expanded adipose tissue being a largely uncharacterized immunological organ with distinct subpopulations of cells of the immune system (Kanneganti and Dixit, 2012). Furthermore, excess body fat is accompanied by altered immune cell function and a different expression profile of genes related to immunity in obese human subjects compared with healthy-weight individuals (Gómez-Ambrosi et al., 2004). Discrepancies in leukocyte number, subset, and activity of monocytes between lean and obese individuals have been reported (Nieman et al., 1999). Adipose tissue has been shown to exhibit a dynamic infiltration by innate and adaptive cells during the onset of insulin resistance and diet-induced obesity (Duffaut et al., 2009). The observation of infiltrated macrophages in the adipose tissue of obese patients prompted an increased interest in the interplay between immune cells and metabolism. Recent studies have revealed a growing list of immune cell types (including macrophages, lymphocytes, mast cells, eosinophils, neutrophils and foam cells) that infiltrate adipose tissue and have potential roles in insulin resistance (Olefsky and Glass, 2010; Dalmas et al., 2011; Wu et al., 2011; Shapiro et al., 2013) (**Figure 1**).

The role of adaptive immune cells in obesity-induced adipose tissue inflammation has been less characterized than that of innate immune cells. Based on studies in mouse models, lymphocyte infiltration in adipose tissue might occur in a chronological sequence. B and T lymphocytes are recruited during early obesity-induced inflammation by preadipocytes or chemotactic adipokines like CCL5, CXCL5, CXCL12, or CCL20. Furthermore, the cytokines derived from Th lymphocytes reportedly modulate macrophage phenotype switching, which is directly linked to insulin resistance (Sell et al., 2012).

To explain the chronological order of how immune cells infiltrate adipose tissue in obesity, it has been proposed that T cells may stimulate preadipocytes to induce the recruitment of macrophages via chemotactic factors such as MCP-1, shedding new light on the importance of chemotaxis in this scenario (Kintscher et al., 2008).

### **INNATE IMMUNE SYSTEM IN ADIPOSE TISSUE**

Macrophages and monocytes are representative of the innate immune system and account for a large proportion of the stromovascular cell fraction in adipose tissue. Several cell types of the innate immune system are involved in the development of adipose tissue inflammation, and the most studied cell type among these is the ATM (Kalupahana et al., 2012). Neutrophils and mast cells, also members of the innate immune system, have also been implicated in promoting inflammation and insulin resistance during obesity, whereas eosinophils and myeloid-derived suppressor cells have been suggested to play a protective role (Wu and Van Kaer, 2013).

### **MONOCYTES AND MACROPHAGES IN ADIPOSE TISSUE**

The majority of macrophages found in the adipose tissue of diet-induced obese mice originate from blood monocytes (Weisberg et al., 2003; Dalmas et al., 2011). Monocytes are a heterogeneous cell population whose members differ in their migration and cell fate properties (Saha and Geissmann, 2011). The phenotype of macrophages depends on the subset of monocytes upon arrival at target tissues, being probably determined by the local microenvironment (Dalmas et al., 2011). The number of resident macrophages present in adipose tissue was found to correlate positively with obesity in various mouse models and in human adipose tissue (Weisberg et al., 2003; Xu et al., 2003). Thus, it is possible to speculate that macrophages might be involved in the growth of the fat mass in a similar manner to that described in tumors (Curat et al., 2004).

Based on their cytokine secretion profile and cell surface markers, ATMs are classified into two main types: the "classical" macrophages named M1 in contrast to the "alternatively activated" M2. M1 macrophages are the first line of defense against intracellular pathogens with high microbicidal activity and are classically stimulated by interferon (IFN)-γ or by lipopolysaccharide (LPS). M1 macrophages induce the secretion of inflammatory cytokines (IL-1, IL-6, TNF-α, MCP-1) and reactive oxygen species, as well as nitric oxide (NO) through the stimulation of inducible NO synthase (iNOS) (Lumeng et al., 2008). Alternative activation, resulting from induction by the Th2 cytokines IL-4 and IL-13 (Gordon, 2003), is associated with tissue repair and humoral immunity, producing immunosuppressive factors such as IL-10, IL-1Ra, and arginase (Gordon and Taylor, 2005). Obesity induces a phenotypic switch from an anti-inflammatory M2 polarized state to a pro-inflammatory M1 state (Lumeng et al., 2007). The importance of the M1/M2 ratio has been reported in macrophage-specific *Pparg-*deficient mice that show impaired alternative macrophage activation, increased development of obesity and adipose tissue inflammation as well as glucose intolerance (Odegaard et al., 2007). The identification of the signaling pathways that control macrophage polarization in expanding adipose tissue remains a challenging issue. In this sense, it has been described that the local hypoxia in expanding adipose tissue may promote the M2 to M1 switching (Ye and McGuinness, 2013). Moreover, a recent study in *Trib1*-deficient mice has shown a severe reduction of M2-like macrophages in adipose tissue, highlighting the contribution of Trib1 to adipose tissue homeostasis by controlling the differentiation of tissue-resident M2-like macrophages (Satoh et al., 2013).

### **INVOLVEMENT OF NEUTROPHILS, EOSINOPHILS, AND MAST CELLS IN OBESITY**

It has been proposed that a transient "acute inflammatory infiltrate" precedes the "chronic inflammatory infiltrate" in obesity and that neutrophils play a key role (Wagner and Roth, 2000) by producing chemokines and cytokines, thereby facilitating macrophage infiltration (Talukdar et al., 2012). In this line, adipose tissue neutrophils could have a role in initiating the inflammatory cascade in response to obesity, based on the fact that mice fed a high-fat diet show an increase in neutrophil recruitment to adipose tissue peaking at 3–7 days and subsiding thereafter (Elgazar-Carmon et al., 2008). The treatment of hepatocytes with neutrophil elastase causes cellular insulin resistance, while deletion of neutrophil elastase in obese mice leads to reduced inflammation (Talukdar et al., 2012).

Although eosinophils are associated with allergic diseases and helminthic infections (Rothenberg and Hogan, 2006), the biologic role of these cells in adipose tissue remains incompletely defined (Maizels and Allen, 2011). It has been shown that eosinophils are the main source of IL-4 and IL-13 in white adipose tissues of mice, and, in their absence, M2 macrophages are greatly attenuated (Wu et al., 2011). Moreover, in the absence of eosinophils, mice fed a high-fat diet develop increased body fat and insulin resistance (Wu et al., 2011). The promotion of eosinophil responses can protect against metabolic syndrome (Wu et al., 2011).

Mast cells, like macrophages, are inflammatory cells, but the exact mechanisms of mast cells in the pathogenesis of obesity are not fully understood. In this regard, increased mast cells in adipose tissue from obese subjects compared with those of lean subjects have been reported. Obese subjects also had significantly higher tryptase concentrations in their serum than lean individuals. Mast cells may contribute to inflammation through the secretion of IL-6 and IFN-γ (Stienstra et al., 2011). Moreover, mast cell number is related to fibrosis, macrophage inflammation and endothelial activation of adipose tissue in human obesity (Divoux et al., 2012). These observations suggest a possible association between mast cells and obesity-associated inflammation (Liu et al., 2009; Zhang and Shi, 2012).

### **ADAPTIVE IMMUNE SYSTEM IN ADIPOSE TISSUE**

Recent advances in the field of adipose tissue biology reveal a prominent role of different types of lymphocytes (T-lymphocytes, B-lymphocytes, and natural-killer cells) in adipose tissue inflammation depending on the obese state in parallel to macrophages (Sell and Eckel, 2010).

### **T-LYMPHOCYTES IN ADIPOSE TISSUE**

CD4+ T cells along with CD8+ T cells constitute the majority of T-lymphocytes. Experimental data suggest that T-lymphocytes might play a role in the development of insulin resistance during obesity. In this sense, T-lymphocytes have been described in visceral and subcutaneous adipose tissue of obese mice and humans (Bornstein et al., 2000), but the role of the different subtypes of lymphocytes, CD4+ and CD8+ cells, in adipose tissue inflammation remains largely unexplored. The increase in the number of T cells in adipose tissue from diet-induced obese mice is gender-dependent, with higher numbers of T cells in obese males than in females or lean males (Wu et al., 2007). Based on studies in mouse models, lymphocyte infiltration in adipose tissue might occur in a chronological sequence, with T lymphocytes being recruited during early obesity-induced inflammation by chemokines like RANTES, a T-cell specific chemokine also known as CCL5 (Sell et al., 2012). In this regard, the expression of RANTES and its respective receptor CCR5 in visceral adipose tissue of morbidly obese patients has been described (Wu et al., 2007).

CD4+ T cells are crucial in achieving a regulated effective immune response to pathogens. In adipose tissue, CD4+ T cells are mainly classified into the classical T-helper 1 (Th1) and T-helper 2 (Th2) subsets, although new subsets have been identified including T-helper 17 (Th17), induced T-regulatory cells (iTreg), and the regulatory type 1 cells (Tr1), among others (Luckheeram et al., 2012). The roles for CD4+ T lymphocytes in adipose tissue are related to the regulation of body weight, adipocyte hypertrophy, insulin resistance, and glucose tolerance. Thus, CD4+ cells are key in the control of disease progression in diet-induced obesity (Winer et al., 2009). Th1 cells show a pro-inflammatory profile, secreting IFN-γ, which elicits the production of macrophage mediators, induces leukocyte adhesion molecules and chemokines, and increases the antigen-presenting capacity of macrophages and endothelial cells (Geng and Hansson, 1992; Tellides et al., 2000). Interestingly, T cells extracted from fat tissue of obese mice and stimulated *in vitro* produced higher amounts of IFN-γ than those extracted from lean animals. This finding suggests that obesity primes T cells from adipose tissue toward a Th1 switch (Rocha et al., 2008). Winer et al. (2009) reported that the increase of CD4+ T cells with obesity in mice is largely due to the accumulation of IFN-γ produced by Th1 cells. The elevated levels of IFN-γ also contribute to the classical activation of adipose tissue macrophages, resulting in increased inflammation in adipose tissue. On the other hand, Th2 cells are anti-inflammatory and are a source of IL-4 and IL-13. In this regard, T cells may orchestrate an inflammatory cascade, depending on the set of cytokines they predominantly produce (Hansson and Libby, 2006). A dramatic increase in the number of Th1 cells has been described in diet-induced obesity states, whereas the number of Th2 cells remained unchanged (Sell and Eckel, 2010).

T regulatory (Treg) cells are a small subset of T lymphocytes constituting normally 5–20% of the CD4+ compartment. Tregs are critical in the defense against inappropriate immune responses such as inflammation and tumorigenesis (Sakaguchi et al., 2008) because they control the behavior of other T cell populations and influence the activities of the innate immune system cells (Maloy et al., 2003). Treg cells regulate the activities of macrophages and adipocytes probably secreting IL-10, given their association with improved insulin sensitivity in both rodents and humans (Scarpelli et al., 2006). It has been recently described that the accumulation of Tregs in visceral adipose tissue is mediated by the nuclear receptor peroxisome proliferator-activated receptor (PPAR)-γ (Cipolletta et al., 2012). PPAR-γ tended to impose the transcriptional characteristics of visceral adipose tissue Tregs on naïve CD4+ T cells (Cipolletta et al., 2012). Tregs may be regulated by local hypoxia, increased adipocyte death and adipocyte stress (Feuerer et al., 2009). The diminished Treg cells in obesity could promote the infiltration of macrophages in adipose tissue and, thereby, increase the production of inflammatory cytokines.

CD8+ T cells are involved in the initiation and propagation of inflammatory cascades in obese adipose tissue (Nishimura et al., 2009). CD8+ cells are required for adipose tissue inflammation and have major roles in macrophage differentiation, activation and migration (Nishimura et al., 2009). One study reported mainly CD8+ lymphocyte infiltration in hypoxic areas of epididymal adipose tissue in mice fed a high-fat diet, whereas the numbers of CD4+ and regulatory T cells were reduced (Rausch et al., 2008). The infiltration by CD8+ T cells precedes the recruitment of macrophages. Indeed, immunological and genetic depletion of CD8+ T cells lowered macrophage infiltration and adipose tissue inflammation and ameliorated systemic insulin resistance (Rausch et al., 2008). Another study also demonstrated an early T lymphocyte infiltration during the development of insulin resistance in a mouse model of high-fat diet-induced obesity, as well as a correlation of T cells with waist circumference in diabetic patients (Kintscher et al., 2008), highlighting the association of insulin resistance with adipose tissue lymphocyte infiltration. In contrast to the findings described above, most of these cells were CD4+, with only a few CD8+ cells.

Recent studies have focused on another regulatory T cell subset, natural killer T (NKT) cells, in the development of obesity-associated inflammation and comorbidities (Lukens and Kanneganti, 2012; Lynch et al., 2012). NKT cells are abundant in metabolically active organs such as liver and adipose tissue (Emoto and Kaufmann, 2003; Lynch et al., 2009) and show the capacity to produce a variety of both pro- and anti-inflammatory cytokines (Wu and Van Kaer, 2013). NKT cells exert their effects in the development of inflammation and metabolic diseases in response to nutritional lipid excess (Wu and Van Kaer, 2013).

### **B-LYMPHOCYTE ACCUMULATION IN DYSFUNCTIONAL ADIPOSE TISSUE**

A fundamental pathogenic role for B cells in the development of metabolic abnormalities has been described (Winer et al., 2011; DeFuria et al., 2013). In mice, B-lymphocytes accumulate in adipose tissue before T cells, shortly after the initiation of a high-fat diet (Duffaut et al., 2009). The early recruitment of B cells promotes T cell activation and pro-inflammatory cytokine production, which potentiates M1 macrophage polarization and insulin resistance (Winer et al., 2011).

Moreover, an impaired function of toll-like receptors has been described in B cells from patients with T2D, which increases inflammation through elevated production of pro-inflammatory IL-8 and a lack of anti-inflammatory/protective IL-10 production (Jagannathan et al., 2010).

### **ADIPOKINE DYSREGULATION AND CANCER**

A growing body of evidence suggests that the inflammatory milieu of the obese state is linked to the development of cancer through different mechanisms (Grivennikov et al., 2010). Infiltrating immune cells in adipose tissue regulate the local immune response, inducing increased levels of pro-inflammatory cytokines and adipokines and providing a major link to obesity-associated tumor development (van Kruijsdijk et al., 2009). Critical molecules involved in the promotion of tumor cell proliferation include inflammatory transcription factors [such as NF-κB and signal transducer and activator of transcription 3 (STAT3)], adipokines (leptin and adiponectin) as well as inflammatory cytokines and enzymes (TNF-α, IL-6, MCP-1, SAA) and matrix metalloproteases (Gómez-Ambrosi et al., 2006; Aggarwal, 2009). Among all these molecules, the transcription factor NF-κB is perhaps the central mediator of inflammation (Aggarwal, 2004).

Leptin, the product of the *ob* gene, is an adipocyte-derived hormone that is a central mediator in regulating body weight by signaling the size of the adipose tissue mass (Zhang et al., 1994). Leptin levels are closely correlated with adiposity in obese rodents and humans (Maffei et al., 1995; Frühbeck et al., 1998, 2001b; Muruzábal et al., 2002). Subsequent studies have suggested that this hormone may be linked to the increased incidence of cancer in obesity (Khandekar et al., 2011). Leptin has attracted attention due to its potential function as an antiapoptotic, mitogenic, proangiogenic, and prometastatic agent, as observed in numerous *in vitro* studies (Frühbeck, 2006a,b; Park et al., 2011). Circulating levels of leptin have been investigated to determine their correlation with cancer and progressive disease. A strong association between leptin levels and colorectal and endometrial cancer has been reported (Petridou et al., 2002; Koda et al., 2007a). However, the findings of clinical studies of the relationship between leptin and breast cancer are inconsistent (van Kruijsdijk et al., 2009). Interestingly, many colorectal, breast, and endometrial cancers overexpress the leptin receptor OB-R (Koda et al., 2007a,b). Leptin produced by adjacent adipose tissue might promote the growth of colorectal cancer by enhancing the proliferation of colon cancer cells, although other factors released by adipocytes are also likely to be involved in the process. This suggests that the presence of tumor-associated adipose tissue represents an important microenvironmental influence (Amemori et al., 2007; Vansaun, 2013).

It has now been extensively documented that adiponectin expression is inversely correlated with obesity (Scherer et al., 1995; Hu et al., 1996). Adiponectin may influence cancer risk through its well-recognized effects on insulin resistance, but it is also plausible that adiponectin acts on tumor cells directly (Yamauchi et al., 2001; Barb et al., 2007). Interestingly, several cancer cell types express the adiponectin receptors AdipoR1 and AdipoR2 that may mediate the inhibitory effects of adiponectin on cellular proliferation (Kim et al., 2010). Epidemiologic studies show that adiponectin levels are inversely associated with the risk for the development of multiple cancers as well as with advanced progression of disease (Wei et al., 2005; Barb et al., 2007; Bao et al., 2013). In a prospective analysis, adiponectin levels were inversely associated with endometrial (Dal Maso et al., 2004) and breast cancer risk in postmenopausal women (Tworoger et al., 2007). Adiponectin also inhibits prostate and colon cancer cell growth (Bub et al., 2006). In a mouse tumor model, adiponectin markedly induced a cascade activation of caspase-8, -9, and -3, which leads to cell death, inhibiting primary tumor growth (Brakenhielm et al., 2004b).

TNF-α, a cytokine originally identified as mediating endotoxin-induced tumor necrosis (Carswell et al., 1975), has been shown to be involved in the development of a number of cancers through the promotion of vessel growth (angiogenesis) and tumor destruction by direct cytotoxicity (Leibovich et al., 1987) as well as the metastatic potential of circulating tumor cells (Orosz et al., 1993). However, although TNF-α is the most potent activator of NF-κB, elevated levels of TNF-α in tissue or serum are not very common in cancer patients (Aggarwal and Gehlot, 2009). The increased circulating levels of TNF-α in both obese rodents and obese humans suggest a possible link between obesity and tumorigenesis (Khandekar et al., 2011). In this regard, obesity-promoted hepatocellular carcinoma development was dependent on increased production of the cytokines TNF-α and IL-6, which cause hepatic inflammation and activation of the oncogenic transcription factor STAT3 (Park et al., 2010). Diet-induced obesity produces an elevation in colonic TNF-α, giving rise to a number of alterations including the dysregulation of the Wnt signaling pathway, which has an important involvement in colorectal cancer (Liu et al., 2012).

Another pro-inflammatory molecule produced in adipose tissue is IL-6. The circulating levels of IL-6 are higher in subjects with obesity-related insulin resistance (Kern et al., 2001). IL-6 is a pleiotropic cytokine with a significant role in growth and differentiation (Ghosh and Ashcraft, 2013) that signals to the nucleus through STAT3, an oncoprotein that is activated in many human cancers and transformed cell lines (Bromberg et al., 1999). Interestingly, STAT3 is activated by leptin (Vaisse et al., 1996) and may have a role in the pro-tumorigenic effects of this adipokine. Moreover, different studies indicate that serum IL-6 levels are a negative indicator of the development of breast cancer in overweight or obese patients with prominent insulin resistance (Gonullu et al., 2005; Knupfer and Preiss, 2007).

MCP-1, also known as CCL2, is a member of the CC chemokine superfamily (Panee, 2012) that plays a crucial role in the recruitment and activation of monocytes during acute inflammation and angiogenesis (Charo and Taubman, 2004). Circulating levels of MCP-1 are generally increased in obese patients compared to lean controls (Catalán et al., 2007). Gene expression levels in adipose tissue follow the same trend, being higher in the visceral and subcutaneous adipose tissue of obese patients compared to lean volunteers (Huber et al., 2008). There is emerging evidence that MCP-1 induces tumor cell proliferation via activation of the phosphatidylinositol 3-kinase/protein kinase B (PI3K/Akt) pathway in various cancer types (Loberg et al., 2006). Moreover, MCP-1 promotes tumorigenesis indirectly via its effects on macrophage infiltration (Walter et al., 1991). It has been described that MCP-1 is highly expressed by breast tumor cells and has causative roles in breast malignancy and metastasis (Soria and Ben-Baruch, 2008). The pleiotropic roles of MCP-1 in the development of cancer are mediated through its receptor, CCR2 (Lu et al., 2007).

Novel adipokines involved in obesity-associated inflammation have emerged as important players in tumor growth (Catalán et al., 2011d). OPN is a secreted glycoprotein expressed by different cellular types (Brown et al., 1992). Recently, several studies have highlighted the expression of OPN in adipose tissue of both humans and mice and its involvement in obesity and obesity-associated T2D, promoting inflammation and the accumulation of macrophages in adipose tissue (Gómez-Ambrosi et al., 2007; Nomiyama et al., 2007). High OPN expression in the primary tumor is associated with early metastasis and poor outcome in human breast and other cancers (Denhardt et al., 2001). LCN-2, also known as neutrophil gelatinase-associated lipocalin, is a component of the innate immune system with a key role in the acute-phase response to infection (Flo et al., 2004). Increased levels of LCN-2 in visceral adipose tissue in human obesity and a relationship with pro-inflammatory markers have also been described (Catalán et al., 2009, 2013). In addition to inhibiting invasion and metastasis, LCN-2 also appears to be a negative regulator of angiogenesis in cancer cells (Chakraborty et al., 2012). Tenascin-C (TNC) is an extracellular matrix glycoprotein specifically induced during acute inflammation and persistently expressed in chronic inflammation (Chiquet-Ehrismann and Chiquet, 2003; Udalova et al., 2011). Increased expression of TNC has been described in most solid cancers, playing important roles in enhancing proliferation, invasion and angiogenesis during tumorigenesis and metastasis (Midwood and Orend, 2009; Midwood et al., 2011). In this line, elevated expression levels of TNC have been found in visceral adipose tissue in obesity, tightly associated with genes involved in maintaining the chronic inflammatory response associated with obesity (Catalán et al., 2011c). YKL-40 is another adipokine involved in inflammation and cancer cell proliferation.
YKL-40 is a growth factor with elevated gene and protein expression levels in visceral adipose tissue in human obesity-associated T2D (Catalán et al., 2011b). Moreover, circulating levels of this cytokine are described as an obesity-independent marker of T2D (Nielsen et al., 2008). On the other hand, elevated levels of YKL-40 were found in patients with different types of solid tumors, including several types of adenocarcinomas, small cell lung carcinoma, glioblastoma, and melanoma (Johansen et al., 2006). Calprotectin is a member of the S100 protein family released by activated phagocytes and recognized by TLR4 on monocytes (Vogl et al., 2007). Calprotectin is not only involved in differentiation and cell migration but has also been identified as an important regulator of inflammation in cancer development and tumor spreading (Hiratsuka et al., 2008; Ehrchen et al., 2009). The increased levels of calprotectin in obesity and obesity-associated T2D have been shown to decrease after weight loss achieved by RYGB (Catalán et al., 2011a).

### **REFERENCES**

organ. *Trends Endocrinol. Metab.* 11, 327–332. doi: 10.1016/S1043-2760(00)00301-5

### **CONCLUSIONS**

The prevalence of obesity has risen steadily for the past several decades. Excess adiposity is associated with increased death rates for all cancers combined and for cancers at multiple specific sites, with the strongest evidence for endometrial cancer, postmenopausal breast cancer, colon cancer, renal cell carcinoma, and liver, gallbladder, esophageal, and pancreatic cancer. The mechanisms linking obesity and cancer remain unclear, but low-grade chronic inflammation, dysregulation of growth signaling pathways, chronic hyperinsulinemia, and hypoxia associated with obesity are widely accepted as important factors in cancer pathogenesis. Particular attention is placed on the pro-inflammatory environment associated with the obese state, specifically highlighting the involvement of immune cells that infiltrate adipose tissue. In this sense, understanding the regulatory mechanisms that lead to polarization of macrophages or lymphocytes in adipose tissue toward a pro-inflammatory phenotype will provide new ways to control adipose tissue inflammation (**Figure 2**). A better understanding of the mechanistic links between obesity and cancer will help to identify intervention targets and strategies to avoid the pro-tumorigenic effects of obesity.

### **ACKNOWLEDGMENTS**

This work was supported by Fondo de Investigación Sanitaria (FIS) PI11/02681, PI12/00515 from the Spanish Instituto de Salud Carlos III and by the Department of Health (48/2011 and 58/2011) of the Gobierno de Navarra of Spain.

Adiponectin in relation to malignancies: a review of existing basic research and clinical evidence. *Am. J. Clin. Nutr.* 86, S858–S866.


adipose tissue and differentiating human adipose cells in primary culture. *Diabetes* 49, 532–538. doi: 10.2337/diabetes.49.4.532


obesity: impact of type 2 diabetes mellitus and gastric bypass. *Obes. Surg.* 17, 1464–1474. doi: 10.1007/s11695-008-9424-z


Gil, M. J., et al. (2008). Expression of caveolin-1 in human adipose tissue is upregulated in obesity and obesity-associated type 2 diabetes mellitus and related to inflammation. *Clin. Endocrinol.* 68, 213–219.


*Nature* 438, 967–974. doi: 10.1038/nature04483


and humans. *J. Nutr. Biochem.* 21, 774–780. doi: 10.1016/j.jnutbio.2009.05.004


pre-metastatic phase. *Nat. Cell Biol.* 10, 1349–1355. doi: 10.1038/ncb1794


T cells contribute to macrophage recruitment and adipose tissue inflammation in obesity. *Nat. Med.* 15, 914–920. doi: 10.1038/nm.1964


*Endocrinology* 144, 3765–3773. doi: 10.1210/en.2003-0580


Friedman, J. M. (1996). Leptin activation of Stat3 in the hypothalamus of wild-type and *ob/ob* mice but not *db/db* mice. *Nat. Genet.* 14, 95–97. doi: 10.1038/ng0996-95


*Cancer* 49, 431–435. doi: 10.1002/ijc.2910490321


*Biochim. Biophys. Acta* 1822, 14–20. doi: 10.1016/j.bbadis.2010.12.012


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 11 June 2013; paper pending published: 17 July 2013; accepted: 12 September 2013; published online: 02 October 2013.*

*Citation: Catalán V, Gómez-Ambrosi J, Rodríguez A and Frühbeck G (2013) Adipose tissue immunity and cancer. Front. Physiol. 4:275. doi: 10.3389/fphys.2013.00275*

*This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology.*

*Copyright © 2013 Catalán, Gómez-Ambrosi, Rodríguez and Frühbeck. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Similar structures but different roles – an updated perspective on TLR structures

### *Balachandran Manavalan, Shaherin Basith and Sangdun Choi\**

*Department of Molecular Science and Technology, Ajou University, Suwon, South Korea*

#### *Edited by:*

*Masa Tsuchiya, Keio University, Japan*

#### *Reviewed by:*

*Vladimir N. Uversky, University of South Florida, USA Tiandi Wei, Shandong University, China*

#### *\*Correspondence:*

*Sangdun Choi, Department of Molecular Science and Technology, Ajou University, Suwon 443-749, South Korea. e-mail: sangdunchoi@ajou.ac.kr*

Toll-like receptors (TLRs) are pattern recognition receptors that recognize conserved structures in pathogens, trigger innate immune responses, and prime antigen-specific adaptive immunity. Elucidation of crystal structures of TLRs interacting with their ligands, such as TLR1-2 with triacylated lipopeptide, TLR2-6 with diacylated lipopeptide, TLR4–MD-2 with LPS, and TLR3 with double-stranded RNA (dsRNA), has enabled an understanding of the initiation of TLR signaling. Agonistic ligands such as LPS, dsRNA, and lipopeptides induce "m"-shaped TLR dimers in which the C-termini converge at the center. Such central convergence is necessary to bring the two intracellular receptor TIR domains closer together and promote their dimerization, which serves as an essential step in downstream signaling. In this review, we summarize the TLR ECD structures that have been reported to date, with special emphasis on ligand recognition and activation mechanism.

**Keywords: innate immunity, ligand, myeloid differentiation factor 88, Toll-like receptor**

### **INTRODUCTION**

The Toll-like receptor (TLR) protein family plays an important role in the innate immune system by recognizing common structural patterns in diverse microbial molecules (Gay and Gangloff, 2007). TLRs are type I transmembrane glycoproteins characterized by the presence of an extracellular domain (ectodomain; ECD) containing leucine-rich repeats (LRRs), which is primarily responsible for mediating ligand recognition, followed by a single transmembrane helix and an intracellular Toll/interleukin-1 (IL-1) receptor (TIR) domain that is responsible for downstream signaling. To date, 10 and 12 functional TLRs have been identified in humans and mice, respectively. TLR1–9 are conserved in both species; however, mouse TLR10 is not functional because of a retrovirus insertion, and TLR11–13 have been lost from the human genome (Kawai and Akira, 2010). "Toll" was first identified as a protein important in the early stages of development in *Drosophila*. Later, it was discovered that Toll signals to Dorsal (like mammalian NF-κB) and is involved in the coordination of antifungal and antibacterial responses (Rosetto et al., 1995; Lemaitre et al., 1996).

The TLR family can be largely divided into two subgroups, extracellular and intracellular, depending on their cellular localization. TLR1, 2, 4, 5, 6, and 10 are largely localized on the cell surface to recognize pathogen-associated molecular patterns (PAMPs). Conversely, TLR3, 7, 8, and 9 are localized in intracellular organelles such as endosomal/lysosomal compartments and the endoplasmic reticulum (ER). Among the TLRs, the ligand (lipopolysaccharide; LPS) of TLR4 was first identified by genetic studies (Lemaitre et al., 1996). Lipopeptides or lipoproteins are recognized by TLR2 in complex with TLR1 or 6, while viral double-stranded RNA (dsRNA) is recognized by TLR3, flagellin is recognized by TLR5, single-stranded RNA is recognized by TLR7 and 8, and host- or pathogen-derived DNA is recognized by TLR9. In addition to known pathogen- or microbe-derived ligands, TLRs also recognize the endogenous ligands (produced by stressed or damaged cells) and synthetic ligands listed in **Table 1**.
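The localization and ligand pairings described above can be collected into a small lookup table. The following Python sketch is purely illustrative: the names `TLRInfo`, `TLR_LIGANDS`, and `tlrs_by_localization` are hypothetical, and the entries restate only what this section says (no ligand is named here for TLR10, so that field is left empty).

```python
from typing import List, NamedTuple, Optional


class TLRInfo(NamedTuple):
    localization: str        # "cell surface" or "intracellular"
    ligand: Optional[str]    # None when the text above names no ligand


# Hypothetical table; entries restate the pairings given in the text only.
TLR_LIGANDS = {
    "TLR1": TLRInfo("cell surface", "lipopeptides/lipoproteins (with TLR2)"),
    "TLR2": TLRInfo("cell surface", "lipopeptides/lipoproteins (with TLR1 or TLR6)"),
    "TLR3": TLRInfo("intracellular", "viral double-stranded RNA"),
    "TLR4": TLRInfo("cell surface", "lipopolysaccharide (LPS)"),
    "TLR5": TLRInfo("cell surface", "flagellin"),
    "TLR6": TLRInfo("cell surface", "lipopeptides/lipoproteins (with TLR2)"),
    "TLR7": TLRInfo("intracellular", "single-stranded RNA"),
    "TLR8": TLRInfo("intracellular", "single-stranded RNA"),
    "TLR9": TLRInfo("intracellular", "host- or pathogen-derived DNA"),
    "TLR10": TLRInfo("cell surface", None),
}


def tlrs_by_localization(where: str) -> List[str]:
    """Return receptor names for one localization, ordered by TLR number."""
    names = [name for name, info in TLR_LIGANDS.items() if info.localization == where]
    return sorted(names, key=lambda name: int(name[3:]))
```

Calling `tlrs_by_localization("intracellular")` returns the four nucleic acid sensing receptors, which matches the grouping given in the paragraph above.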

The common mechanism of TLR signaling is that interaction of an agonist with the ECD either induces the formation of a receptor dimer or changes the conformation of a pre-existing dimer (Latz et al., 2007; Zhu et al., 2009) in such a way that the two intracellular TIR domains of the TLRs are brought into physical contact. This simple rearrangement serves as a nucleating act for the recruitment of downstream signaling adaptor proteins (Jin and Lee, 2008). Signaling cascades via the intracellular TIR domains are mediated by specific adaptor molecules such as MyD88 (myeloid differentiation factor 88), Mal (MyD88 adaptor-like), TRIF (TIR domain-containing adaptor inducing interferon-β), and TRAM (TRIF-related adaptor molecule). These adaptor proteins also contain TIR domains, which mediate the receptor–receptor, receptor–adaptor, and adaptor–adaptor TIR–TIR interactions that are critical for signaling (Palsson-Mcdermott and O'Neill, 2007). In general, the TIR domains of adaptor proteins are composed of approximately 160 amino acid residues, and their primary sequences are characterized by three conserved sequence boxes designated Box 1, 2, and 3. Box 1 is considered to be the signature sequence of the family, whereas Boxes 2 and 3 contain functionally important residues involved in signaling (Carpenter and O'Neill, 2009). These processes result in the formation of a large multimeric complex, or "signaling platform," that propagates downstream signaling, eventually leading to changes in the expression of several hundred primary immune response genes. However, the architecture of the TLR signaling complexes is poorly understood at this time, owing to a lack of reliable methods to study such interactions as well as the inherent weakness of individual inter- and intra-protein interactions in transitory complexes.

#### **Table 1 | Toll-like receptors and their principal ligands.**




Structural studies of TLR–ligand complexes have been an attractive area of research that has enabled a better understanding of the structure-based activation of innate immunity. Such information is essential for the development of adjuvants that specifically bind to TLR ECD and activate its signaling and also in the development of anti-inflammatory drugs that block TLR-mediated signaling. To date, five TLR–ligand structures (TLR1–TLR2–Pam3CSK4, TLR2–TLR6–Pam2CSK4, TLR4–MD-2–Eritoran, TLR4–MD-2–LPS, and TLR3–dsRNA) have been determined (Jin et al., 2007; Kim et al., 2007b; Liu et al., 2008; Kang et al., 2009; Park et al., 2009). Currently, these solved atomic models can be used as templates to predict the structures of other unknown TLRs. In this review article, we discuss how similar structures of TLR ECD LRRs have evolved to bind a wide array of different ligands and their activation mechanism.

### **GENERAL STRUCTURE OF TLR ECDs**

The ECD of TLR members contains multiple blocks of LRRs, which are protected by cysteine-rich regions that form cap-like structures at the LRR N- and C-terminal ends. The C-terminal capping structure of TLRs is connected to the cytoplasmic TIR domain via a single transmembrane α helix. Each LRR module (approximately 20–30 amino acid residues long) consists of a conserved "LxxLxLxxNxL" motif and a variable region (**Figure 1A**). The conserved leucine residues in these motifs can be substituted by other hydrophobic amino acids (Matsushima et al., 2007). The asparagine residues that are also present in the motif form continuous H-bonds with the backbone carbonyl groups of neighboring strands throughout the entire protein, resulting in an asparagine ladder. These conserved asparagine residues are important in maintaining the overall shape of the ECD, although they can be replaced by other residues that are able to form H-bonds, such as cysteine, threonine, or serine (Kajava et al., 1995; Kobe and Deisenhofer, 1995; Bell et al., 2003). The variable "*x*" residues present in the motif are exposed to the solvent. Among them, only a few residues are involved in ligand recognition. The "LxxLxLxxNxL" motifs located in the inner concave surface of the horseshoe-like structure form parallel β-strands, whereas the variable region forms a convex surface generated by α helices, β-turns, and loop structures (**Figure 1A**). LRRs are present in a very large and diverse group of proteins and have been found to be involved in a wide variety of physiological functions including immune responses, signal transduction, cell cycle regulation, enzyme regulation, and transcriptional regulation (Buchanan and Gay, 1996; Dolan et al., 2007).
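As an illustration of how the "LxxLxLxxNxL" motif and its permitted substitutions might be located in a sequence, the following Python sketch scans a protein string with a regular expression. It is a minimal sketch under stated assumptions, not a validated LRR predictor: the function names are hypothetical, the set of hydrophobic residues accepted in place of leucine (`LIVMF`) is an assumption chosen for illustration, and the toy sequences are invented rather than taken from any TLR.

```python
import re

# Assumption: residues accepted at the conserved leucine positions.
HYDROPHOBIC = "LIVMF"
# Asparagine plus the H-bonding replacements named in the text (C, T, S).
ASN_LIKE = "NCTS"


def lrr_motif_pattern(allow_substitutions: bool = True) -> "re.Pattern[str]":
    """Build a regex for LxxLxLxxNxL; 'x' positions match any residue."""
    h = f"[{HYDROPHOBIC}]" if allow_substitutions else "L"
    n = f"[{ASN_LIKE}]" if allow_substitutions else "N"
    core = f"{h}..{h}.{h}..{n}.{h}"
    # The lookahead lets overlapping candidate motifs be reported as well.
    return re.compile(f"(?=({core}))")


def find_lrr_motifs(sequence: str, allow_substitutions: bool = True):
    """Return (start_index, matched_segment) pairs for each motif hit."""
    pattern = lrr_motif_pattern(allow_substitutions)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Invented toy sequences, used only to exercise the pattern.
strict_hit = find_lrr_motifs("GGGLKELDLSNNKLGGG", allow_substitutions=False)
relaxed_hit = find_lrr_motifs("GGGIKEVDFSNCKMGGG")
```

A real analysis would also need to handle the variable region that follows each motif and the irregular modules of the central domain, which a fixed-length pattern like this one does not capture.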

The crystallization of some LRR proteins, including TLRs, has proven to be very difficult. This problem was overcome by the introduction of a new method known as the "hybrid LRR technique" (Jin et al., 2007; Kim et al., 2007a,b; Kang et al., 2009; Park et al., 2009). Hagfish variable lymphocyte receptors (VLRs) were chosen as fusion partners, and the TLR and VLR were fused at their conserved LxxLxLxxNxL motifs. Interestingly, studies of the TLR–VLR hybrids demonstrated that fusion did not alter the structure and function of the proteins. Some hybrids fail to form soluble proteins because of atomic collisions or an exposed hydrophobic core at the fusion sites. However, hybrids that produced soluble proteins formed stable heterodimers and were able to bind the ligands used for the crystallographic studies (Jin et al., 2007; Kim et al., 2007a,b; Kang et al., 2009; Park et al., 2009).

The LRR protein family can be classified into seven subfamilies based on their sequence and structural patterns. TLRs belong to the typical subfamily of the LRR superfamily (Kobe and Kajava, 2001; Matsushima et al., 2007). Each LRR in this subfamily consists of 24 amino acid residues, possesses the conserved motif xLxxLxxLxLxxNxLxxLPxxxFx, and contributes to a unique horseshoe-shaped structure (**Figure 1B**). LRR modules of TLR1, 2, 4, and 6, but not TLR3, have been shown to deviate in conformation and length from those of other typical members (Kim et al., 2007b; Jin and Lee, 2008; Kang et al., 2009; Park et al., 2009). These four TLRs have major structural changes in their central β-sheets; hence, their LRR domains can be divided into N-terminal, central, and C-terminal domains (**Figure 1C**). The central domain of TLR1, 2, 4, 6, and 10 lacks an asparagine ladder, which is primarily responsible for the stabilization of the horseshoe-like structure. Furthermore, this broken asparagine ladder leads to unusual structural distortions. LRR modules of the central domain differ considerably in the number of residues, varying from 20 to 33. In contrast, the LRR modules present in the majority of LRR proteins are of uniform length (Kajava et al., 1995; Kobe and Deisenhofer, 1995; Matsushima et al., 2007). LRR subfamilies with shorter LRR modules encompass loops in the convex surface, and those containing longer LRR modules have bulkier α helices. It should be noted that helices require more space than loops; therefore, subfamilies with α helices have smaller radii than those with loops that generate enough space in the convex region (Jin and Lee, 2008; Kang and Lee, 2011). This anomaly explains the variations in structural conformation of TLRs and the ability of the receptors to bind diverse ligands as well as co-receptors.

### **CRYSTALLOGRAPHIC STRUCTURES OF TLR ECD WITH THEIR LIGANDS**

To date, five crystallographic structures of the TLR ECDs and their ligand complexes have been reported. Of those, four were complexed with agonistic ligands and the remaining one was complexed with a co-receptor and an antagonistic ligand. These structures provide evidence about how pattern recognition receptors (PRRs) recognize patterns present in the ligands. Additionally, these studies suggest that ECD activation mechanisms are also common among all TLR receptor family members.

### **TLR2 COMPLEXES**

Toll-like receptor-2 heterodimerizes with TLR1 or 6 to recognize multiple PAMPs of fungi, Gram-positive pathogens and mycobacteria (Kawai and Akira, 2010). TLR2 recognizes lipopeptides that are anchored to the bacterial membrane by lipid chains covalently attached to the N-terminal cysteine (Hantke and Braun, 1973). Lipopeptides from Gram-negative bacteria have three lipid chains. Two of these are attached through ester bonds to the glycerol, which is in turn connected to the sulfur atom of the N-terminal cysteine. The third lipid chain is connected to the amino terminus via an amide bond. Lipopeptides from Gram-positive bacteria or mycoplasma have only two lipid chains and lack the amide-linked lipid chain (Muhlradt et al., 1997; Shibata et al., 2000). Synthetic lipopeptide analogs (Pam2CSK4, Pam3CSK4) containing a di- or tri-acylated cysteine group mimic the pro-inflammatory properties of the lipoproteins, which confirms that the acylated N-terminal cysteine is the primary motif responsible for stimulating the immune response. Furthermore, TLR2 also recognizes other ligands such as lipoteichoic acid, lipomannan, peptidoglycan, zymosan, and phenol-soluble modulin (Zahringer et al., 2008).

### **TLR1–TLR2–TRIACYLATED LIPOPEPTIDE COMPLEX**

The crystal structure of TLR2 in association with TLR1 and a synthetic triacylated lipopeptide, Pam3CSK4, has been determined (Jin et al., 2007). This was the first crystal structure of a TLR dimer resulting from the binding of an agonist, and it explains ligand-induced dimerization. In this structure, the ECDs of TLR2 and 1 form an "m"-shaped heterodimer, with the two N-termini extending in opposite directions and the C-termini converging in the middle region (**Figure 2A**). Pam3CSK4 consists of three lipid chains: two insert into the hydrophobic pocket of TLR2 and the remaining one inserts into a narrow hydrophobic channel of TLR1 (**Figure 2B**). Apart from the acyl chain binding, the head groups of Pam3CSK4 also interact with TLRs 1 and 2. In particular, the TLRs form H-bonds with the glycerol and peptide backbone and also form hydrophobic interactions with the sulfur atom. The ligand-binding pockets of TLR1 and 2 are located at the junction of the central and C-terminal domains, indicating the importance of structural transition in the formation of ligand-binding pockets. The ligand binding in the convex surface of TLR2/1 was found to be quite unusual because most ligand-binding sites on LRR proteins that have been identified were found to be present on the concave surfaces (Kobe and Deisenhofer, 1995). The ligand-bound complex of TLR1 and 2 is stabilized by non-covalent forces such as H-bonding, hydrophobic interactions and ionic interactions at the interface near the ligand-binding pocket. It is worth noting that the TLR1 P315L polymorphic variation has been reported to interfere with TLR1 signaling (Omueti et al., 2007). In fact, the P315 residue is located at the TLR1/2 dimer interface, highlighting the importance of P315 in TLR1 and 2 heterodimerization. Moreover, species-specific responses to lipoproteins have also been observed (Grabiec et al., 2004). Lipopeptides with shorter lipid chains are more potent activators of mouse than of human TLR2. This phenomenon is mainly due to the structural variations observed in the TLR2 pocket (Jin et al., 2007).

### **TLR2–TLR6–DIACYLATED LIPOPEPTIDE COMPLEX**

The crystal structure of TLR2 in association with TLR6 and a synthetic diacylated lipopeptide, Pam2CSK4, has been determined (Kang et al., 2009). In this structure, the ECDs of TLR2 and 6 form an "m"-shaped heterodimer, with the two N-termini extending in opposite directions and the two C-terminal ends converging in the middle region (**Figure 2C**). The dimeric arrangement of TLR2/6 is similar to that of the TLR2/1 complex. However, TLR1 and 6 contain important structural differences in their ligand-binding sites and dimerization interfaces. In TLR6, the side chains of two phenylalanine residues (F343 and F365) block the lipid-binding pocket, leading to a pocket that is less than half the length of that in TLR1 (**Figure 2D**). This structural feature provides selectivity for diacylated over triacylated lipopeptides, as confirmed by mutation studies in which replacing these phenylalanine residues with the corresponding amino acids of TLR1 rendered TLR6 fully responsive not only to diacylated but also to triacylated lipopeptides. In the TLR2/6 complex, the two ester-bound lipid chains of Pam2CSK4 are inserted into a hydrophobic pocket in TLR2 that is located between the LRR11 and 12 loops. In addition, F319, located in the LRR11 loop of TLR6, forms an H-bond with the peptide bond of the ligand. Such an H-bond network is absent in the TLR2–TLR1–Pam3CSK4 structure. Moreover, TLR2-6 heterodimerization is primarily mediated by surface-exposed residues of the LRR11-14 modules. In the TLR2-1 complex, the amide-bound lipid chain plays an important role in bridging the two TLRs. Although Pam2CSK4 lacks this amide-bound chain, it still induces a dimer, primarily through hydrophobic and hydrophilic interactions between surface-exposed residues of the two TLRs. This area of hydrophobic interaction is 80% larger than in the TLR1/2 complex, suggesting that this surface interaction, together with the H-bond between LRR11 and the ligand, drives the heterodimerization of TLR2 and TLR6.

### **TLR2–LPTA**

During the course of TLR2–TLR6–diacylated lipopeptide complex determination, the structures of TLR2 in complex with two non-peptide ligands, *Streptococcus pneumoniae* lipoteichoic acid (pnLTA) and PE-DTPA (1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine-*N*-diethylenetriaminepentaacetic acid), were also determined (Kang et al., 2009). PE-DTPA is a synthetic derivative of phospholipid in which metal-coordinating DTPA is attached to the ethanolamine head group. In the monomeric TLR2–pnLTA structure, the overall horseshoe-shaped structure of TLR2 and the ligand-binding pocket remain unchanged. When compared with TLR2-6–Pam2CSK4, the sugar head group of LTA in the TLR2–pnLTA complex is displaced upward by ∼5.2 Å and rotated by 110° toward the lateral surface of the ECD. Moreover, the hydrogen donor and acceptor atoms in the sugar head group of pnLTA have a different arrangement from those of the lipopeptides. Hence, it is not possible to form an H-bonding network. Because of this shift, TLR1 or TLR6 cannot approach TLR2 to form heterodimers. In the TLR2–PE-DTPA structure, the acyl chain and head group arrangements are similar to those of TLR2–pnLTA. When compared with TLR2-6 lipopeptide complexes, the head group of PE-DTPA is shifted by ∼4.3 Å. This structural shift primarily occurs because of a lack of proper H-bonding between the ligand head group and the TLRs, as well as repulsion of the hydrophilic oxygen atom of the ligand, whose corresponding position in the lipopeptide contains sulfur that forms a hydrophobic interaction with the TLRs. These ligands (pnLTA and PE-DTPA) have little or no ability to activate TLR2 because of the structural shift in their head groups, which strongly suggests that the ligand/lipopeptide head group plays an important role in TLR2 activation via heterodimerization.
A large proportion of TLR2 ligands are lipopeptides that can bind to the TLR2 hydrophobic pocket, but some TLR2 ligands including peptidoglycan, hyaluronic acid, teichoic acid, and zymosan do not contain this hydrophobic region (**Table 1**). Hence, the interaction of these ligands with TLR2 might use different binding sites. Further crystallographic or modeling studies are required to clarify the exact binding sites of non-lipid ligands and to verify whether these bindings induce the formation of similar heterodimeric structures such as TLR1-2 or TLR2-6.

### **TLR3–dsRNA COMPLEX**

Toll-like receptor-3 has been shown to recognize dsRNA produced during viral replication (Alexopoulou et al., 2001). The first TLR3 structure was determined independently by two different groups (Bell et al., 2005; Choe et al., 2005). Both groups have shown that the LRR region of TLR3 displays a heavily glycosylated horseshoe-shaped solenoid structure. Choe et al. (2005) postulated that dsRNA might bind at the convex surface because this region is a glycan-free face, which enables dsRNA to bind to the positively charged residues of the TLR ECD. However, Bell et al. (2005) suggested that the nucleotide binding site is located in the concave surface. This suggestion arose because, during crystallization, two sulfate molecules from the crystallization medium stably bound to residues in LRRs 12 and 20, and these two LRRs contain large insertions. As the sulfate ions share the same atomic arrangement as phosphate groups, the phosphate groups present in the dsRNA backbone might be able to bind to one or both of the sulfate binding sites. Hence, the two groups predicted different dsRNA binding sites, and it was not clear how TLR3 specifically recognizes dsRNA and initiates signaling. However, the recently solved crystal structure of mTLR3 bound to dsRNA explains how this is accomplished (Liu et al., 2008). The TLR3 ECD exists as a monomer in solution and dimerization occurs only upon ligand binding. In the structure, dsRNA interacts with both the N- and C-terminal sites on the lateral side of the convex surface of the TLR3 ECD (**Figure 3A**). The N-terminal interaction site is composed of the LRRNT and LRR1-3 modules, whereas the C-terminal site is composed of the LRR19-21 modules. The dsRNA in the complex retains a typical A-DNA-like structure, in which the ribose phosphate backbone and the position of the grooves are the major determinants in binding (**Figure 3B**).
The mTLR3–ECD interacts with the sugar phosphate backbones, but not with individual bases, which accounts for the lack of any particular nucleotide specificity in binding (Alexopoulou et al., 2001; Leonard et al., 2008). This feature would prevent viruses from escaping detection by mutation (Botos et al., 2011). Moreover, the structure reveals possible reasons for the inability of TLR3 to recognize dsDNA. The helical structure of dsDNA is the B form, whereas dsRNA is present in the A form. The B-form helical structure would not be structurally compatible with the two terminal binding sites on the TLR3–ECD. Moreover, several H-bonds were observed between the TLR3–ECD and the 2′-OH groups of dsRNA, which are absent in dsDNA.

The TLR3–TLR3 interaction site located near the LRRCT occupies only a small area, indicating that ligand–protein interaction is the major driving force behind TLR3 dimerization. The ligand interaction sites (two TLR3 ECD N-terminal regions) are separated by about 120 Å, which explains why dsRNA of roughly 40–50 base pairs is required for stable binding to TLR3 (Leonard et al., 2008). However, there have also been reports of dsRNA of substantially less than 40 base pairs being able to initiate TLR signaling (Kariko et al., 2004; Kleinman et al., 2008). This raises the possibility that the N-terminal interaction site is not essential for efficient TLR3 signal induction in some experimental conditions. Moreover, mutation studies have identified functional amino acid residues in three different regions (N-terminal, C-terminal, and dimerization region) of the TLR3 ECD. The H539E and N541A mutations in the C-terminal region, the H39A/E and H60A/E mutations in the N-terminal region, and the D648A, T679A, and P680L mutations in the dimerization region lead to a loss of TLR3 activity (Bell et al., 2006; Ranjith-Kumar et al., 2007; Fukuda et al., 2008; Wang et al., 2010). Although hydrophobic interactions play a crucial role in binding of lipopeptides to TLR1/2 and 2/6, the TLR3 interaction with dsRNA mainly involves electrostatic interactions and H-bonds. Despite these differences in ligand interactions, the ligand-induced dimers of TLR3, TLR2-6, and TLR1-2 adopt a similar fold, the "m"-shaped dimer, in which the two C-termini of the TLR ECDs are in proximity, thereby bringing the two TIR domains together on the cytoplasmic side and providing a scaffold for the recruitment of adaptor proteins and subsequent initiation of further downstream signaling.
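The link between the roughly 120 Å separation of the two N-terminal binding sites and the 40–50 base pair length discussed above can be checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: the axial rise per base pair of A-form dsRNA (taken here as 2.6–2.8 Å) is a commonly quoted textbook approximation and an assumption of this example, not a value reported in this review, and the function name is hypothetical.

```python
import math

# Assumed approximate axial rise per base pair for an A-form duplex (Å/bp).
A_FORM_RISE_LOW = 2.6
A_FORM_RISE_HIGH = 2.8

# Separation of the two N-terminal dsRNA-binding sites quoted in the text (Å).
SITE_SEPARATION = 120.0


def base_pairs_to_span(distance: float, rise_per_bp: float) -> int:
    """Smallest whole number of base pairs whose helix length covers `distance`."""
    return math.ceil(distance / rise_per_bp)


shortest = base_pairs_to_span(SITE_SEPARATION, A_FORM_RISE_HIGH)  # larger rise, fewer bp
longest = base_pairs_to_span(SITE_SEPARATION, A_FORM_RISE_LOW)    # smaller rise, more bp
```

Under these assumptions the duplex must be a little over 40 base pairs long to reach both sites, which is consistent with the 40–50 base pair range cited from Leonard et al. (2008).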

### **TLR4–MD-2-AGONIST/ANTAGONIST COMPLEX**

A crystal structure of the human TLR4–MD-2 complex bound to an antagonist (Eritoran) has been described (Kim et al., 2007b). Unlike other TLRs that recognize ligands directly, TLR4 does not directly interact with ligands. Instead, TLR4 forms a stable 1:1 heterodimer with MD-2 and uses the hydrophobic pocket in MD-2 to interact with the LPS of Gram-negative bacteria (Shimazu et al., 1999). Two accessory proteins, LPS-binding protein (LBP) and CD14, extract LPS from the bacterial membrane and transfer it efficiently to MD-2. The general structure of bacterial LPS consists of a hydrophobic lipid A domain, an oligosaccharide core and a distal polysaccharide (the O antigen; Bryant et al., 2010). The lipid A moiety alone is sufficient to activate innate immune responses. Lipid A consists of a diglucosamine diphosphate head group that is substituted with a variable number of acyl chains, ranging from four to eight. In general, lipid A moieties with six acyl chains and two phosphate groups are powerful immune stimulators, whereas lipid A with five acyl chains has ∼100-fold less activity. Several synthetic derivatives of lipid A have been developed as candidate drugs against sepsis and septic shock syndrome. Eritoran, or E5564, is a synthetic molecule derived from the lipid A component of the non-pathogenic LPS of *Rhodobacter sphaeroides*. This compound contains only four acyl chains, acts as a strong antagonist of the TLR4–MD-2 complex, and is currently in Phase III clinical trials (Mullarkey et al., 2003; Rossignol and Lynn, 2005).

Toll-like receptor-4 ECD has 22 LRRs capped by LRRNT and LRRCT at its N- and C-termini, respectively. MD-2 has a cup-like fold composed of antiparallel β sheets that form a large hydrophobic pocket, with a surface area of ∼1000 Å<sup>2</sup>, that is able to bind ligand. The opening of the pocket is lined with positively charged residues, and three disulfide bridges stabilize the cup-like structure. It should be noted that MD-2 has neither a transmembrane nor an intracellular domain; hence it is not able to transmit signals by itself. The recent TLR4–MD-2 complex structures clearly indicated that only one-third of MD-2 is involved in TLR4 binding, while the remaining part is available for interaction with ligands (Kim et al., 2007b; Park et al., 2009). The MD-2 binding site of TLR4 can be divided into two chemically and evolutionarily distinct areas, termed the A and B patches. The A patch is provided by the N-terminal domain of TLR4 and is mainly composed of negatively charged amino acids. The B patch is located in the central domain and is predominantly composed of positively charged residues. The TLR4 binding surface of MD-2 shows a clear charge complementarity to the TLR4 surface (**Figure 4E**). In the crystal structure, the four acyl chains of Eritoran occupy approximately 90% of the solvent accessible volume of the MD-2 pocket. Of those, two acyl chains are in the fully extended conformation within the binding pocket, while the remaining two acyl chains are bent in the middle (**Figure 4A**). The diglucosamine backbone is fully exposed to the solvent, and the phosphate groups make ionic contacts with positively charged residues at the surface of the pocket. Additionally, there is no direct interaction between Eritoran and TLR4 (Kim et al., 2007b). Indeed, this is very similar to the recently identified structure of MD-2 in complex with lipid IVA (**Figure 4B**; Ohto et al., 2007). Lipid IVA, or compound 406, is an intermediate in LPS biosynthesis that contains four lipid chains whose lengths and structures differ from those of Eritoran. Lipid IVA acts as an antagonist of human TLR4–MD-2, but behaves as an agonist of mouse TLR4–MD-2 (Means et al., 2000). Despite the significant structural differences between lipid IVA and Eritoran, their binding modes are similar. The structural superimposition of TLR4–MD-2–Eritoran and MD-2–lipid IVA has shown that lipid chains of different lengths are accommodated in the MD-2 pocket with only a 2-Å shift in the glucosamine backbone (Kim et al., 2007b). In both the lipid IVA–MD-2 and Eritoran–TLR4–MD-2 structures, the ligands did not induce any conformational changes in the receptors, consistent with the antagonistic activity of these molecules.

The much anticipated TLR4–MD-2–LPS complex has recently been solved (Park et al., 2009). The authors demonstrated that TLR4 and MD-2 associate with each other without LPS, but that dimerization of the TLR4–MD-2 complex with another TLR4–MD-2 occurs only upon binding of LPS. The receptor multimer is composed of two copies of the TLR4–MD-2–LPS complex arranged in a symmetrical fashion (**Figure 4D**). In the crystal structure, five of the six lipid chains of LPS bind to the MD-2 pocket, while the remaining lipid chain, which is exposed on the surface of MD-2, forms hydrophobic interactions with residues F440, F463, and L444 of the second TLR4 (**Figure 4C**). Mutation of the F440 and F463 interface residues disrupts TLR4 dimerization and signaling (Resman et al., 2009). The binding of LPS induces localized conformational changes in MD-2, primarily in the F126 loop region, which allow the hydrophilic residues in the F126 loop and R90 of MD-2 to form H-bonds and ionic interactions with the second TLR4, further stabilizing the complex. In addition to this major interaction, TLR4 makes an additional contribution to dimerization by directly interacting with the second TLR4 (**Figure 4F**). The previously solved structures of MD-2 bound to Eritoran and lipid IVA revealed that F126 of MD-2 remained exposed to the solvent; without this conformational change, the MD-2 complex was unable to induce TLR4 dimerization. Park et al. (2009) clearly demonstrated that the structural changes that occur mainly at the F126 loop of MD-2 following LPS stimulation are necessary for dimer formation and the subsequent initiation of downstream signaling. Mutation studies of the F126 residue of MD-2 support this finding. The mutation of F126 did not affect LPS binding; however, it abolished the ability of the TLR4–MD-2 heterodimer to form the activated heterotetramer, suggesting that this residue forms part of the dimerization region (Kobayashi et al., 2006; Kim et al., 2007b). Moreover, LPS contains two phosphate groups that are important for forming ionic interactions with positively charged residues on both TLR4 and MD-2. Comparison of LPS-bound MD-2 with Eritoran–MD-2 indicates that the additional two lipid chains in LPS displace the phosphorylated glucosamine backbone upward by 5 Å toward the solvent area, which allows the phosphate groups to associate with the second TLR4 (Park et al., 2009). In addition to this displacement, the glucosamine backbone is rotated by 180˚, interchanging the phosphate groups. It should be noted that there is a general rule for TLR signaling (based on structural and biochemical studies): TLR agonists induce TLR dimerization, whereas antagonists are likely to interfere with dimerization (Brodsky and Medzhitov, 2007).

Crystallographic studies have provided almost 50% of the mammalian TLR structures (TLR1, 2, 3, 4, and 6), which form a basis for understanding agonist-induced TLR activation and antagonist-mediated TLR inhibition. Each TLR member recognizes numerous ligands originating from microbes, and each ligand has its own unique properties. From this review, it is evident that the binding sites of these ligands cannot be the same in all TLRs. For example, TLR4 recognizes various ligands (**Table 1**), but the binding sites of those ligands are not the same as that of LPS in the TLR4–MD-2 complex. X-ray crystallographic studies have so far revealed only a limited number of TLR ECD interactions with ligands. The identification of all ligand interactions with each TLR member (listed in **Table 1**) using X-ray crystallography has proven to be very difficult. Hence, we have to rely on molecular modeling studies, along with biochemical validation, to gain further insights into these interactions.

#### **COMPUTATIONAL STUDIES OF THE TLR ECD**

To date, approximately 20 molecular modeling studies have investigated TLR signaling. These studies include: (i) prediction of TLR ECD structures using available TLR crystal structures as templates and identification of their possible ligand-binding regions; (ii) identification of the structural basis of positive and negative regulators in TLR signaling; and (iii) identification of the interaction between the TIR domain and its adaptor molecules, which provides structural insights into the mechanism responsible for TLR-mediated downstream activation or inhibition.

The first modeling study reported the structures of the mouse (m) and human (h) TLR4 ECD. These structures were generated using the first solved hTLR3 structure as a template (Kubarenko et al., 2007). The target–template alignment showed that the N-terminal and C-terminal domains aligned with the template, but the central domain did not align well. Hence, the alignment of this portion was conducted individually by matching LRRs in hTLR3. These subdomains (N-terminal, C-terminal, and individual LRRs) were manually assembled and subjected to MD simulation. The analysis revealed that the central domain of the TLR4 ECD (LRR9-13) is hypervariable between human and mouse. It should be noted that the ECDs of TLR7 and 9 are cleaved in the endolysosome to recognize ligands, and this cleaved form is necessary for Myd88 activation (Kawai and Akira, 2010; Basith et al., 2011b). Wei et al. (2009) generated structural models of the cleaved ligand-binding domains of TLR7, 8, and 9. Based on a comparison of the structures, they identified potential ligand-binding sites as well as possible configurations of the receptor–ligand complexes. In contrast, Kubarenko et al. (2010) modeled full length ECD structures of TLR7, 8, and 9. Structural comparison of these ECDs revealed that insertions mainly occur in the TLR9 loop regions (LRR2, 5, and 8), which contain primarily cysteine and a few proline residues (Kubarenko et al., 2010). Finally, biochemical studies of these loop insertion residues identified a functional role for C98, C110, P183, C184, C265, C268, and P269 in TLR9 signaling. The first modeling report to show ligand binding to a TLR ECD concerned TLR5, whose concave surface interacts with flagellin; biochemical studies showed that D296 and D367 of TLR5 are necessary for mediating this interaction (Andersen-Nissen et al., 2007).

Recently, the LRRML and TollML tools were designed to identify appropriate templates for each LRR and to annotate TLR primary sequences functionally, respectively (Wei et al., 2008; Gong et al., 2011). LRRML produces an alignment for each LRR along with templates that are subsequently used for homology modeling of LRR proteins. Generally, one or more full length proteins are used as templates for modeling. However, due to variations in the LRR numbers among TLRs, the similarity between the target and a full length template is usually too low for homology modeling. The LRRML tool was developed to address this issue. This tool currently contains 1261 individual LRRs (obtained from 112 PDB structures) that serve as local templates for each target. As a test case, the developers modeled the structure of the mouse TLR3 ECD after excluding the LRRs of the mouse and human TLR3 ECD from the LRRML dataset. The final 26-line multiple alignments, each comprising 25 template sequences and the target sequence, were used for modeling. Superimposition of the modeled TLR3 structure with the actual TLR3 crystal structure revealed an RMSD value of 1.9 Å, supporting the reliability of this modeling approach. This method has since been used to predict the structures of human TLR5–10 and mouse TLR11–13 (Wei et al., 2010). These models can be used to conduct ligand docking studies or to design mutagenesis experiments to investigate the TLR–ligand-binding mechanism. Recent studies by our group have suggested that Pam3CSK4 might be the ligand for the TLR2/10 complex and that Pam2CSK4 might activate TLR10/6 and the TLR10 homodimer. The predicted TLR10 complexes are similar to the available TLR1 family complexes. However, the binding orientation of the TLR10 homodimer was different due to the presence of a negatively charged surface near LRR11-14, which defined the specific binding pocket (Govindaraj et al., 2010). This was the first study to suggest possible ligands for TLR10. Our predictions were supported by recent biochemical studies showing that chimeric receptors [TLR10 ECD and TLR1 endodomain (TIR)] along with TLR2 recognize triacylated lipopeptides (Guan et al., 2010).
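The RMSD quoted above measures the average distance between equivalent atoms once the model and the crystal structure have been superimposed. As a minimal sketch of that calculation (not the procedure used by the LRRML developers), the following Python function computes the RMSD for two coordinate lists that are assumed to be already superimposed and listed in the same atom order. The function name and the coordinates are hypothetical and serve only as an illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points.

    Assumes the structures are already superimposed and atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical C-alpha coordinates (in Å) for three paired residues.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal = [(0.0, 1.0, 0.0), (3.8, 0.0, 2.0), (7.6, 0.0, 0.0)]
print(round(rmsd(model, crystal), 3))  # 1.291
```

In practice the optimal superposition would be computed first (for example with a least-squares fitting procedure), and the RMSD is usually reported over Cα atoms only.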

It is well known that lipid IVA acts as an agonist or antagonist of the TLR4–MD-2 complex, depending upon the species. To identify the basis of this species specificity, Walsh et al. (2008) conducted modeling studies and identified differences in primary sequences among species (mouse, cat, horse, and human). The mouse, cat, and horse receptors were able to induce signaling in response to lipid IVA, whereas the human receptor was not, a difference attributed primarily to conservative substitutions. However, such substitutions alone cannot be expected to have a large influence on the overall structure of the protein. Furthermore, they identified significant differences in the local charge distribution on the surfaces of MD-2 and TLR4 from different species, which suggests that electrostatic forces also govern the pharmacology of lipid IVA and thereby the transduction of TLR4 signaling. In general, the assembly of active TLR4 complexes is a stepwise process, with initial TLR4–MD-2 complex formation being induced by the binding of lipid IVA, which in turn promotes the subsequent homodimerization of receptor ECDs. In the modeled complex structure, LRR 15–17 modules were found to participate in the main dimerization interface of TLR4. Their modeling predictions and mutagenesis data proved remarkably accurate when the LPS-bound TLR4–MD-2 crystal structure was released (Park et al., 2009).

African swine fever virus (ASFV) encodes a novel protein (pI329L) that has been shown to inhibit the TLR3 signaling pathway. Modeling studies have shown that the structural arrangement of pI329L is similar to that of TLRs (Henriques et al., 2011); however, the ECD of pI329L is shorter than that of the TLRs. This protein forms a heterodimer with TLR3, thus acting like a decoy receptor, which indicates that the viral protein hinders TLR3 homodimerization and thereby inhibits the TRIF-mediated pathways. A recent study showed that the pentameric B subunit of type IIb *Escherichia coli* enterotoxin (LT-IIb-B5), a non-lipidated protein ligand, activates TLR2/1 signaling pathways. Molecular modeling along with mutagenesis studies (M69E, A70D, L73E, and S74D) showed that the upper pore of LT-IIb-B5 defines an interactive surface for binding to the concave surface of the TLR2/1 central domain (Liang et al., 2009). Unlike the triacylated lipopeptide in the TLR2–TLR1 complex, non-lipidated ligands cannot fit into the small hydrophobic channel; however, these ligands can engage in TLR surface interactions via specific residues.

### **TIR MEDIATED DOWNSTREAM ACTIVATION AND INHIBITION**

Toll-like receptor ECD activation leads to TIR dimerization of TLRs, which creates a specific scaffold for the binding of adaptor proteins such as Myd88, Mal, TRIF, and TRAM. This assembly of the TIR complexes activates the downstream signaling pathways, leading to the expression of pro-inflammatory cytokines, antiviral responses, and the initiation of adaptive immunity. To date, five mammalian TIR structures have been reported (TLR1, TLR2, TLR10, IL-1RAPL, and Myd88; Xu et al., 2000; Tao et al., 2002; Khan et al., 2004; Nyman et al., 2008; Ohnishi et al., 2009). All these TIR domains contain alternating β strands and α helices arranged as a central five-stranded parallel β sheet surrounded by α helices. The TIR domains of TLR1 and TLR2 exist as monomers in the crystal. Conversely, the TLR10 TIR domain without the extracellular and transmembrane regions behaves as a monomer in solution, but forms a homodimer in the crystal asymmetric unit. This structure has been used to represent the signaling dimer of TIRs. In the TLR10 TIR dimer interface, the BB-loop connecting the βB strand and the αB helix, and the death domain (DD) loop connecting the βD strand and the αD helix, have been reported to be important for downstream signaling. Moreover, the part of the BB-loop exposed to the surface is essential for the binding of the adaptor proteins during signal transduction (Nyman et al., 2008).

On the basis of the TLR10 TIR structure, a TLR4 TIR homodimer has been modeled computationally, which identified two symmetrically related interfaces that are potentially capable of binding the adaptors Mal and TRAM (**Figure 5**; Nunez Miguel et al., 2007). It is worth noting that the TLR4 TIR P681H polymorphism has been reported to abolish signaling in response to LPS. The location of P681 in the BB-loop highlights the importance of this loop in TIR dimerization. Moreover, this model indicates that two adaptors could bind simultaneously to the TLR4 TIR dimer. Another important question raised by this study is whether adaptor binding is mutually exclusive, that is, whether a single activated receptor complex recruits either Mal or TRAM, but not both simultaneously. Kagan et al. (2008) suggested that TLR4 signaling via Mal–Myd88 occurs at the plasma membrane, whereas signaling via TRAM–TRIF might be endosomal.

The crystal structures of bacterial (Chan et al., 2010) and plant TIR domains (Chan et al., 2009) are highly homologous to those of mammalian TIRs. In the bacterial TIR domain, the dimerization interface involves the DD loop but not the BB-loop (which is important for the TLR10 dimer). Chan et al. (2009) suggest that the BB-loop is not important for the homotypic interactions but may have a defined role in the heterotypic interactions with Mal or Myd88 TIR domains. Moreover, the available TIR structures lack the region immediately following their transmembrane segment, which makes it hard to predict their exact orientation. Myd88 contains an N-terminal DD that is separated from the C-terminal TIR domain by a short linker sequence. After binding of the Myd88 TIR domain to TLR TIR domains, the Myd88 DD can interact with DDs of IRAK family members to activate their downstream signaling cascades. Recently, Lin et al. (2010) solved the crystal structure of the complex formed by the DDs of Myd88, IRAK4, and IRAK2. This complex, known as the Myddosome, is a left-handed helical structure comprising, in order, 6 Myd88, 4 IRAK4, and 4 IRAK2 DDs. Like TIRs, DDs are small globular domains, but they have an antiparallel α helical fold rather than an α–β structure. Because a TLR TIR dimer recruits two Myd88 TIR domains, the larger Myddosome superhelix could possibly bridge several activated receptor dimers in a network (Gay et al., 2011). The S34Y and R98C polymorphisms in the human Myd88 DD interfere with Myddosome assembly and may contribute to susceptibility to infection (George et al., 2011).

**the activated TLR4 TIR domains.** The BB-loops in each TIR domain are highlighted in red. MAL and TRAM proteins are both predicted to bind to the TLR4 homodimer interface. It is probable that binding of MAL or TRAM protein is mutually exclusive, with the former binding to activated receptors at the cell surface and the latter in endosomes.

Single immunoglobulin interleukin-1 receptor TIR domain (SIGIRR) and ST2L belong to the TIR/IL-1R superfamily and act as negative regulators of Myd88-dependent TLR signaling. Specifically, these proteins attenuate the recruitment of Myd88 adaptors to the receptors via their intracellular TIR domains. Thus, these molecules are highly relevant to the treatment of autoimmune diseases caused by TLRs. Gong et al. (2010) proposed a residue-detailed structural framework of SIGIRR inhibiting the TLR4 and TLR7 signaling pathways. In their multimer complex, SIGIRR exerts its inhibitory effect by blocking the molecular interface of TLR4, TLR7, and Myd88 adaptors, mainly via its BB-loop region. Our group proposed a structural framework of ST2L inhibiting the TLR4, TLR2/1, and TLR2/6 signaling pathways (Basith et al., 2011a). Apart from this, our group investigated the structure-based modulation of IκB family proteins. These structurally similar proteins are activated by TLR signaling and have specific roles in the cytoplasm and the nucleus, where they interact with different subunits of the NF-κB dimer. Although their structures are similar, the basis of the binding specificities of these proteins remains unknown. Modeling studies have indicated that variation in charged surfaces among the IκB proteins, together with differences in the positions of flexible residues, might be the chief factors determining IκB protein binding specificities (Manavalan et al., 2010, 2011).

### **CONCLUSION**

In the past few years, there has been tremendous progress in the study of the interaction of TLRs with their ligands and activators. Herein, we have discussed recent structural information regarding the TLR family and its proposed activation and inhibition mechanisms. Recent crystallographic studies of TLR1/2, 2/6, 4, and 3 have provided an explanation for *in vivo*, *in vitro*, and clinical observations. The solved structures have demonstrated that these TLRs exist as monomers in solution and that dimerization takes place only upon ligand binding. Conversely, TLR8 and 9 exist as preformed dimers that subsequently change conformation upon ligand binding. The solved (TLR1, 2, 6, 3, and 4) and modeled TLR ECD structures appear to have a common fold that belongs to the well known LRR family with repeated LRR modules. Sequence and structural analyses indicate that TLRs present at the cell surface (TLR1, 2, 4, 6, and 10) belong to a three-domain subfamily that binds hydrophobic ligands such as lipoprotein, LTA, and LPS. Conversely, TLRs present in the endolysosome (TLR3, 7, 8, and 9) belong to a single-domain subfamily that interacts with hydrophilic proteins or nucleic acids. Ligand-induced dimerization brings the juxtamembrane sequences at the C-termini of the ECDs into close proximity. This change is then transmitted across the transmembrane region, resulting in reorientation or homodimerization of the receptor TIR domains. The homodimeric receptor TIR domains provide specific molecular surfaces for the recruitment of adaptor TIR domains. Although the structures of the TLRs are similar, the binding pockets and electrostatic surfaces are not conserved among these receptors. These variations are essential for ligand discrimination by each TLR family member. For example, triacylated lipopeptides bind to the hydrophobic binding pocket of TLR1/2; however, the LT-IIb-B5 protein binds to the same receptor on another surface rather than in the hydrophobic pocket. This is primarily because ligands with different properties (lipids and proteins) present different patterns, so that the binding sites of ligands vary among TLRs according to the surfaces and cavities provided by the receptors.

It is essential that we continue to develop a thorough and detailed understanding of the structural and molecular interactions of the ligands listed in **Table 1** with their corresponding TLR family members. Such studies facilitate the rational design of receptor agonists and antagonists, leading to potential improvements in the treatment of diseases. However, there are still many important unanswered questions about TLR signaling. For example, the conformation of the transmembrane spanning segment once the TLR ECDs are activated is not known. The process leading to the recruitment of adaptor proteins following TLR activation is also not clear. Furthermore, it is not known whether other ligands bind to the receptors in the same orientation and induce an "m" shaped dimer similar to that seen in crystal structures. TLR4 activation requires a co-receptor such as MD-2, but further work is needed to determine whether this mechanism holds true for designed TLR4 agonists and whether these synthetic agonists also need a co-receptor to bind the receptor. The recent advances that have been made in structure–function analyses should allow many of these questions to be resolved in the near future.

### **ACKNOWLEDGMENTS**

This work was supported by the Basic Science Research Program through the NRF of Korea funded by the MEST (2010-0016256). This work was also partly supported by a grant from the Korea Food and Drug Administration (10182KFDA992) and the Priority Research Centers Program (NRF 2010-0028294).

### **REFERENCES**


receptor domain of human IL-1RAPL. *J. Biol. Chem.* 279, 31664–31670.


concentration. *J. Exp. Med.* 185, 1951–1958.


C. C. (2007). Biochemical and functional analyses of the human Toll-like receptor 3 ectodomain. *J. Biol. Chem.* 282, 7668–7678.


*Struct. Biol.* 8, 47. doi: 10.1186/1472-6807-8-47


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 26 April 2011; paper pending published: 06 June 2011; accepted: 11 July 2011; published online: 27 July 2011.*

*Citation: Manavalan B, Basith S and Choi S (2011) Similar structures but different roles – an updated perspective on TLR structures. Front. Physio. 2:41. doi: 10.3389/fphys.2011.00041*

*This article was submitted to Frontiers in Systems Physiology, a specialty of Frontiers in Physiology.*

*Copyright © 2011 Manavalan, Basith and Choi. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

# Measurement of TLR-induced macrophage spreading by automated image analysis: differential role of Myd88 and MAPK in early and late responses

### *Jens Wenzel<sup>1</sup>, Christian Held<sup>2</sup>, Ralf Palmisano<sup>3</sup>, Stefan Teufel<sup>4</sup>, Jean-Pierre David<sup>4</sup>, Thomas Wittenberg<sup>2</sup> and Roland Lang<sup>1</sup>\**

*<sup>1</sup> Immunology and Hygiene, Institute of Clinical Microbiology, University Hospital Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Germany*

*<sup>2</sup> Department of Image Processing and Biomedical Engineering, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany*

*<sup>3</sup> Department of Biology, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Germany*

*<sup>4</sup> Medical Clinic 3, University Hospital Erlangen, Friedrich-Alexander Universität Erlangen-Nürnberg, Erlangen, Germany*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Sangdun Choi, Ajou University, South Korea; Konrad Alexander Bode, University of Heidelberg, Germany*

#### *\*Correspondence:*

*Roland Lang, Immunology and Hygiene, Institute of Clinical Microbiology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Wasserturmstr. 3-5, 91054 Erlangen, Germany. e-mail: roland.lang@uk-erlangen.de*

Sensing of infectious danger by toll-like receptors (TLRs) on macrophages causes not only a reprogramming of the transcriptome but also changes in the cytoskeleton important for cell spreading and motility. Since manual determination of cell contact areas from fluorescence micrographs is very time-consuming and prone to bias, we have developed and tested algorithms for automated measurement of macrophage spreading. The two-step method combines identification of cells by nuclear staining with DAPI and cell surface staining of the integrin CD11b. Automated image analysis correlated very well with manual annotation in resting macrophages and early after stimulation, whereas at later time points the automated cell segmentation algorithm and manual annotation showed slightly larger variation. The method was applied to investigate the impact of genetic or pharmacological inhibition of known TLR signaling components. Deficiency in the adapter protein Myd88 strongly reduced spreading activity at the late time points, but had no impact early after LPS-stimulation. A similar effect was observed upon pharmacological inhibition of MEK1, the kinase activating the mitogen-activated protein kinases (MAPK) ERK1/2, indicating that ERK1/2 mediates Myd88-dependent macrophage spreading. In contrast, macrophages lacking the MAPK p38 were impaired in the initial spreading response but responded normally 8–24 h after stimulation. The dichotomy of p38 and ERK1/2 MAPK effects on early and late macrophage spreading raises the question of which of the respective substrate proteins mediate(s) cytoskeletal remodeling and spreading. The automated measurement of cell spreading described here increases the objectivity and greatly reduces the time required for such investigations and is therefore expected to facilitate larger throughput analysis of macrophage spreading, e.g., in siRNA knockdown screens.

**Keywords: macrophage, spreading, TLR, image analysis**

### **INTRODUCTION**

Macrophages reside in all tissues and play an important role in tissue remodeling and homeostasis by phagocytosis and digestion of dead cells and cellular debris. Their second function as sentinels for infectious danger is embodied by the expression of pattern recognition receptors (PRR) for pathogen-associated molecular patterns. The best characterized group of PRR is the toll-like receptor (TLR) family, which includes TLR9 as the receptor for CpG-rich bacterial DNA and TLR4, which together with MD2 forms the receptor for the lipopolysaccharide from the cell wall of Gram-negative bacteria (Kawai and Akira, 2010). Triggering of TLR family members by microbial ligands, e.g., during phagocytosis of bacteria, induces a rapid and massive transcriptional response engendering the inflammatory response to infection. A bottleneck in the signal transduction of TLR is the adapter protein Myd88, which binds to the intracellular TIR domain of most TLR and recruits further adapters (e.g., TRAF6) and kinases (e.g., IRAK1; IRAK4; Kawai et al., 1999; Kawai and Akira, 2010). The major signaling modules activated by TLR are the IKK complex, leading to NFκB translocation to the nucleus, and the cascade of mitogen-activated protein kinases (MAPK). MAPK family members expressed in macrophages are ERK1/2, JNK1/2, and p38. These MAPK control the transcriptional response to TLR ligation through phosphorylation-mediated activation of transcription factors (e.g., AP-1, CREB, and many others; Lang et al., 2006); in addition, the plethora of MAPK substrate proteins is involved in diverse cellular processes including cell motility, adhesion, and phagocytosis (Schmidt et al., 2001; Blander and Medzhitov, 2004; West et al., 2004; Kang et al., 2008).

In a recent global and quantitative analysis of TLR4-driven phosphorylation events in primary macrophages, we identified more than 1800 phosphoproteins containing nearly 7000 phosphorylation sites. LPS-stimulation caused reproducible changes in the phosphorylation of around a quarter of all sites both early (15 min) and late (4 h) after stimulation (Weintz et al., 2010). Bioinformatic analysis of the regulated phosphoproteome showed an enrichment of known TLR signaling pathway molecules, but also revealed that genes annotated in Gene Ontology as cytoskeletal and actin-binding proteins were enriched and thus are a hotspot of TLR-induced phosphorylation. The set of cytoskeleton-associated proteins showing phosphorylation after TLR4 triggering includes (among others) adapters like Arp3, Paxillin, and Vasp, kinases like Ptk2, and motor proteins like Myo1e, Myo1f, and Myo9b. Some of these proteins possess well characterized functions in phagocytosis, cell spreading, and motility in macrophages or other cell types (Schaller, 2001; Sechi and Wehland, 2004; Kim et al., 2006; Hanley et al., 2010). Others have not yet been associated functionally with cytoskeleton-based macrophage responses.
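Enrichment of an annotation category among regulated phosphoproteins is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below illustrates that calculation; it is not necessarily the method used by Weintz et al. (2010), the function name is hypothetical, and all counts other than the total of 1800 phosphoproteins are made-up placeholders.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, annotated_in_sample):
    """P(X >= annotated_in_sample) when drawing `sample` items without replacement
    from `population` items, of which `annotated` carry the annotation."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total

# Hypothetical counts: 1800 phosphoproteins, 150 annotated as cytoskeletal,
# 450 regulated by LPS, 60 of which are cytoskeletal.
p = hypergeom_enrichment_p(1800, 150, 450, 60)
print(f"enrichment p-value: {p:.3g}")
```

With these placeholder numbers about 37.5 cytoskeletal proteins would be expected among the regulated set by chance, so observing 60 yields a small p-value; real analyses would also correct for testing many categories.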

In order to better understand the control of TLR-driven changes in phagocytosis and macrophage spreading by signaling molecules and components of the cytoskeleton, quantitative readout systems are needed. Ideally, such assay systems should have the potential for high-throughput analysis, for example to test the effects of siRNA knockdown of a larger group of candidate genes. The spreading response of cells to stimulation increases the contact area with the cell culture support material, an effect that can be easily visualized by fluorescence microscopy. Quantitation of spreading responses from many cells can be done manually, e.g., using ImageJ software (Girish and Vijayalakshmi, 2004; Collins, 2007). However, this approach has the drawback of being very labor-intensive and time-consuming, which precludes the analysis of multiple samples in a reasonable amount of time. In addition, the manual delineation of the cell borders introduces potential bias. In contrast, automatic image analysis has the potential to circumvent these drawbacks and is therefore highly desirable. Freely available software packages such as CellProfiler (Carpenter et al., 2006) are able to perform automated image analysis. The software can be adapted to the image content by adjusting the minimum and maximum cell diameter and by choosing a threshold selection method. Furthermore, various pre-processing methods exist that can improve image quality. However, the requirement to adjust all these parameters until satisfactory results are obtained makes their use very time-consuming for researchers not trained in image processing. Thus, we have developed algorithms to segment nuclei and to determine cell size and contact areas (Held et al., submitted; Held et al., 2011). Here, these methods were applied and optimized to investigate macrophage spreading in response to TLR4 stimulation and its control by the Myd88 adapter protein and the MAPK family members ERK1/2 and p38.
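To make the idea of threshold-based measurement of contact areas concrete, the following minimal sketch thresholds a grayscale image, labels connected foreground regions by flood fill, and reports their areas in pixels. It is only an illustration under simplifying assumptions: it is neither the two-step algorithm of Held et al. nor the CellProfiler pipeline, and the function name, the toy image, the threshold, and the `min_area` filter are all hypothetical.

```python
from collections import deque

def region_areas(image, threshold, min_area=1):
    """Return the pixel areas of 4-connected regions with intensity >= threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                areas.append(area)
    return areas

# Hypothetical 5 x 6 intensity image containing two bright "cells".
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 9, 9, 0, 7, 0],
    [0, 0, 0, 0, 7, 0],
    [0, 0, 0, 0, 0, 0],
]
print(region_areas(image, threshold=5))  # [4, 2]
```

Pixel areas would be converted to physical units using the pixel size of the microscope, and touching cells would require an additional separation step, for example seeding regions from the stained nuclei.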

### **MATERIALS AND METHODS**

### **MICE**

Colonies of C57BL/6, Myd88−/− (Kawai et al., 1999), and p38flox/flox; Mx-Cre (Engel et al., 2005; Bohm et al., 2009) mice were maintained at the Franz Penzoldt Center of the Medical Faculty at the University of Erlangen-Nuremberg. p38flox/flox; Mx-Cre mice were injected three times i.p. at week 5 with poly I:C (13 mg/kg body weight) for depletion of p38α in hematopoietic cells as described (Bohm et al., 2009).

### **REAGENTS**

LPS (*Escherichia coli* O111:B4) and Cytochalasin D were purchased from Sigma-Aldrich. MEK1 inhibitor was obtained from Selleck Chemicals. Antibodies to phosphorylated (p)-Erk1/2, p38, p-p38, pMAPKAPK2 (p-MK2) were obtained from Cell Signaling Technology and to Grb2 from BD Biosciences.

### **MACROPHAGE DIFFERENTIATION**

Mouse bone marrow was flushed out of the prepared femurs and tibiae with sterile ice-cold PBS. After 5 min of erythrocyte lysis at room temperature with NH<sub>4</sub>Cl (0.15 M), cells were washed in complete Dulbecco's modified Eagle's medium (cDMEM; 0.05 mM β-mercaptoethanol, 1% Pen/Strep, 10% FCS). After overnight depletion of adherent cells, non-adherent cells were further incubated in cDMEM containing 10% L-cell conditioned medium (LCCM) as a source of M-CSF for 6 days on 10 cm bacteriological plates at 37 °C, 5% CO<sub>2</sub>. After 3 days, 5 ml of cDMEM + 10% LCCM was added to the cell culture.

### **STIMULATION OF MACROPHAGES**

Differentiated macrophages were washed with 10 ml sterile PBS, incubated with accutase, collected and washed twice with cDMEM. Macrophages were seeded at a density of 5 × 10<sup>4</sup> and 2.5 × 10<sup>4</sup> cells/well on eight-well Permanox chamber slides (Nalge Nunc International). After an overnight cultivation at 37 °C cells were stimulated with 100 ng/ml LPS for different time points and finally fixed for 20 min with 2% PFA. Supernatants of stimulated cells were collected for examination of cytokine secretion by ELISA.

### **WESTERN BLOT**

To determine the efficiency of the MEK1 inhibitor and of p38 deletion, parallel cultures with 1 × 10<sup>6</sup> cells/well were cultured in 1 ml cDMEM in 12-well plates. Cell lysates were prepared at the indicated time points after LPS-stimulation. 25 μl of lysate were loaded on a 10% PAA gel containing SDS. After blotting, the membrane was blocked in TBS buffer containing 3% BSA and 0.01% Tween20 and proteins were detected with the respective antibodies.

### **ELISA**

Supernatants from the spreading experiments were collected from the respective conditions and cytokine concentration was measured by ELISA kits from R&D. Samples were treated as described in the manufacturer's instructions.

### **STAINING PROCEDURE**

For visualization of the cell surface by fluorescence microscopy, macrophages were incubated for 30–45 min at room temperature with 1 μg/ml Allophycocyanin (APC)-labeled anti-mouse CD11b antibody (BioLegend) in PBS containing 2% FCS. Cells were washed twice with PBS and nuclei were stained by addition of 1 μg/ml DAPI (Sigma D8417) in PBS and incubation for 10 min at room temperature. For further sample preparation, cells were washed again with PBS, mounted in 70% glycerol and covered.

### **IMAGE ACQUISITION AND PROCESSING**

Images were acquired using a Zeiss AxioVert 200M (Germany) widefield microscope connected to an AxioCam MRm camera. Images were taken using a 20× objective and post-processed by contrast and brightness enhancement within the AxioVision 4.8.2 software (Carl Zeiss MicroImaging). CD11b–APC stained macrophages were imaged using a CY5 filter set from AHF (Germany).

### **ALGORITHM DESCRIPTION**

Staining of the cell nuclei facilitates the unequivocal definition of individual cells in macrophage cultures containing cells in contact with each other. This information about the nuclei was also included for automatic segmentation, i.e., the definition of macrophage cell bodies against the background and from other cells. The resulting two-step segmentation method consists of the segmentation of all cell nuclei based on the DAPI stain, followed by the segmentation of the macrophages according to the CD11b cell surface signal.

### **SEGMENTATION AND SEPARATION OF NUCLEI**

For the automatic segmentation of the nuclei, a watershed transform-based segmentation routine (Roerdink and Meijster, 2000) was applied. For pre-processing, Gaussian smoothing was applied to reduce the noise level in the image. After shading correction, an additional pre-processing filter is applied which facilitates the splitting of touching cell nuclei. This algorithm will be described in detail elsewhere (Held et al., submitted). In brief, this filter uses adaptive weighting of the local principal curvature *C*<sub>P</sub>. As boundaries between touching cells usually hold *C*<sub>P</sub> > 0, intensities of these pixels are reduced while other pixels are preserved. The result of this operation is denoted as modified curvature *C*<sub>P</sub><sup>−</sup>:

$$C_{\mathrm{P}}^{-} = \begin{cases} C_{\mathrm{P}}, & \text{if } C_{\mathrm{P}} > 0 \\ 0, & \text{else} \end{cases}.$$

After normalization to the range of [0,1] the modified curvature *C*<sub>P</sub><sup>−</sup> can be used as a weight map for the input image *I*, yielding a filtered image *I*<sub>F</sub>:

$$I_{\mathrm{F}} = \frac{\alpha + (1 - C_{\mathrm{P}}^{-})^{p}}{1 + \alpha},$$

where the parameters α and *p* determine the strength of the applied filter. After filtering, the nuclei are separated from the image background by a k-means clustering-based threshold selection method (Hartigan and Wong, 1979). Holes in the resulting binary image were filled and a distance transform (Saito and Toriwaki, 1994) was applied to incorporate prior knowledge on the morphology of the nuclei. For separation of nuclei touching each other a watershed transform was applied to the distance image.
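The curvature weighting and the subsequent threshold selection can be illustrated with a minimal sketch on a one-dimensional intensity profile. This is not the published implementation: the function names, the example profile, the curvature values, and the settings α = 0.5 and *p* = 2 are hypothetical, and the sketch assumes that the weight from the equation above is multiplied pixel-wise with the input intensity, which the text suggests ("weight map for the input image") but the printed formula does not show explicitly. The two-cluster k-means is a simplified one-dimensional stand-in for the cited threshold selection method.

```python
from typing import List, Sequence


def modified_curvature(c_p: float) -> float:
    """C_P^-: keep positive principal curvature, set everything else to 0."""
    return c_p if c_p > 0 else 0.0


def normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def filter_weight(c_minus: float, alpha: float, p: float) -> float:
    """Weight (alpha + (1 - C_P^-)^p) / (1 + alpha) for a normalized C_P^-."""
    return (alpha + (1.0 - c_minus) ** p) / (1.0 + alpha)


def kmeans_threshold(values: Sequence[float], iterations: int = 50) -> float:
    """Two-cluster 1D k-means; returns the midpoint between the two centers."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        dark = [v for v in values if abs(v - low) <= abs(v - high)]
        bright = [v for v in values if abs(v - low) > abs(v - high)]
        if not bright:
            break  # all values identical, nothing to separate
        new_low = sum(dark) / len(dark)
        new_high = sum(bright) / len(bright)
        if (new_low, new_high) == (low, high):
            break  # converged
        low, high = new_low, new_high
    return (low + high) / 2.0


# Hypothetical profile across two touching nuclei; index 4 is the boundary,
# where the principal curvature is positive.
intensity = [5, 6, 80, 90, 70, 85, 88, 6, 5]
curvature = [0.0, -0.1, -0.5, -0.8, 2.0, -0.7, -0.4, -0.1, 0.0]

c_minus = normalize([modified_curvature(c) for c in curvature])
# Assumption: the weight is applied multiplicatively to the input intensity.
filtered = [i * filter_weight(c, alpha=0.5, p=2) for i, c in zip(intensity, c_minus)]
threshold = kmeans_threshold(filtered)
mask = [v > threshold for v in filtered]
```

In this toy profile the boundary pixel is attenuated to one third of its intensity and falls below the threshold, so the binary mask already shows two separate objects before any distance transform or watershed step.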

### **SEGMENTATION OF CELLS**

After segmentation of the cell nuclei, the contact area of the macrophage with the slide was segmented. Details on this algorithm have been described elsewhere (Held et al., 2011). In brief, analogous to the nuclei image, the macrophage image was smoothed with a Gaussian filter and a shading correction was performed. Afterward, the cells were separated from the image background by application of the k-means clustering algorithm. Note that a different number of clusters was used for the separation of "background and cells" and "background and nuclei." For separation of the cells a gradient magnitude based fast marching level set method (Sethian, 1999) was performed, using the segmented cell nuclei as initialization.
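The idea of growing cell regions outward from the segmented nuclei can be sketched as follows. This is a deliberately simplified illustration and not the fast marching level set method used in the study: it replaces fast marching with a Dijkstra-style front propagation on the pixel grid, and the function name `propagate_labels`, the cost values, and the toy image are hypothetical. Each foreground pixel is assigned to the nucleus whose front reaches it first, with fronts slowing down where the entry cost (standing in for the gradient magnitude) is high.

```python
import heapq
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]


def propagate_labels(
    cost: List[List[float]],
    foreground: List[List[bool]],
    seeds: Dict[int, List[Pixel]],
) -> List[List[int]]:
    """Assign each foreground pixel to the seed whose front arrives first.

    cost[r][c] is the cost of entering a pixel (e.g. 1 + gradient magnitude),
    so fronts slow down at strong edges. Label 0 means background/unreached.
    """
    rows, cols = len(cost), len(cost[0])
    arrival = [[float("inf")] * cols for _ in range(rows)]
    labels = [[0] * cols for _ in range(rows)]
    heap: List[Tuple[float, int, int, int]] = []
    for label, pixels in seeds.items():
        for r, c in pixels:
            arrival[r][c] = 0.0
            labels[r][c] = label
            heapq.heappush(heap, (0.0, r, c, label))
    while heap:
        t, r, c, label = heapq.heappop(heap)
        if t > arrival[r][c]:
            continue  # stale queue entry, a faster front already got here
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and foreground[nr][nc]:
                nt = t + cost[nr][nc]
                if nt < arrival[nr][nc]:
                    arrival[nr][nc] = nt
                    labels[nr][nc] = label
                    heapq.heappush(heap, (nt, nr, nc, label))
    return labels


# Toy one-row image: two nuclei (seeds) at the ends of a cell cluster, a strong
# edge at column 4, and one background pixel at column 8.
cost = [[1, 1, 1, 1, 9, 1, 1, 1, 1]]
foreground = [[True] * 8 + [False]]
seeds = {1: [(0, 0)], 2: [(0, 7)]}
labels = propagate_labels(cost, foreground, seeds)
```

In the toy example the two fronts meet at the high-cost pixel, which is where the border between the two cells is placed; the background pixel keeps label 0.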

### **STATISTICAL ANALYSIS**

If not described otherwise in the figure legend, results were expressed as means ± SEM of at least 300 cells per condition. Graphs were generated with GraphPad Prism and statistical significance was determined with Student's *t* test for unpaired conditions (ns = not significant; \* *p* < 0.05; \*\*\* *p* < 0.0001).
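For readers who wish to reproduce the summary statistics outside GraphPad Prism, the following minimal sketch computes the mean ± SEM and the unpaired Student's *t* statistic with pooled variance. The function names and the contact area values are hypothetical and serve only as an illustration; the *p* value itself is not computed here because the Python standard library does not provide the *t* distribution.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def unpaired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1.0 / na + 1.0 / nb))
    return t, df


# Hypothetical contact areas (arbitrary units) for five cells per condition
control = [100, 110, 90, 105, 95]
lps = [200, 210, 190, 220, 180]

control_mean, control_sem = mean_sem(control)
t_stat, dof = unpaired_t(lps, control)
```

The resulting *t* statistic and degrees of freedom can be looked up in a *t* table or passed to a statistics package to obtain the *p* value.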

### **RESULTS**

To quantitatively measure spreading responses of macrophages to TLR stimulation, the contact area of the cell to the support material needs to be determined. In order to define the cell borders against the background of the slide, we tested different staining approaches for the best discrimination. In our hands, using bone marrow derived mouse macrophages cultured on Permanox chamber slides, the staining of the integrin CD11b with an APC-labeled antibody resulted in more even staining than the lipid-staining molecule PKH and in a better signal-to-noise ratio than the cytosolic dye CFSE (data not shown). Therefore, CD11b staining of the macrophage cell surface was used for definition of the contact area to the slide and combined with a DAPI staining of the cell nuclei to clearly identify individual cells on fluorescent microscopy images (**Figure 1A,B**). The spreading response of macrophages to stimulation with the TLR4 agonist LPS was examined in a kinetic analysis using time points between 1 h and 24 h (**Figure 1C**). A small but significant increase in the contact area of 10–15% was observed already 1 h after addition of 100 ng/ml LPS. Over time, macrophages continued to spread on the Permanox surface and extended the contact area to approximately twice the initial size after 8–24 h. In most experiments, the maximum effect was observed at 24 h; in some experiments, a peak was reached already 8 h after LPS-stimulation (see below). The increase in the macrophage contact area with the Permanox surface was completely prevented when the inhibitor of actin polymerization Cytochalasin D was added to the cultures before addition of LPS (**Figure 1C**).

The manual annotation of contact areas from the fluorescence microscopy images is very time-consuming, prohibiting the performance of experiments with multiple conditions. Therefore, we applied the automated two-step segmentation algorithm described in the section "Materials and Methods" to the raw image data from resting and LPS-stimulated macrophages (**Figure 2**). The results of the automated segmentation are displayed by the software and the annotation of the cell borders is highlighted (**Figure 2B**), allowing quality control and manual editing by the user (**Figure 2C**). While for most cells, the segmentation obtained by the software was found to be correct upon inspection, in the case of overlapping cells some manual editing of the contact area annotation was required (arrows in **Figure 2C**).

**FIGURE 1 |** … eight-well chamber slides and rested overnight before addition of LPS (100 ng/ml). The inhibitor of actin polymerization Cytochalasin D (5 μg/ml) was added 1 h prior to LPS-stimulation. At the indicated time points after addition of LPS, slides were processed for staining of CD11b and nuclei, followed by fluorescence microscopy. **(A,B)** Representative images from control and 24 h time points. Scale bar = 50 μm. **(C)** Quantitation of cell spreading by manual annotation of CD11b staining. Shown are mean and SEM of at least 300 cells per condition from one representative experiment. Media control (open circles), LPS (closed circles), and LPS in the presence of Cytochalasin D (closed squares). Statistical significance refers to LPS-treated compared to untreated condition. ns = not significant; \* *p* < 0.05; \*\*\* *p* < 0.0001.

To test the accuracy and reliability of the automatic segmentation algorithm, we directly compared the distribution of contact areas on a large number of macrophages under resting and LPS-activated conditions using automated segmentation by the software tool versus manual annotation as the gold standard (**Figure 3A**). In this comparison based on the image data from a single experiment, both methods yielded very similar distribution of contact areas and median values in resting cells and early after LPS-stimulation. At the later timepoints, the automatic segmentation algorithm produced a "shrinkage effect" with a smaller median value compared to the manual annotation. To determine whether this "shrinkage effect" is generic at later timepoints, we extended this comparison analysis to a series of six independent experiments (**Figure 3B**). Comparison of the median contact area values shows a consistent high agreement between the automated method and the manual annotation in resting macrophages and early (2 h) after stimulation. At the later timepoints, the differences obtained by both methods tended to become larger; however, in addition to "shrinkage effects" there were also "blow-up effects," i.e., examples where automatic annotation gives larger values compared to the manual annotation method. Overall, we observed a good agreement between the results obtained by automated segmentation and manual annotation even at the later timepoints. Therefore, we employed the automated method to investigate TLR-triggered macrophage spreading and its control by canonical TLR pathway molecules; for comparison and validation of the method, the time-consuming manual editing of the automatic annotated data was included in each experiment.
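The comparison of median contact areas between automated and manual annotation can be expressed as a relative deviation, as in the following minimal sketch. The function names, the 5% tolerance, and the example areas are hypothetical and were not used in the study; they only illustrate how a "shrinkage effect" or "blow-up effect" could be flagged per condition.

```python
import statistics
from typing import Sequence


def relative_deviation(automated: Sequence[float], manual: Sequence[float]) -> float:
    """Relative difference of the automated median from the manual median."""
    median_manual = statistics.median(manual)
    return (statistics.median(automated) - median_manual) / median_manual


def classify(deviation: float, tolerance: float = 0.05) -> str:
    """Label a deviation; the tolerance is an arbitrary illustrative cut-off."""
    if deviation < -tolerance:
        return "shrinkage"
    if deviation > tolerance:
        return "blow-up"
    return "agreement"


# Hypothetical contact areas (arbitrary units) for one late time point
manual_areas = [400, 420, 450, 500, 520]
automated_areas = [380, 390, 400, 430, 470]

deviation = relative_deviation(automated_areas, manual_areas)
effect = classify(deviation)
```

With these example values the automated median is about 11% below the manual median, which the sketch labels as shrinkage.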

Most TLRs employ the adapter protein Myd88 for activation of the major signaling pathways leading to gene expression and cytokine secretion. The MAPK p38 and ERK1/2 are activated by TLR stimuli and contribute to cellular responses through phosphorylation of transcription factors and other substrate proteins. We employed macrophages with genetic deletion of Myd88 and p38, and pharmacological inhibition of the ERK1/2 kinase MEK1, to determine the control of LPS-induced macrophage spreading by these molecules. The effect of genetic deletion and pharmacological inhibition on the cytokine response to LPS was examined using supernatants from the cultures used for analysis of cell spreading (**Figure 4**). In the absence of Myd88, secretion of the pro-inflammatory TNF, IL-6, and IL-12p40, as well as of anti-inflammatory IL-10 was almost completely absent. In contrast, the inhibition or genetic ablation of the MAPK family members had more subtle effects. TNF is secreted rapidly after LPS-stimulation; its levels were moderately (40–60%) decreased by the MEK1 inhibitor PD184352 or in p38-deficient macrophages throughout the time course (**Figure 4A**). In contrast, IL-6 is secreted later and was less affected by interference with MAPK signals (**Figure 4B**). Of interest, IL-12p40 was increased upon inhibition of MEK1 or in p38-deficient macrophages (**Figure 4C**). IL-10 production was severely impaired in the absence of p38, but also blocked considerably by MEK1 inhibition (**Figure 4D**).

**FIGURE 2 | Segmentation and annotation of macrophages with two-step segmentation software.** Fluorescence microscopy images of CD11b–APC- and DAPI-stained macrophages were uploaded into the software **(A)**, nuclear segmentation and contact area annotation was performed by the software tool **(B)**, and finally checked and corrected by the user **(C)**. Arrowheads indicate shrinkage effects; arrows point to overlapping cells necessitating manual editing of the annotation. Cells were imaged by a 20× objective. Scale bar represents 50 μm.

We next determined the spreading response of macrophages after LPS-stimulation in the absence of Myd88 (**Figure 5**). In a comparison of Myd88 heterozygous and knockout macrophages, we observed that the strong induction of macrophage spreading at the late 8 and 24 h timepoints was indeed severely impaired in the absence of Myd88. However, Myd88−/− macrophages did respond nearly as well to LPS at the early timepoint of 2 h as the heterozygous control cells. Although there were slight differences in the absolute values, this pattern of responsiveness was robustly identified by both segmentation methods (left and right panel in **Figure 5**). Myd88−/− macrophages showed a complete lack of cytokine secretion when supernatants from the chamber slide cultures were examined by ELISA (**Figure 4**). The conserved early spreading response in the absence of Myd88 is therefore likely independent of cytokine secretion. Furthermore, it indicates that other signaling molecules (e.g., the adapter TRIF) may play an important role in cytoskeletal rearrangement after LPS.

The second generation MEK1 inhibitor PD184352 has been reported to have a higher specificity than the older reagents U0126 and PD98059 for MEK1 over MEK5 (Grill et al., 2004), and was therefore used here. Pretreatment of macrophages with PD184352 strongly reduced the levels of basal and LPS-induced phosphorylation of ERK1/2 without obvious impact on the phosphorylation of p38 MAPK (**Figure 6A**). MEK1 inhibition had no effect on basal macrophage spreading, and did not change the significant spreading observed at the 2 h timepoint (and in the single experiment also at the 1 h and 4 h timepoints). However, the further increase in spreading at the 8 h and 24 h timepoints was prevented; in fact, similar to the effect of Myd88 deficiency, MEK1 inhibition caused a reduction of the macrophage contact area between 8 h and 24 h (**Figure 6B**). Again automatic annotation and manual editing gave basically identical results.

A possible contribution of the p38 MAPK to LPS-induced macrophage spreading was investigated using macrophages derived from the bone marrow of conditional p38flox/flox; Mx-Cre mice treated with poly I:C to induce deletion of p38. As shown in **Figure 7A**, deletion of p38 was very efficient in mice expressing Cre; consequently, phospho-p38 was undetectable. In contrast, the early activation of ERK1/2 was unchanged in the absence of p38, whereas at later timepoints an even stronger ERK1/2 activation was apparent. The spreading of macrophages in response to LPS was only moderately affected by the absence of p38 (**Figure 7B**). At the early 2 h timepoint, p38−/− macrophages showed a significantly attenuated increase in the contact area. This effect was seen with the automatic annotation method as well as after manual editing of the segmentation results. However, by 8 h and 24 h this difference between p38−/− and p38+/+ macrophages had nearly vanished and p38−/− macrophages displayed spreading behavior similar to that of p38+/+ cells. The 8 h time point in this experiment is the only instance where the slight variation between the automated image analysis and the manual editing method led to a different result, in that a significant effect of p38-deficiency was found with the automated annotation but not after manual editing. Together, p38 appears to be required for maximal early spreading, but in marked contrast to the strong effects observed for Myd88 deficiency and inhibition of ERK1/2 activation, p38 is not involved at later timepoints.

### **DISCUSSION**

In this manuscript, we have described the application of a newly developed algorithm for automated image analysis in the investigation of macrophage spreading in response to TLR stimulation. The rapid acquisition of quantitative and reliable data from microscopy-based assays of changes in cell size and adherence is necessary for the comprehensive investigation of the typically relatively large number of candidate genes identified in systems approaches like transcriptome or proteomic screens. We report here that the algorithm developed, a combination of nuclei separation filter based on a watershed segmentation with subsequent cell segmentation by fast marching level set (Held et al., 2011), performed very well in resting macrophages and early after stimulation; at later timepoints an increase in variation was observed. However, our comparison of the automated algorithm with manually corrected annotations showed that overall very similar results were obtained across several experiments investigating the effects of perturbations in TLR signaling on macrophage spreading. Thus, at least for the relatively strong effects on macrophage spreading observed here, the two-step segmentation method presented and tested here can be used without the need for manual editing of the data, leading to a tremendous reduction in the time and labor required to obtain quantitative data on macrophage spreading.

Hence, using this method, medium and high-throughput analysis of macrophage spreading appears feasible. For the investigation of more subtle differences, and for validation of effects found with the automated analysis tool, manual inspection and editing of the annotation may be required. Of note, even with such a semi-automated method of combining tool-based annotation with manual editing the processing time of the microscopy data is reduced by a factor of two to three compared to manual annotation of cells, thereby enabling the investigation of much larger data sets in a reasonable time frame. Taken together, we are convinced that the method described here represents a considerable technical advance and valuable addition to the toolbox required for quantitative, unbiased, and automatic image analysis of innate immune cells.

To further improve the performance of the automatic segmentation tools several issues should be addressed in future work. First, the quality of the input data, i.e., fluorescence microscopy pictures, in terms of intensity of staining and signal-to-noise ratio of cells versus slide background, appears to be the most critical parameter. We have compared the cytosolic dye CFSE, the membrane lipid stain PKH and APC-labeled anti-CD11b staining (data not shown) and obtained the best results with surface molecule staining by anti-CD11b antibody. The reduced accuracy of the annotation tool at later time points after stimulation may be related to changes in CD11b surface expression and/or redistribution of CD11b in different cellular compartments, creating internal maxima that are falsely recognized as cell borders (**Figure 2B,C**). Thus, staining of other surface markers with strong expression independent of the activation status could lead to enhanced performance of the tool; possible surface markers on macrophages could be MHC-I, CD45, or Fc receptors. In addition, the use of fluorescent dyes yielding stronger signals for labeling of antibodies will be tried to increase signal-to-noise ratios. Another difficulty is the automatic segmentation and annotation of overlapping cells, which may be solved by further development of the segmentation algorithm.

TLR4-triggered macrophage spreading on Permanox slides induced by stimulation with LPS was observed first after 1 h and increased steadily until 8–24 h. The spreading response could be a consequence of direct TLR-driven signals causing actin polymerization and cytoskeletal rearrangement, or it could depend on indirect effects of TLR-induced secretion of cytokines, e.g., TNF or IL-1. We have not dissected these possibilities here; to do so will require the use of inhibitors of protein synthesis (e.g., Cycloheximide), transport and secretion (e.g., Brefeldin A), or specific interference with certain cytokines by using knockout macrophages. By comparing the kinetics of secretion of the cytokines TNF, IL-6, and IL-12p40 with the spreading response, we observe that only TNF is produced early enough to have a potential role in the initial spreading response. To assess the role of TNF in macrophage spreading, inhibitors of the metalloprotease Adam17 (also known as TACE) could be used.

The central adapter protein of TLR signaling, Myd88, is required for LPS-induced cytokine production (**Figure 4**) and for the late and enhanced increase in the contact area; however, the early spreading response at 2 h was surprisingly normal in Myd88−/− macrophages (**Figure 5**). To our knowledge, macrophage spreading of Myd88−/− macrophages has not been analyzed before. Since TLR4 utilizes in addition to Myd88 the adapter protein TRIF (Yamamoto et al., 2003; Weighardt et al., 2004), signaling via this pathway may be responsible for the early increase in contact area. Macrophages from mice deficient in Trif or in both Myd88 and Trif will be useful to formally test this possibility. In addition, to differentiate between direct effects of Myd88 on macrophage spreading and indirect effects via secreted cytokines, transfer of supernatants from WT macrophages onto Myd88−/− macrophages should be informative. The MAPK p38 and ERK1/2 are activated by TLR4 signaling and were therefore investigated for a role in LPS-induced macrophage spreading. An unexpected dichotomy in the requirement for p38 and ERK1/2 for early and late spreading responses, respectively, was found (**Figures 6** and **7**). The spreading response in cells treated with the MEK1 inhibitor PD184352 was very similar to that of Myd88−/− macrophages, with intact spreading at 2 h but a strongly reduced response after 8 and 24 h. Hence, it appears that ERK1/2 is the main kinase of the Myd88-dependent spreading response. A caveat to consider here is the limitation of pharmacological inhibitors: first, PD184352 did not completely suppress the initial ERK1/2 phosphorylation (although strongly diminishing basal and induced expression), and secondly, it may have additional effects independent of MEK1 inhibition that could contribute to the observed late inhibition of spreading. Therefore, a definitive experiment to test the role of ERK1/2 will require the use of macrophages deficient in these MAPK proteins. This genetic approach was used here for p38 MAPK, with apparently complete deletion of p38 after poly I:C injection. In the absence of p38, a reduced early, but intact late spreading response was found. It is worth mentioning that p38-deficient macrophages, similar to Myd88−/− macrophages, had a slightly reduced basal contact area (**Figures 5** and **7**), which could also indicate differences in macrophage differentiation in the absence of Myd88 and p38. To corroborate the p38-dependence of early LPS-induced spreading, specific pharmacological inhibition of p38 in future experiments will be helpful. Such experiments have already been performed in a macrophage cell line, showing that p38 induces integrin-dependent spreading of J774 macrophages via activation of the Ras-like GTPase Rap1 (Schmidt et al., 2001). As this early defect was not seen in Myd88-deficient macrophages, the logical implication of the data is that p38 activation should be independent of Myd88. In fact, the LPS-induced phosphorylation of p38 and ERK1/2 MAPK is only somewhat delayed in Myd88-deficient macrophages (Kawai et al., 1999).

**FIGURE 5 | Phenotype of Myd88−/− macrophages in cell spreading.** BMM from Myd88+/− and Myd88−/− mice were plated and stimulated as described in **Figure 3**. Macrophage spreading was analyzed by automated image analysis (left panel) and manual editing (right panel). Shown are mean and SEM from one representative experiment of two performed. LPS (closed symbols), media control (open symbol), Myd88+/− (circles), Myd88−/− (squares). Statistical significance refers to Myd88+/− compared to Myd88−/− genotypes. ns = not significant; \*\*\* *p* < 0.0001.

**FIGURE 6 |** … blot control for ERK1/2 and p38 phosphorylation. Cell lysates were taken at indicated time points (15 min to 24 h) from two individual experiments. Grb2: loading control. **(B)** Effect of pharmacological blockade of ERK1/2 activation on spreading. Macrophages were pre-incubated with the … Contact area data obtained by automated image analysis (left panel) and manual editing (right panel). Statistical significance refers to PD184352-treated compared to no inhibitor. ns = not significant; \*\*\* *p* < 0.0001.

**FIGURE 7 |** … control for p38 protein levels and phosphorylation of p38 and ERK1/2. Cell lysates were taken at indicated time points (15 min to 4 h). Grb2: loading … comparison of p38-deficient versus WT macrophages. ns = not significant; \*\*\* *p* < 0.0001.

Which adapters, GTPases and motor proteins are involved in the LPS-induced spreading response, and how are they controlled by the Myd88- and MAPK-dependent pathways described here? One established pathway from TLR4 to integrin activation is via activation of the focal adhesion kinase-related Pyk2 and the adapter protein paxillin (Williams and Ridley, 2000). Paxillin activation and cell spreading in response to LPS involves ERK1/2 dependent phosphorylation of Ser130 which is a prerequisite for GSK3-dependent phosphorylation of Ser126 (Cai et al., 2006). All members of the ERK1/2–GSK3–Pyk2–Paxillin module were found strongly phosphorylated after LPS at multiple sites in our published phosphoproteome analysis (Weintz et al., 2010). Of interest, paxillin and Pyk2 activation were only partially reduced in Myd88 deficient macrophages (Hazeki et al., 2003), consistent with their potential role in the intact initial spreading in Myd88-deficient macrophages observed here. The observation that Cdc42 and Rac GTPase activation proceeds independent of Myd88 early after LPS-stimulation (Kong and Ge, 2008) provides another potential mechanism for macrophage spreading. Clearly, while some of the players in macrophage spreading are well established, the regulated phosphorylation of more than 40 proteins with a Gene Ontology annotation of "actin-binding" or "cytoskeleton-binding" (Weintz et al., 2010) indicates that multiple proteins contribute to the changes in macrophage shape, contact area and motility after TLR stimulation. Elucidation of the functional role of these proteins will require siRNA knockdown experiments, macrophages from knockout mice and the use of specific pharmacological inhibitors. We believe that the method of semi-automatic measurement of macrophage spreading will greatly facilitate the timely, unbiased and quantitative investigation of these perturbations.

### **ACKNOWLEDGMENTS**

This project received funding from the Deutsche Forschungsgemeinschaft (SFB 643, TP A10; and SFB 796, B6, A4, and Z). The authors thank Katrin Jozefowski for mouse genotyping, and Matthias Engelbrecht for advice with statistics.

### **REFERENCES**

(Myo9b) controls cell shape and motility. *Proc. Natl. Acad. Sci. U.S.A.* 107, 12145–12150.

(2004). Enhanced dendritic cell antigen capture via toll-like receptor-induced actin remodeling. *Science* 305, 1153–1157.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 08 August 2011; accepted: 21 September 2011; published online: 18 October 2011.*

*Citation: Wenzel J, Held C, Palmisano R, Teufel S, David J-P, Wittenberg T and Lang R (2011) Measurement of TLR-induced macrophage spreading by automated image analysis: differential role of Myd88 and MAPK in early and late responses. Front. Physio. 2:71. doi: 10.3389/fphys.2011.00071*

*This article was submitted to Frontiers in Systems Physiology, a specialty of Frontiers in Physiology.*

*Copyright © 2011 Wenzel, Held, Palmisano, Teufel, David, Wittenberg and Lang. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

### *Gesham Magombedze<sup>1</sup>, Pradeep B. J. Reddy<sup>2</sup>, Shigetoshi Eda<sup>1,3</sup> and Vitaly V. Ganusov<sup>1,4,5</sup>\**

*<sup>1</sup> National Institute for Mathematical and Biological Synthesis, University of Tennessee, Knoxville, TN, USA*

*<sup>3</sup> Department of Forestry, Wildlife and Fisheries, Center for Wildlife Health, University of Tennessee, Knoxville, TN, USA*

*<sup>4</sup> Department of Microbiology, University of Tennessee, Knoxville, TN, USA*

*<sup>5</sup> Department of Mathematics, University of Tennessee, Knoxville, TN, USA*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Juilee Thakar, Yale University, USA; Rob J. De Boer, Utrecht University, Netherlands*

#### *\*Correspondence:*

*Vitaly V. Ganusov, Department of Microbiology, University of Tennessee, 1414 Cumberland Ave, Knoxville, TN 37996, USA e-mail: vitaly.ganusov@gmail.com*

Vertebrates are constantly exposed to pathogens, and adaptive immunity has most likely evolved to control and clear such infectious agents. CD4+ T cells are the major players in the adaptive immune response to pathogens. Following recognition of pathogen-derived antigens, naïve CD4+ T cells differentiate into effectors which then control pathogen replication either directly by killing pathogen-infected cells or by assisting with generation of cytotoxic T lymphocytes (CTLs) or pathogen-specific antibodies. Pathogen-specific effector CD4+ T cells are highly heterogeneous in terms of the cytokines they produce. Three major subtypes of effector CD4+ T cells have been identified: T helper 1 (Th1) cells producing IFN-γ and TNF-α, Th2 cells producing IL-4 and IL-10, and Th17 cells producing IL-17. How this heterogeneity is maintained and what regulates changes in effector T cell composition during chronic infections remains poorly understood. In this review we discuss recent advances in our understanding of CD4+ T cell differentiation in response to microbial infections. We propose that a change in the phenotype of pathogen-specific effector CD4+ T cells during chronic infections, for example, from Th1 to Th2 response as observed in *Mycobacterium avium* ssp. *paratuberculosis* (MAP) infection of ruminants, can be achieved by conversion of T cells from one effector subset to another (cellular plasticity) or due to differences in kinetics (differentiation, proliferation, death) of different effector T cell subsets (population plasticity). We also briefly review mathematical models aimed at describing CD4+ T cell differentiation and outline areas for future experimental and theoretical research.

**Keywords: CD4+ T cells, differentiation, plasticity, mathematical modeling, Johne's disease**

### **INTRODUCTION**

Adaptive immune responses are in general required for protection against many if not most pathogens. CD4+ T cells are the key component of adaptive responses to both intracellular and extracellular pathogens. The major function of CD4+ (helper) T cells is to provide help to other lymphocytes to mount an efficient immune response. By secreting appropriate cytokines and expressing a variety of co-stimulatory molecules, CD4+ T cells are required for the generation of high affinity antibody responses to pathogens and for the formation of long-lived plasma cells and memory B cells (Crotty, 2011). Although it is currently believed that CD4+ T cells are not needed for the generation of cytotoxic T lymphocyte (CTL) responses against many intracellular pathogens such as viruses (Wiesel and Oxenius, 2012), help from CD4+ T cells is required to generate memory CD8 T cells which are able to expand upon secondary exposure to the pathogen (Prlic et al., 2007). CD4+ T cells are in general needed to control chronic viral infections such as lymphocytic choriomeningitis virus (Zajac et al., 1998; Prlic et al., 2007; Zhang and Bevan, 2011). Recent evidence also suggests that CD4+ T cells could directly impact virus replication by killing virus-infected cells which express MHC-II molecules (Swain et al., 2012). By secreting a variety of cytokines, effector CD4+ T cells can also recruit other cells including neutrophils and monocytes to the sites of infection (Huber et al., 2012). CD4+ T cells are also involved in dampening immune responses either via the action of thymus-derived regulatory T cells (Tregs) or via production of anti-inflammatory cytokines such as IL-10 (Pot et al., 2011; Josefowicz et al., 2012).

How CD4+ T cells become activated, how they differentiate into effector cells, how the effector phenotype of CD4+ T cells is maintained, and whether the T cell effector phenotype can be changed to better control infections have been subjects of intensive research. In some circumstances, during progression of a chronic disease the efficient pathogen-specific CD4+ T cell response is lost and a pathological response leading to exacerbation of the disease arises. Such a "switch" occurs during *Mycobacterium avium* ssp. *paratuberculosis* (MAP) infection of cattle and sheep, where the initially dominant MAP-specific cellular response (T-helper 1, Th1) is lost over time, and a MAP-specific antibody response (Th2) appears as the disease reaches the clinical stage (Begg et al., 2011). In other circumstances, inappropriate responses arise following the first priming event. For example, exposure to allergens often leads to the generation of a CD4+ T cell response that results in allergic reactions (Th2) rather than in protective immunity (Th1) (Holt and Thomas, 2005).

*<sup>2</sup> Department of Pathobiology, University of Tennessee, Knoxville, TN, USA*

It is generally possible to bias differentiation of naïve CD4+ T cells into a particular effector T cell subset (e.g., Th1 or Th2) by providing appropriate environmental conditions. However, regulation of the phenotype of differentiated effector CD4+ T cells has proven to be more challenging. We propose that a change in the phenotype of pathogen-specific CD4+ effector T cells during a chronic infection or a chronic inflammatory condition can be achieved via two distinct mechanisms: "cellular" and "population" plasticity of T cell effectors. We illustrate how mathematical modeling has been used to understand factors driving naïve CD4+ T cell differentiation and plasticity of effector T cell responses in chronic infections.

### **CELLULAR AND POPULATION PLASTICITY OF CD4+ T CELL RESPONSES**

#### **T CELL DIFFERENTIATION**

Naïve CD4+ T cells differentiate into various subsets upon interaction with an antigen presented by professional antigen-presenting cells (APCs) such as dendritic cells (DCs). CD4+ T cells require three signals for their lineage commitment (Kenneth et al., 2008). The first signal is generated following the interaction between the T-cell receptor (TCR) and the peptide presented in the context of major histocompatibility complex (MHC) class II on an APC (Yamane and Paul, 2012). The second signal is generated following the interaction between the CD28 co-receptor on the T cell and the B7 family of co-stimulatory molecules such as CD80 or CD86 on the APC. The third signal is generated by inflammatory cytokines produced by the APC or other cells at the site of T cell activation. These cytokines direct differentiation of naïve CD4+ T cells into a particular effector subset. Effector CD4+ T cells can be categorized into three major subsets based on the type of cytokine they produce and the major transcription factor (TF) they express (**Figure 1**). If an APC secretes interleukin (IL)-12, naïve CD4+ T cells differentiate into Th1 effectors. Th1 effectors express the transcription factor T-bet and secrete the cytokines IFN-γ and TNF-α; these cells play an essential role in inhibiting replication of intracellular pathogens such as viruses (Hsieh et al., 1993; Lighvani et al., 2001; Kenneth et al., 2008). If an APC secretes IL-4, naïve CD4+ T cells differentiate into Th2 effectors. Th2 cells express the TF GATA-3 and secrete the cytokines IL-4, IL-5, and IL-13 (Le Gros et al., 1990; Eltholth et al., 2009); these cells are critical during infection by extracellular pathogens such as extracellular bacteria and helminths. In the presence of IL-6 and transforming growth factor (TGF)-β, naïve CD4+ T cells differentiate into Th17 cells.
Th17 cells express a transcription factor ROR-γt and produce cytokines IL-17 and IL-22 (Harrington et al., 2005; Ivanov et al., 2006); these cells are important for control of certain bacterial and fungal infections. Th1, Th2, and Th17 cells are considered to be the major effector CD4+ T cells (Mosmann et al., 1986; London et al., 1998; O'Garra, 1998; O'Garra and Arai, 2000; Yates et al., 2000; Murphy and Reiner, 2002; Chakir et al., 2003; Motiwala et al., 2006; Callard, 2007; Dong, 2008; Kenneth et al., 2008; Liao et al., 2011; Hong et al., 2012; Yamane and Paul, 2012).

**FIGURE 1 | Major pathways of naïve CD4+ T cell differentiation into effectors.** Upon encountering the antigens presented by the professional antigen-presenting cells (APCs) naïve CD4+ T cells differentiate into Th1, Th2, or Th17 effector cells. Cytokines present in the environment during differentiation play the major role in determining the phenotype that the CD4+ T cell will acquire. Two other CD4+ T cell subsets include regulatory T cells (Treg) and T follicular helper cells (Tfh). Due to cellular plasticity differentiated effector CD4+ T cells may convert from one type into another. For example, Th17 cells under strong polarizing conditions (e.g., high concentrations of IL-12) may convert into Th1 cells.

Two other subsets of CD4+ T cells have also been identified (**Figure 1**). Tregs express the TF FoxP3; these cells secrete anti-inflammatory cytokines like TGF-β and IL-10. Tregs maintain immune homeostasis by limiting the magnitude of the immune response against pathogens and by controlling inflammatory reactions (Sakaguchi, 2004). T follicular helper cells (Tfh) express the TF Bcl-6 and are essential for the production of high affinity IgG antibodies (Crotty, 2011). The existence of Th9 and Th22 subsets was also recently suggested (Veldhoen et al., 2008; Eyerich et al., 2009).

### **CELLULAR PLASTICITY**

It has been thought for a long time that differentiation of CD4+ T cells into various effector subsets is an irreversible event; CD4+ T cells that have differentiated into a particular subset cannot revert into a different subset (Mosmann and Coffman, 1989). However, recent studies suggest that effector T cells retain some degree of functional plasticity and these cells can change their effector phenotype (Murphy and Stockinger, 2010; O'Shea and Paul, 2010) (**Figure 1**). For example, recent reports have shown that both *in vitro* (Murphy et al., 1996) and *in vivo* (Panzer et al., 2012) generated Th1 cells can acquire the Th2 characteristics (**Figure 1**). Factors determining such *cellular plasticity* of CD4+ T cell effectors remain poorly understood. Experimental work suggests that plasticity of Th1 and Th2 subsets strongly depends on their differentiation state (Murphy et al., 1996) and that it is very difficult to reprogram the terminally differentiated subsets. For example, under some polarizing conditions Th2 cells cannot revert back to Th1 cells partly due to the loss of IL-12 receptor on these cells (Zhu and Paul, 2010). The definition "terminally differentiated CD4+ T cells" is very subjective, though. Long antigenic stimulation of naïve CD4+ T cells *in vitro* under either Th1 or Th2 polarizing conditions has been used as a surrogate for strong terminal differentiation. However, CD4+ T cells are rarely exposed to one polarizing cytokine environment *in vivo*. Recent work has also shown that Th1 cells are plastic; they can convert into Th2 cells in the presence of IL-4 (Szabo et al., 1995; Zhu and Paul, 2010). However, this conversion of the population of Th1 cells into Th2 effectors can also be explained by development of Th2 cells from naïve CD4+ T cells present in the Th1 cell population (Szabo et al., 1995). 
Recent studies also showed that the Th17 effector subset is unstable as compared to Th1 and Th2 effector cells, since Th17 cells can be reprogrammed to produce Th1 and Th2 cytokines (Lee et al., 2009). Furthermore, Tregs are plastic when cultured under Th1 (Oldenhove et al., 2009; Wei et al., 2009) or Th17 conditions (Yang et al., 2008a). Taken together, current data indicate that cellular plasticity of effector CD4+ T cell responses may be the rule rather than the exception (**Figure 1**). How such plasticity is regulated remains poorly understood, however. Epigenetics is now considered to be one of the key mechanisms that dictate the stability and cellular plasticity of effector T cell subsets (Wilson et al., 2009).

Cellular plasticity of Th1 cells *in vivo* was demonstrated during *Nippostrongylus brasiliensis* infection, in which the conversion of Th1 into Th2 cells was dependent on exogenous IL-4 (Panzer et al., 2012). Other work suggests that conversion of Th1 into Th2 cells may occur independently of IL-4 via STAT-5-coupled cytokine receptors (Zhu et al., 2003, 2004). Furthermore, IL-4-independent conversion of Th1 into Th2 cells driven by signaling via the Notch receptor was also reported (Amsen et al., 2004, 2007).

Cell heterogeneity is a factor that can partially explain the plastic nature of effector CD4+ T cell subsets (Zhu and Paul, 2010). Such heterogeneity may arise when effectors can produce more than one cytokine. For example, while Th1 cells can produce IFN-γ, IL-2, and TNF-α, only a few of these cells express all the cytokines simultaneously (Darrah et al., 2007). Data from *in vitro* experiments (Murphy et al., 1996) showed that naïve CD4+ T cells differentiate into Th2 cells when stimulated with antigen-loaded APCs in the presence of IL-4. However, even in such polarizing conditions a small percentage of cells in the cultures (4%) secrete IFN-γ. Similarly, in the presence of IL-12 and anti-IL-4 antibodies, only 80% of the cells were IFN-γ positive (Th1), and the remaining 20% could either be undifferentiated or be cells producing IL-4 (Th2). Interestingly, using IL-4 to re-stimulate these strongly polarized Th1 cells induces IL-4 production in at least 8% of the population. The source of these IL-4 producing cells is unclear as they could have been derived from the undifferentiated naïve CD4+ T cells or from Th1 effectors. Taken together, recent work suggests that the phenotype of pathogen-specific effector CD4+ T cells may change over the course of infection due to cellular plasticity of T helper subsets. Yet, factors that regulate the efficiency at which the conversion from one cell subset to another occurs are still poorly understood.

### **POPULATION PLASTICITY**

Population plasticity is another major mechanism that may contribute to the change in the dominant phenotype of effector CD4+ T cells during chronic infections. In this mechanism, the size of a population of T cell effectors can increase due to preferential proliferation or reduced death of cells in that population (**Figure 2**). Generally, T cells undergo apoptosis under various conditions such as cytokine deprivation (Cohen, 1993; Akbar et al., 1996), TNF-α level (Zheng et al., 1995), or repeated stimulation with specific antigen leading to activation-induced cell death (AICD) (Green and Scott, 1994; Kearney et al., 1994). Several reports suggest that acquired tolerance may involve selective loss of Th1 cells and persistence of Th2 cells (Burstein et al., 1992; De Wit et al., 1992). Additionally, Th1 cells were shown to be more sensitive to AICD than their Th2 counterparts (Ramsdell et al., 1994), which is likely due to a higher expression level of FasL in Th1 cells. The possibility of AICD of antigen-specific CD4+ T cell effectors during chronic infections has also been reported (Zhang et al., 1997). Once the majority of Th1 cells undergo apoptosis accompanied by the proliferation of Th2 cells (population plasticity), the few Th1 cells that remain in the heterogeneous population could convert to the Th2 subtype by epigenetic mechanisms (cellular plasticity). Population plasticity may be the major contributor to the change of the phenotype of the pathogen-specific T cells in chronic infections. Yet, data on the kinetics of proliferation and death of different subsets of effector CD4+ T cells during chronic infections are still lacking. Estimating the rates of proliferation, death, and re-differentiation of T effectors will lead to a better quantitative understanding of the factors regulating the size of antigen-specific T cell populations in many pathological conditions.
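As a minimal worked illustration of population plasticity (a sketch under our own simplifying assumptions, not a model taken from the studies cited above), suppose that Th1 and Th2 effectors, $T_1$ and $T_2$, proliferate at constant per-capita rates $\rho_i$ and die at constant rates $\delta_i$, with no conversion between subsets. The symbols and the numerical values below are hypothetical.

$$
\frac{dT_i}{dt} = (\rho_i - \delta_i)\,T_i \equiv r_i T_i, \qquad T_i(t) = T_i(0)\,e^{r_i t}, \qquad i = 1, 2.
$$

The fraction of Th2 cells among effectors is then

$$
f_2(t) = \frac{T_2(t)}{T_1(t) + T_2(t)} = \frac{1}{1 + \dfrac{T_1(0)}{T_2(0)}\,e^{-(r_2 - r_1)t}},
$$

so that if $r_2 > r_1$, Th2 cells become dominant ($f_2 > 1/2$) after

$$
t^{*} = \frac{\ln\left[T_1(0)/T_2(0)\right]}{r_2 - r_1}.
$$

For example, if Th1 cells initially outnumber Th2 cells 9:1 and the net growth rates differ by $r_2 - r_1 = 0.01$ per day, then $t^{*} = \ln 9 / 0.01 \approx 220$ days. Under these assumptions, even a small difference in proliferation or death rates is sufficient for a switch in the dominant phenotype, without conversion of any individual cell.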

#### **TH1/TH2 DYNAMICS IN CHRONIC INFECTIONS**

CD4+ T cell responses play a critical role in several chronic infections such as LCMV and HIV (Bevan, 2004; Wiesel and Oxenius, 2012; Streeck et al., 2013). The dynamics of pathogen-specific Th1 and Th2 responses have been studied during Johne's disease (JD), a mycobacterial infection with MAP (**Figure 3**). In early stages of MAP infection, Th1 cytokines such as IFN-γ, IL-2, and TNF-α are highly expressed in serum of infected animals (Burrells et al., 1999; Stabel, 2000a), and culture of blood samples with MAP antigens leads to expansion of the population of IFN-γ producing CD4+ T cells. Expression of IFN-γ and TNF-α drives differentiation of naïve CD4+ T cells into Th1 effectors while suppressing differentiation of T cells into Th2 effectors (Harris et al., 2007; Amsen et al., 2009) (**Figure 3**). The Th1 response, via the production of IFN-γ, plays a key role in controlling bacterial infection by promoting macrophage activation to kill intracellular bacteria and by upregulating MHC-II expression (Paludan, 1998). At later stages of MAP infection (clinical JD) infected animals shed a significant number of MAP in feces and produce a high level of anti-MAP serum antibody (Fecteau and Whitlock, 2010). Production of IFN-γ and IL-12 is generally reduced in cows with clinical JD (Stabel, 1996, 2000a; Burrells et al., 1999) whereas expression of a Th2 cytokine (IL-4) is elevated (Sweeney et al., 1998). IL-4 suppresses IFN-γ induced macrophage activation (Paludan, 1998) and inhibits autophagy-mediated killing of intracellular mycobacteria (Harris et al., 2007). These experimental findings suggest that during disease progression in MAP-infected animals there is a switch from the initially dominant MAP-specific cellular (Th1) response to the antibody (Th2) response (Stabel, 2000b).

**FIGURE 2 | Population plasticity of effector CD4+ T cells in chronic infections.** During an acute phase of infection, naïve CD4+ T cells differentiate into a heterogeneous population consisting mainly of Th1 cells and a few Th2 cells. However, as the disease progresses into a chronic phase, there is a gradual loss of Th1 cells and accumulation of Th2 cells. Accumulation of Th2 cells may occur due to a higher proliferation rate/reduced death rate of Th2 cells than that of Th1 cells.

What regulates the dynamics of this switch remains poorly understood, however. There are two possibilities: (1) the Th1/Th2 switch is the cause of disease progression and death of the infected animal, or (2) the Th1/Th2 switch is the consequence of disease progression which occurs independent of whether T-helper responses are present or not. How exactly the Th1 response is lost and the Th2 response arises is also unknown. In particular, the relative contribution of cellular vs. population plasticity of CD4+ T cell responses (**Figures 1**, **2**) to the kinetics and likelihood of the Th1/Th2 switch in MAP-infected animals is not known. The issue is further complicated by the results of longitudinal studies on experimental infection of sheep with MAP that showed that the timing of the Th1-Th2 switch varies between individual animals and that the Th1 response [IFN-γ] may stay high even in late stages of MAP infection (Begg et al., 2011; Stabel and Robbe-Austerman, 2011).

The prevalence of apparently non-protective Th2 responses during a chronic infection also occurs during leprosy caused by *Mycobacterium leprae* in humans. Similar to the MAP infection, leprosy is thought to be a dynamic process with changes in bacteria-specific cellular immune responses leading to clinical manifestations. *M. leprae* infects macrophages and their activation is a critical step for clearing the bacterial infection. When the infected macrophages are inactive, *M. leprae* evades the cellular immune response and replicates inside the cell until the cell bursts. Without any external signal, macrophages are unable to mount any significant response to the bacteria, and the infection spreads largely unchecked. Macrophages are generally activated by IFN-γ-producing Th1 cells. Activated macrophages are more likely to kill intracellular bacteria by facilitating fusion of lysosomes with bacteria-harboring phagosomes (Kenneth et al., 2008). Patients with tuberculoid leprosy show very few lesions, which are dominated by IFN-γ, and very few bacteria can be recovered from the lesions. In the case of lepromatous leprosy, the infection is not contained, and there is a dominance of Th2 cell cytokines and elevated levels of anti-*M. leprae* antibodies in serum (Modlin, 1994). Reversal of the cytokine pattern from Th2 to Th1 was reported during the shift from lepromatous leprosy to the tuberculoid stage following administration of either IL-12 or IFN-γ to lepromatous patients (Modlin, 1994). The exact mechanisms by which such a therapy resulted in clearance of the pathogen from lesions remain poorly understood, but they may involve direct suppression of Th2 cell differentiation by IFN-γ, and therefore could arise due to population plasticity of CD4+ T cell responses (Modlin, 1994; Misra et al., 1995).

Modulation of the pathogen-specific effector T-helper responses has also been demonstrated in the case of leishmaniasis, a disease caused by infection with the protozoan *Leishmania major*. This parasite causes cutaneous leishmaniasis in mice and humans. Infection of mice with a low parasite dose leads to parasite containment associated with a Th1 type response, whereas infection with a high parasite dose leads to progressive disease associated with a Th2/antibody response (Menon and Bretscher, 1998). Similarly, humans with localized cutaneous leishmaniasis (LCL) display few lesions and the growth of the parasite is confined to the lesions. During diffuse cutaneous leishmaniasis (DCL) the lesions are widely disseminated with an uncontrolled growth of the parasite. Th1 cytokines are dominant in LCL; they help in the elimination of the infection. However, in the case of DCL, the prevalence of Th2 cytokines leads to uncontrolled growth of the parasite (Castellano et al., 2009). Whether the switch from the dominant Th2 response to the protective Th1 response in chronic infection is possible remains unclear, but it has been shown that clinical cure of patients with leishmaniasis occurs concomitantly with the loss of Th2 effectors and persistence of Th1 cells from the acute to the chronic stage of the disease (Castellano et al., 2009).

### **MATHEMATICAL APPROACHES IN MODELING CD4+ T CELL DIFFERENTIATION**

There have been many mathematical studies aimed at improving our understanding of mechanisms regulating T cell differentiation. Studies on mathematical modeling of Th1/Th2 responses can be categorized into three main subgroups.

The first subgroup of studies developed and analyzed mathematical models of differentiation of naïve CD4+ (Th0) cells into Th1 and Th2 subsets by including the dynamics of Th1/Th2 cytokines, intracellular molecules, and gene regulatory networks (Biedermann and Röcken, 1999; Fishman and Perelson, 1999; Yates et al., 2000, 2004; Bergmann et al., 2001, 2002; Richter et al., 2002; Bettelli et al., 2006; Callard, 2007; Fenton et al., 2008; Eftimie et al., 2010; Naldi et al., 2010; Vicente et al., 2010; Groß et al., 2011; Hong et al., 2011, 2012; Liao et al., 2011). Some of these models described differentiation of naïve CD4+ T cells into different effector T cell subsets via upregulation of the phenotype-specific TFs (master regulators) such as T-bet, GATA-3, FoxP3, and ROR-γt (Höfer et al., 2002; Mariani et al., 2004; Yates et al., 2004; Callard, 2007; Van Den Ham and De Boer, 2008; Hong et al., 2011, 2012). These studies explained how positive and negative feedback loops between these master regulators result in differentiation of a particular subset of T effectors. Cytokines that are present in the extracellular environment and are produced by effector T cells strongly influence the direction of naïve T cell differentiation. Signals provided by cytokines binding to cytokine receptors and by antigens binding to the T cell receptors are summarized internally and eventually determine the direction of cell differentiation. Some of the predictions of these mathematical models found confirmation in experimental papers (Zheng and Flavell, 1997; Chakir et al., 2003; Ivanov et al., 2006; Yang et al., 2008b; Liao et al., 2011; van den Ham et al., 2013). Further advances in understanding of T cell differentiation have been obtained using curated Boolean network models which included the dynamics of multiple genes in T cells such as those encoding cytokines and cytokine receptors (Mendoza, 2006; Thakar et al., 2007; Kim et al., 2008; Santoni et al., 2008; Pedicini et al., 2010).
Such multi-scale models capture communications between cells via cytokines and integrate intra- and extracellular dynamics of such signaling molecules (Santoni et al., 2008; Pedicini et al., 2010). Virtual deletion experiments of the key master regulators have been used to predict factors (e.g., TF, cytokines, or cytokine receptors) influencing differentiation of cells toward either Th1 or Th2 phenotype (Pedicini et al., 2010).
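The role of positive and negative feedback between master regulators can be illustrated with a deliberately small sketch. The two-variable circuit below is our own toy example and is not taken from any of the cited models: `x` and `y` stand for the levels of T-bet and GATA-3, each factor activates itself and represses the other, and `s_x` and `s_y` are hypothetical Th1- and Th2-polarizing inputs (e.g., IL-12-like and IL-4-like signals). The functional forms and the parameter `a` are assumptions chosen only to make the feedback structure explicit.

```python
def step(x, y, s_x, s_y, a=3.0, dt=0.01):
    """One forward-Euler step of a toy T-bet (x) / GATA-3 (y) circuit.

    Each factor is produced in response to an external signal (s_x or s_y)
    and to its own level (positive feedback), is repressed by the other
    factor (negative feedback), and decays at unit rate.
    All functional forms and parameter values are illustrative assumptions.
    """
    dx = (s_x + a * x**2 / (1 + x**2)) / (1 + y**2) - x
    dy = (s_y + a * y**2 / (1 + y**2)) / (1 + x**2) - y
    return x + dt * dx, y + dt * dy


def simulate(x, y, s_x, s_y, t_end=50.0, dt=0.01):
    """Integrate the circuit from (x, y) under constant signals."""
    for _ in range(int(round(t_end / dt))):
        x, y = step(x, y, s_x, s_y, dt=dt)
    return x, y


naive = (0.05, 0.05)  # low levels of both factors

# Naive cell under a Th1-polarizing or a Th2-polarizing signal.
th1 = simulate(*naive, s_x=1.0, s_y=0.0)
th2 = simulate(*naive, s_x=0.0, s_y=1.0)

# Differentiated Th1 cell later exposed to the Th2-polarizing signal.
th1_rechallenged = simulate(*th1, s_x=0.0, s_y=1.0)

for label, (x, y) in [("Th1 conditions", th1),
                      ("Th2 conditions", th2),
                      ("Th1 cell in Th2 conditions", th1_rechallenged)]:
    print(f"{label}: T-bet={x:.2f}, GATA-3={y:.2f}")
```

In this sketch the same Th2-polarizing signal drives a naïve cell to a GATA-3-high state but leaves an already differentiated T-bet-high cell in the T-bet-high state, because self-activation and cross-repression maintain the earlier decision. Whether and how strongly real effector cells resist repolarization depends on details that such a toy circuit does not capture.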

The second subgroup of studies modeled population plasticity of Th1/Th2 cell responses. These models included the processes of cross-regulation of Th1/Th2 cell responses either directly by cell-to-cell interactions or via production of Th1/Th2 cytokines (Fishman and Perelson, 1999; Yates et al., 2000, 2004; Bergmann et al., 2001, 2002; Fenton et al., 2008; Eftimie et al., 2010; Groß et al., 2011). Some of these models offered a theoretical explanation of the switch from an initially dominant pathogen-specific Th2 response to a later dominant Th1 response (*or vice versa*). These models, however, only focused on the dynamics of populations of CD4+ T cells and did not incorporate intracellular genetic and molecular networks that enable the cells to acquire different physiological states. For example, studies of Yates et al. (2000) and Bergmann et al. (2001) showed that when Th1 effectors fail to clear the antigen, the initially dominant Th1 response is lost and a Th2 response arises. In the Bergmann et al. (2001) model, the shift in dominance of effector T cell populations is regulated by differences in differentiation, cross-suppression, and clonal expansion of each subset as a function of the antigen concentration. In the Yates et al. (2000) model, dominance of a particular effector T cell subset is driven by the level of Th1/Th2 cytokines. The latter model also investigated how the population dynamics of T-helper responses are influenced by activation-induced cell death, which limits clonal expansion and hence aids in resolving the T cell balance. It should be noted, however, that few, if any, of the mathematical models in this subgroup have been developed to address the kinetics of effector T-helper responses during infections with biologically relevant pathogens.
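To make the idea of cross-regulation at the population level concrete, the following sketch integrates a generic two-population competition model. It is our own simplified example and does not reproduce the Yates et al. (2000) or Bergmann et al. (2001) models: `t1` and `t2` are scaled sizes of the Th1 and Th2 populations, `r1` and `r2` are hypothetical net growth rates (proliferation minus death), and `c` is an assumed strength of cross-suppression relative to self-limitation. Antigen and cytokines are not modeled explicitly.

```python
def simulate(r1, r2, c=2.0, t1=0.5, t2=0.05, t_end=200.0, dt=0.01):
    """Forward-Euler integration of a toy Th1/Th2 competition model.

    t1, t2: scaled sizes of the Th1 and Th2 populations.
    r1, r2: net per-capita growth rates (proliferation minus death).
    c: strength of cross-suppression relative to self-limitation.
    All values are illustrative assumptions, not estimates from data.
    """
    for _ in range(int(round(t_end / dt))):
        d1 = t1 * (r1 - t1 - c * t2)  # Th1 growth, limited by Th1 and Th2
        d2 = t2 * (r2 - t2 - c * t1)  # Th2 growth, limited by Th2 and Th1
        t1, t2 = t1 + dt * d1, t2 + dt * d2
    return t1, t2


# Equal kinetics: the initially dominant Th1 response is maintained.
stable = simulate(r1=1.0, r2=1.0)

# Reduced Th1 net growth (for example, higher sensitivity to AICD):
# the response switches to Th2 from the same starting point.
switched = simulate(r1=0.4, r2=1.0)

print("equal kinetics:     Th1=%.3f, Th2=%.3f" % stable)
print("reduced Th1 growth: Th1=%.3f, Th2=%.3f" % switched)
```

With strong cross-suppression (`c` greater than 1) and equal kinetics, the subset that dominates early suppresses the other and remains dominant. In this sketch a switch in dominance therefore requires a change in the kinetics of one subset, here represented by a lower net growth rate of Th1 cells.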

The third subgroup of studies modeled cellular plasticity of effector CD4+ T cell responses. Mathematical models of this subgroup predict reversible phenotypic plasticity between effector Th17 cells and induced regulatory T cells (iTregs) and reprogramming of Th2-polarised cells to the Th1 phenotype in Th1-polarising conditions (Naldi et al., 2010; Pedicini et al., 2010; Carbo et al., 2013). A typical example of such mathematical models is the work by Pedicini et al. (2010), which predicted master transcription regulators as attractors associated with development of Th1 and Th2 cells using a cytokine network model. This modeling study makes testable predictions on the mechanisms that regulate the balance between Th1 and Th2 cells and how loss of this balance can skew lineage selection. *In silico* virtual knockout experiments of GATA-3 predicted creation of attractors with high expression of IFN-γ. Furthermore, deletion of both T-bet and GATA-3 predicted an increase in expression of several other non-specific Th2 TFs such as IRF4, MAF, NFAT, STAT1, and STAT6. Although models in this subgroup often generate novel predictions, these models are in general very complex, involving description of tens of genes and their products. Predictions of these models will need to be tested in specifically designed experiments.
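The logic of attractor analysis and virtual knockouts can be shown on a four-node Boolean network. The network below is a toy example of our own; it is far smaller than, and not equivalent to, the network of Pedicini et al. (2010). The update rules (self-activation of T-bet and GATA-3, mutual repression, and cytokine feedback) are assumptions, and for brevity only steady states (fixed points) under synchronous updating are reported; cyclic attractors are ignored.

```python
from itertools import product

NODES = ("TBET", "GATA3", "IFNG", "IL4")


def update(state, knockout=()):
    """Synchronously update all nodes of the toy network.

    state: tuple of booleans ordered as NODES.
    knockout: names of genes forced to stay off (virtual deletion).
    The rules are illustrative assumptions, not a curated network.
    """
    s = dict(zip(NODES, state))
    nxt = {
        "TBET": (s["TBET"] or s["IFNG"]) and not s["GATA3"],
        "GATA3": (s["GATA3"] or s["IL4"]) and not s["TBET"],
        "IFNG": s["TBET"],
        "IL4": s["GATA3"],
    }
    for gene in knockout:
        nxt[gene] = False
    return tuple(bool(nxt[n]) for n in NODES)


def fixed_points(knockout=()):
    """Return the steady states as sets of active nodes."""
    found = []
    for state in product((False, True), repeat=len(NODES)):
        if any(state[NODES.index(g)] for g in knockout):
            continue  # a deleted gene can never be on
        if update(state, knockout) == state:
            found.append({n for n, on in zip(NODES, state) if on})
    return found


wild_type = fixed_points()
gata3_ko = fixed_points(knockout=("GATA3",))
print("wild type:", wild_type)
print("GATA3 knockout:", gata3_ko)
```

In this toy network the wild-type steady states correspond to an undifferentiated state, a Th2-like state (GATA-3 and IL-4 on), and a Th1-like state (T-bet and IFN-γ on). After virtual deletion of GATA-3 the Th2-like state disappears and the only remaining differentiated steady state expresses IFN-γ, which mirrors qualitatively the type of prediction described above.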

### **DISCUSSION**

Discovery of several novel subsets of effector CD4+ T cells including Th17 and Tfh cells rejuvenated interest in factors that influence differentiation of naïve CD4+ T cells into effectors and the stability of different effector CD4+ T cell subsets both *in vitro* and following immune responses to antigens *in vivo*. One of the most intriguing observations is that even differentiated effector CD4+ T cells can change their phenotype if the environmental conditions change (Murphy and Stockinger, 2010). Factors that regulate such *cellular plasticity* of effector and memory CD4+ T cell responses still remain incompletely defined, and how and whether such plasticity can be explored therapeutically is unknown.

In a number of conditions including infections, autoimmune diseases, and allergic reactions, the host generates an effector CD4+ T cell response of inadequate phenotype that may lead to worsening of symptoms and often to exacerbation of the disease. In particular, during MAP infection of ruminants the initially protective Th1 CD4+ T cell response is lost over time, and a non-protective Th2 response arises (Stabel, 1996, 2000a; Burrells et al., 1999). What regulates this change in the immune response phenotype is unclear. Conversion of MAP-specific Th1 cells into Th2 cells over time (cellular plasticity) could be one potential mechanism. Alternatively, there may be quantitative differences in the rates of differentiation of naïve CD4+ T cells into the two subsets of effectors, or differences in the rates of proliferation, death, and migration of different subsets of CD4+ T cells to the site of infection (population plasticity). Finally, the phenotype switch could be driven by other helper cell types, for example, thymus-derived Tregs or periphery-induced Tregs. Experimentally, it will be a challenge to discriminate between these alternative mechanisms of the Th1/Th2 switch during JD. As for other conditions (e.g., allergic reactions), mechanisms driving the change in phenotype of allergen-specific CD4+ T cell effectors following immunotherapy remain to be determined (Holt and Thomas, 2005; van Oosterhout and Motta, 2005). We believe that one of the important experimental challenges is to evaluate the rates at which different effector T-helper cell subsets proliferate and die during chronic inflammatory conditions (e.g., infections) and whether these rates are influenced by the type of inflammatory environment.

Many mathematical models on CD4+ T cell differentiation have been developed and analyzed. The vast majority of these models focus on the initial differentiation step of naïve CD4+ T cells into a particular effector subset. Such models are useful for the vaccine development where induction of an appropriate CD4+ T cell response will be critical for the vaccine efficacy. The discovery of cellular plasticity of effector CD4+ T cells calls for the need to develop novel mathematical models that explain and predict how one T cell subset is converted into another subset. The use of gene expression and phenotypic data from *in vitro* and *in vivo* generated effector CD4+ T cells will be instrumental for testing and verifying such mathematical models.

Mathematical models have also been developed to explain population plasticity of effector T cell responses. These models are more relevant to chronic conditions such as persistent infections and autoimmune diseases. Yet, most of these models have been poorly parameterized and predictions of such models have not been adequately tested in well-designed experiments. More experimental data is needed to explain how proliferation, death, and differentiation of effector T cells are influenced by the environment and the subsets themselves. Also, data on the dynamics of effector T cells at the sites of infection will be useful for the development of models for specific infections. In all cases, development of quantitative mathematical models can be greatly enhanced by closer collaborations between mathematicians/modelers and wet-lab experimentalists.

### **ACKNOWLEDGMENTS**

The authors acknowledge the support of the National Institute for Mathematical and Biological Synthesis (NIMBioS), an Institute sponsored by the National Science Foundation, the U.S. Department of Homeland Security, and the U.S. Department of Agriculture through NSF Award #EF-0832858, with additional support from The University of Tennessee and the University of Tennessee Institute of Agriculture grant to Shigetoshi Eda and Vitaly V. Ganusov.

### **REFERENCES**

pathways for the generation of pathogenic effector TH17 and regulatory T cells. *Nature* 441, 235–238. doi: 10.1038/nature04753


278, 157–169. doi: 10.1016/S0022- 175900200-X


*Eur. J. Immunol.* 42, 2238–2245. doi: 10.1002/eji.201242619



helper T cells. *Nat. Rev. Immunol.* 2, 933–944. doi: 10.1038/nri954


CD8 T-cell priming, memory generation and maintenance. *Curr. Opin. Immunol.* 19, 315–319. doi: 10.1016/j.coi.2007.04.010


*Immunol.* 42, 1080–1088. doi: 10.1002/eji.201142205


necessary and sufficient for Th2 cytokine gene expression in CD4 T cells. *Cell* 89, 587–596. doi: 10.1016/S0092-867480240-8


in T(H)1-T(H)2 responses. *Nat. Immunol.* 5, 1157–1165. doi: 10.1038/ni1128

Zhu, J., and Paul, W. E. (2010). Heterogeneity and plasticity of T helper cells. *Cell Res.* 20, 4–12. doi: 10.1038/cr.2009.138

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 27 May 2013; accepted: 21 July 2013; published online: 16 August 2013. Citation: Magombedze G, Reddy PBJ, Eda S and Ganusov VV (2013) Cellular and population plasticity of helper CD4*+ *T cell responses. Front. Physiol. 4:206. doi: 10.3389/fphys.2013.00206*

*This article was submitted to Frontiers in Systems Biology, a specialty of Frontiers in Physiology.*

*Copyright © 2013 Magombedze, Reddy, Eda and Ganusov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Mathematical and statistical modeling in cancer systems biology

### **Rachael Hageman Blair<sup>1</sup>\*, David L. Trichler<sup>1,2</sup> and Daniel P. Gaille<sup>1</sup>**

<sup>1</sup> Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY, USA

<sup>2</sup> Department of Biostatistics, University of Toronto, Toronto, ON, Canada

#### **Edited by:**

Kumar Selvarajoo, Keio University, Japan

#### **Reviewed by:**

Amin Mazloom, University of Texas at Arlington, USA

Shawn Gomez, The University of North Carolina at Chapel Hill, USA

#### **\*Correspondence:**

Rachael Hageman Blair, Department of Biostatistics, State University of New York at Buffalo, Kimball Tower, Room 709, 3435 Main Street, Buffalo, NY 14214, USA. e-mail: hageman@buffalo.edu

Cancer is a major health problem with high mortality rates. In the post-genome era, investigators have access to massive amounts of rapidly accumulating high-throughput data in publicly available databases, some of which are exclusively devoted to housing cancer data. However, data interpretation efforts have not kept pace with data collection, and gained knowledge is not necessarily translating into better diagnoses and treatments. A fundamental problem is to integrate and interpret data to further our understanding in Cancer Systems Biology. Viewing cancer as a network provides insights into the complex mechanisms underlying the disease. Mathematical and statistical models provide an avenue for cancer network modeling. In this article, we review two widely used modeling paradigms: deterministic metabolic models and statistical graphical models. The strength of these approaches lies in their flexibility and predictive power. Once a model has been validated, it can be used to make predictions and generate hypotheses. We describe a number of diverse applications to Cancer Biology, including the system-wide effects of drug treatments, disease prognosis, tumor classification, forecasting treatment outcomes, and survival predictions.

**Keywords: cancer, metabolism, ODEs, steady-state, dynamic, graphical models, high-throughput data**

### **MATHEMATICAL AND STATISTICAL MODELING IN CANCER SYSTEMS BIOLOGY**

Over the last half century, tremendous progress has been made in understanding the genetic and biochemical mechanisms underlying cancer. Despite these advances, cancer remains a major health problem that is responsible for one in every four adult deaths (Siegel et al., 2011). High mortality rates indicate that this knowledge is not translating into effective cancer treatments (Lord and Ashworth, 2010). Chemotherapy was discovered in chemical warfare during World War I; it was first used to treat cancer in the 1940s, when little was understood about the disease (Goodman et al., 1946), and remains the most common form of treatment for most types of cancer. Chemotherapy drugs target rapidly dividing cells; as a result, normal tissues with high growth rates suffer and patients often experience adverse and sometimes deadly side effects.

Over the past 15 years, drugs have emerged that target cancer metabolism, either directly through enzymes that facilitate metabolic reactions or indirectly through signaling pathways (Zhukov and Tjulandin, 2007; Heiden, 2011). Targeted therapy is typically less damaging to normal cells than chemotherapy. However, cancer cells are extremely robust: they are often completely insensitive to perturbations, or they develop resistance over time. Drug resistance occurs when non-targeted genes or proteins kick in to *rescue* the cancer cell by rerouting growth requirements through alternative mechanisms and pathways. Drug resistance is a major limitation of targeted therapies. For this reason, targeted therapies are most effective when used in combination with chemotherapy treatments. It is becoming apparent that, in order to develop effective targeted therapies that overcome resistance, the drug development paradigm will have to shift from single molecular targets to pathways (Astsaturov et al., 2010; Thangue and Kerr, 2011). Systems biology approaches will play a pivotal role in the development of drugs that do not succumb to resistance.

Mathematical models of complex biological systems are central to systems biology. They can be used as an exploratory tool to complement and guide experimental work. Simulations, known as *in silico* experiments, can be performed with mathematical models to validate hypotheses and make predictions about quantities that are difficult or impossible to measure *in vivo*. Predictions can provide much-needed insight into the pathways driving cancer progression, and the robust compensatory mechanisms that protect cancer cells from drug intervention. Model simulations can be used to predict the system-wide effects of molecular targets, e.g., determine the effects of molecular target(s) inhibition in specific populations. They can also serve as an important clinical tool, e.g., classify benign and malignant tumors, predict disease prognosis for individual patients, and predict outcomes of treatments.

High-throughput technologies offer the capability to simultaneously measure tens of thousands of molecular targets per sample. As costs steadily decline, the number of *omics* datasets characterizing the genome, proteome, and metabolome continues to grow. A number of publicly available resources have been developed to house data and functional annotation. These resources can be queried and have enabled scientists to better leverage omics-based research efforts. To illustrate the size of such databases, as of March 2012, Gene Expression Omnibus (GEO) contained data from 9,919 platforms, 710,229 samples, 28,873 series, and 2,720 manually curated datasets (Barrett and Edgar, 2006). The Progenetix database houses data from Comparative Genomic Hybridization (CGH) experiments that focus on copy number abnormalities in human cancer (Baudis and Cleary, 2001). The Cancer Genome Atlas (TCGA) contains the results of subjecting patient samples from a variety of cancer subtypes to a battery of common high-throughput assays such as gene expression, array comparative genomic hybridization (aCGH), SNP genotyping, methylation profiling, microRNA profiling, and some exon sequencing platforms (Collins and Barker, 2008). The Sanger Cancer Genome Project has generated a cancer gene census (Futreal et al., 2004), a catalog of somatic mutations in cancer (Forbes et al., 2010), as well as several bioinformatic resources born out of the interrogation of cancer cell lines.

The wealth of publicly available data offers an exciting opportunity to study cancer as a complex network. We are currently in an era where collecting data in a high-throughput fashion is the norm. However, our ability to interpret this data for knowledge and discovery has not kept pace with the data collection efforts. Importantly, this message was echoed in NCI's recent funding opportunity addressing *provocative questions*, which pose game-changing scientific questions to drive progress against cancer (RFA-CA-11-01; Varmus and Harlow, 2012). A series of questions were posed to inspire investigators to "... step back from the momentum of these discoveries and make sure we have left no stone unturned and no important but perhaps not obvious question left unexplored." Provocative question 17 asks the following: "Since current methods to assess potential cancer treatments are cumbersome, expensive and often inaccurate, can we develop other methods to rapidly test interventions for cancer treatment or prevention?" Mathematical models serve as a link between experimental and computational biology, and can be used to address this question. Specifically, they can serve as a tool to drive experimental advances in terms of prediction, classification, and hypothesis generation.

In this article, we describe two complementary and widely used modeling paradigms: deterministic models of cellular systems and graphical modeling. Deterministic models of cellular metabolism are constructed in a *bottom-up* approach from known stoichiometry, principles of mass balance, and physiological constraints, whereas graphical models are inferred from the data using linear statistical models in a *top-down* approach. These approaches offer vastly different perspectives on network behavior and have been instrumental for systems biology. We review the fundamentals of these modeling paradigms and highlight applications of models that have been developed to advance Cancer Systems Biology.

### **DETERMINISTIC MODELS OF CELLULAR METABOLISM AND CELL SIGNALING**

Cancer cells exhibit profound alterations to their metabolic and signaling pathways. Many drugs that are either available or in the development phase target proteins or enzymes in these pathways in an effort to slow or halt cancer growth (Bates et al., 2012). Cell proliferation, motility, and survival are tightly controlled in normal cells. However, adjustments in cancer cell signaling enable proliferation independent of exogenous signals, disrupt apoptosis, and elicit tumor angiogenesis and metastasis to surrounding tissues and vessels (Johnstone et al., 2002; Martin, 2003). Unlike their normal counterparts, cancer cells use aerobic glycolysis instead of oxidative phosphorylation for energy production (Warburg, 1956). Glutamine is central to cancer cell protein and nucleotide biosynthesis, and replenishes the TCA cycle for anabolic processes (Lu et al., 2010). Fatty acid biosynthesis occurs at high rates, and most fatty acids are produced *de novo* regardless of nutrition (Medes et al., 1953; Ookhtens et al., 1984). These metabolic and signaling signatures are common to most forms of cancer.

Ordinary differential equations (ODEs) represent the most widely used approach for modeling cellular dynamics. The underlying assumption is that reactions occur under well-mixed conditions and that the abundance of reactants is not too low. The differential equations are derived from laws of mass balance and describe the rate of change of a species (*dC*/*dt*) in terms of production and utilization, i.e., *dC*/*dt* = production − utilization (**Figure 1A**). In many cases, the stoichiometry of pathways is well understood, and the topology of the system can be modeled easily with ODEs (Ogata et al., 1999; Matthews et al., 2009). However, the underlying processes, e.g., reaction fluxes and transport rates, rely on parameters that are often unknown and require challenging underdetermined estimation from time course data (**Figure 1B**; Erguler and Stumpf, 2011). Another challenge is that these systems can often exhibit sharp transients on different time scales (stiffness), which requires computationally intensive numerical integration (Shampine et al., 2003; MacLachlan et al., 2007). These factors ultimately limit the scale of dynamic models. Consequently, they are used to investigate small subsets of reactions and pathways.
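As a minimal sketch of the mass-balance form *dC*/*dt* = production − utilization, the following Python snippet integrates a single hypothetical species with constant production and first-order utilization. The rate constants `K_IN` and `K_OUT`, the function names, and the choice of a fixed-step Runge-Kutta integrator are assumptions made for illustration; they are not taken from any model discussed here, and a stiff system would call for a dedicated stiff solver instead.

```python
import math

# Hypothetical rate constants, chosen only for illustration.
K_IN = 2.0    # zero-order production rate (concentration / time)
K_OUT = 0.5   # first-order utilization rate constant (1 / time)

def dC_dt(C):
    """Mass balance: rate of change = production - utilization."""
    return K_IN - K_OUT * C

def integrate_rk4(f, C0, t_end, n_steps):
    """Integrate dC/dt = f(C) with the classical fourth-order Runge-Kutta scheme."""
    h = t_end / n_steps
    C = C0
    trajectory = [C]
    for _ in range(n_steps):
        k1 = f(C)
        k2 = f(C + 0.5 * h * k1)
        k3 = f(C + 0.5 * h * k2)
        k4 = f(C + h * k3)
        C += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        trajectory.append(C)
    return trajectory

trajectory = integrate_rk4(dC_dt, C0=0.0, t_end=20.0, n_steps=200)
# Closed-form solution for C(0) = 0, used as a check on the integrator.
exact = (K_IN / K_OUT) * (1 - math.exp(-K_OUT * 20.0))
print(f"numerical C(20) = {trajectory[-1]:.6f}, exact = {exact:.6f}")
```

In this toy case the concentration relaxes toward `K_IN / K_OUT`, the level at which production and utilization balance.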

ODE models have been used extensively to examine the dynamic properties of cancer signaling pathways. A model of tumor suppressor p53 and oncogene Mdm2 revealed high variability in the oscillatory behaviors of cells following DNA damage (Geva-Zatorsky et al., 2006). NF-κB signaling plays a critical role in intracellular signaling, apoptosis, and resistance to chemotherapy. A computational model was used to distinguish the roles of NF-κB kinase isoforms, which regulate NF-κB through coordinated system dynamics (Hoffmann et al., 2002). Extensions of this model have been used to characterize feedback loops in the system and identify the activation of downstream pathways (Covert et al., 2005; Werner et al., 2005; Cheong et al., 2008). Several different mathematical models have been developed for the MAPK (mitogen-activated protein kinase) pathway (35 models between 1960 and 2005; Orton et al., 2005). Despite differences in detail and complexity, these models are able to explain the data and make insightful system-wide predictions about the pathway dynamics. Most of the differences between model outputs can be attributed to model boundaries and simplifications. This has been suggested to be a reflection of the robustness of ODE modeling and the biological system at hand (Orton et al., 2005).

Advances in high-throughput technologies have spurred the development of comprehensive genome-scale metabolic models (Oberhardt et al., 2009). These models have been developed through extensive curation of the data and literature. The metabolic system is described by hundreds of metabolic reactions, multiple compartments, and highly interconnected pathways. Constraint based analysis (CBA) has been used to investigate the *steady-state* behavior of these systems under a variety of conditions. In the steady-state, metabolites are stable and exhibit no change in concentration levels. Adopting this assumption reduces a complex dynamical system of ODEs to a linear system and eliminates the need for large-scale parameter estimation (**Figure 1C**).
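The reduction described above can be written compactly. The notation here is an assumption introduced for illustration: $S$ is an $m \times n$ stoichiometric matrix (metabolites by reactions), $\mathbf{C}$ the vector of metabolite concentrations, $\mathbf{v}$ the vector of reaction fluxes, and $\boldsymbol{\theta}$ the kinetic parameters.

```latex
\frac{d\mathbf{C}}{dt} = S\,\mathbf{v}(\mathbf{C}, \boldsymbol{\theta})
\quad \xrightarrow{\; d\mathbf{C}/dt \,=\, \mathbf{0} \;} \quad
S\,\mathbf{v} = \mathbf{0},
\qquad S \in \mathbb{R}^{m \times n}, \; \mathbf{v} \in \mathbb{R}^{n}
```

Once the fluxes themselves are treated as the unknowns, the kinetic parameters no longer appear, which is why large-scale parameter estimation is avoided.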

The purpose of steady-state analysis is to identify feasible flux values that satisfy the steady-state assumptions and maximize an objective function describing the physiological objectives of the cell (Lee et al., 2006). The solution space is bounded by system constraints, e.g., stoichiometric, thermodynamic, and enzyme capacity constraints. In single cell organisms, such as *Escherichia coli* and *Saccharomyces cerevisiae*, the cellular objective is to proliferate, and critical reactions and pathways are included in a biomass function, which is maximized (Edwards and Palsson, 2000; Förster et al., 2003). In these cases, optimizing cellular growth is analogous to maximizing the likelihood of survival. Defining cellular objectives is less straightforward in mammalian and human systems, which consist of a variety of interacting tissues and cells (Duarte et al., 2007; Livnat Jerby and Ruppin, 2010; Selvarasu et al., 2010). However, unlike normal cells, cancer cells are driven to proliferate and exhibit biomass requirements that can be leveraged in CBA modeling approaches. Recently, a genome-scale model has been used to characterize the Warburg effect in cancerous cells (Shlomi et al., 2011). The model was validated against the full panel of NCI-60 cancer cell lines, and provided novel insights into phases of metabolic behavior through cancer progression. A smaller model centered around a core set of critical enzymes and coding genes was used to predict novel drug targets (Folger et al., 2011).
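To make the idea of maximizing a biomass objective under steady-state and capacity constraints concrete, here is a toy sketch in Python. The four-reaction network, its bounds, and all names are hypothetical. The exhaustive search over integer fluxes is only a stand-in for the linear programming solver that a real constraint based analysis would use on a genome-scale model.

```python
from itertools import product

# Hypothetical toy network: uptake -> A, A -> B, A -> by-product, B -> biomass.
# Rows are metabolites (A, B); columns are fluxes (uptake, convert, waste, biomass).
S = [
    [1, -1, -1, 0],   # A: made by uptake, used by conversion and waste
    [0, 1, 0, -1],    # B: made by conversion, used by biomass
]
BOUNDS = [(0, 10), (0, 6), (0, 10), (0, 20)]   # assumed capacity constraints
OBJECTIVE = [0, 0, 0, 1]                       # maximize the biomass flux

def is_steady_state(v):
    """True if every metabolite is balanced, i.e. S v = 0."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)

def toy_fba():
    """Brute-force search over integer fluxes; a stand-in for a real LP solver."""
    best_v, best_value = None, float("-inf")
    grids = [range(lo, hi + 1) for lo, hi in BOUNDS]
    for v in product(*grids):
        if not is_steady_state(v):
            continue
        value = sum(c * x for c, x in zip(OBJECTIVE, v))
        if value > best_value:
            best_v, best_value = v, value
    return best_v, best_value

best_flux, best_biomass = toy_fba()
print(best_flux, best_biomass)
```

In this toy network the biomass flux is limited by the capacity of the conversion step rather than by uptake, which is the kind of bottleneck such an analysis is meant to expose.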

ODEs are the most popular modeling technique largely because of their simplicity. Several other modeling paradigms that vary in complexity have been applied to study cancer cellular metabolism, signaling, and response to treatment. Boolean models have been used to represent reactions as logical gates with two states: on and off (Lähdesmäki et al., 2003; Morris et al., 2010). Partial differential equations (PDEs) are significantly more complex than ODEs with respect to parameter estimation, because detailed information about spatial dynamics and interactions between components is required (Sleeman and Levine, 2001; Ribba et al., 2006; Friedman et al., 2007). Perturbation-response modeling approaches are based on fundamental linear response rules, which leverage flux conservation. This approach has been used to examine Toll-like receptor (TLR) signaling and tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) resistance (Piras et al., 2011; Selvarajoo, 2011). Pharmacokinetic modeling has also been used to describe the time-dependent distribution of drugs in the system (Gerlowski and Jain, 1983; Reitz et al., 1990; Sanga et al., 2006).
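The Boolean paradigm mentioned above can be sketched in a few lines of Python. The four-node network below (an external signal, a kinase, an inhibitor acting as negative feedback, and a downstream response) is entirely hypothetical, as is the choice of synchronous updating; it is meant only to show what "reactions as logical gates" looks like in practice.

```python
def step(state):
    """One synchronous update: every node reads the previous state."""
    return {
        "signal": state["signal"],                             # external input, held fixed
        "kinase": state["signal"] and not state["inhibitor"],  # AND-NOT gate
        "inhibitor": state["kinase"],                          # negative feedback
        "response": state["kinase"],                           # downstream readout
    }

def simulate(initial_state, n_steps):
    """Return the list of states visited, starting from initial_state."""
    trajectory = [dict(initial_state)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory

start = {"signal": True, "kinase": False, "inhibitor": False, "response": False}
for t, s in enumerate(simulate(start, 8)):
    print(t, {k: int(v) for k, v in s.items()})
```

With the signal switched on, the negative feedback in this toy network produces a sustained oscillation that repeats every four updates; with the signal off, the kinase never activates.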

### **GRAPHICAL MODELS**

Probabilistic graphical models (PGMs) can be used to describe directed and undirected relationships between variables (Koller and Friedman, 2009). In this setting, each variable (e.g., genes, proteins) is a node in the network and viewed as a random variable, which is subject to uncertainty. The links in the network convey a relevant measure of association, e.g., correlation (undirected) or causality (directed). The network structure can be decomposed into small regions and translated into a product of conditional probabilities, which represents the joint probability distribution. Undirected graphs are known as Markov Networks and portray symmetric relationships (**Figure 2A**). A link in this model is present if the linked nodes are associated after controlling for the influence of other nodes in the graph (conditional association). In a directed graph, an edge *A* → *B* implies that independent variable *A* (parent node) is upstream of the dependent variable *B* (child node) in the underlying causal process (**Figure 2B**). Furthermore, the directed edge implies a causal effect of *A* after the influence of the remaining nodes upstream of *B* (ancestors of *B*) has been controlled for or removed. Bayesian Networks (BNs) are directed acyclic graphs (DAGs), which contain no cycles, and thereby prohibit feedback in the model. Chain graphs contain a mixture of directed and undirected edges.
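The statement that a directed network translates into a product of conditional probabilities can be illustrated with a small Python sketch. The three-node chain (mutation → expression → phenotype), the binary states, and every probability below are invented for illustration only. The joint distribution is built as the product of each node's probability given its parents, and a posterior is obtained by brute-force enumeration, which is workable only for very small networks.

```python
from itertools import product

# Hypothetical DAG: mutation -> expression -> phenotype (all variables binary).
PARENTS = {"mutation": (), "expression": ("mutation",), "phenotype": ("expression",)}
# CPT[node][parent values] = P(node = 1 | parents); numbers are illustrative only.
CPT = {
    "mutation": {(): 0.1},
    "expression": {(0,): 0.2, (1,): 0.9},
    "phenotype": {(0,): 0.05, (1,): 0.7},
}
NODES = list(PARENTS)

def joint(assignment):
    """Joint probability as the product of P(node | parents) over all nodes."""
    p = 1.0
    for node in NODES:
        parent_values = tuple(assignment[q] for q in PARENTS[node])
        p_one = CPT[node][parent_values]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p

def posterior(query, evidence):
    """P(query = 1 | evidence), summing the joint over all consistent assignments."""
    numerator = denominator = 0.0
    for values in product((0, 1), repeat=len(NODES)):
        assignment = dict(zip(NODES, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue
        p = joint(assignment)
        denominator += p
        if assignment[query] == 1:
            numerator += p
    return numerator / denominator

total = sum(joint(dict(zip(NODES, v))) for v in product((0, 1), repeat=len(NODES)))
p_mutation = posterior("mutation", {"phenotype": 1})
print(f"sum of joint = {total:.3f}, P(mutation | phenotype) = {p_mutation:.3f}")
```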

A fundamental challenge is to infer graphical models from data. There are two distinct and difficult learning tasks: parameter estimation and structural learning. Parameter estimation determines the parameters of the conditional probabilities for a given network structure, and can be carried out using maximum likelihood approaches (Koller and Friedman, 2009). In structural learning, the aim is to identify the network topology that is most likely to have generated the observational data. Structural learning is especially challenging because the number of possible network topologies grows super-exponentially with the number of nodes (Chickering et al., 1994). As a result, enumeration of all possible network topologies is infeasible even for modestly sized problems, and machine learning and optimization techniques must be utilized (Koller and Friedman, 2009).
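To give a feel for the super-exponential growth in the number of candidate structures, the sketch below counts labeled directed acyclic graphs on *n* nodes using a standard inclusion-exclusion recurrence (usually attributed to Robinson). The recurrence is not taken from the works cited here, and the function name is an illustrative choice.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes.

    Recurrence: a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k(n-k)) * a(n-k),
    with a(0) = 1, where k is the size of a chosen set of nodes with no incoming edges.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

for n in range(1, 11):
    print(n, count_dags(n))
```

Even before considering parameters, the search space for a network of a few dozen genes is far beyond exhaustive enumeration, which motivates the heuristic search strategies used in structural learning.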

PGMs have been applied to investigate a number of different cancers and data types. Several applications involve prediction and classification tasks, which have direct clinical relevance. Markov networks were used to predict breast cancer survival after patients received different forms of treatments, e.g., combinations of chemotherapy, radiotherapy, and hormonal therapy (Pérez-Ocón et al., 2001). BNs were used to integrate clinical and microarray data for the classification of breast cancer patients into good and poor prognosis groups (Gevaert et al., 2006). Kahn et al. developed a BN called MammoNet for radiological decision support in distinguishing malignant and benign mammary tumors. The highly accurate classifier (88% correct diagnosis in test cases) was constructed from observational data, patient history, and expert advice from experienced radiologists (Kahn et al., 1995).

This group later developed a similar BN classifier called OncOs to differentiate among bone lesions on the appendicular skeleton (Kahn et al., 2001). A practical value of these models is to provide a probabilistic guide for a clinician to diagnose and treat different cancers. Another use of PGMs is to sort out the underlying mutations that put individuals at high risk. Conjunctive Bayesian Networks (CBNs; Beerenwinkel et al., 2007), which describe an accumulation of events, have been used to model the accumulation of mutations using CGH mutation data from the Progenetix database (Gerstung et al., 2009). The inference problem is to identify CBNs, which represent the dependencies among accumulating mutations in renal cell carcinoma, breast, and colorectal cancers. The models identified multiple independent mutations, which triggered downstream complex pathways.

A strength of PGM frameworks is the flexibility to integrate across diverse data types. Recently, a PGM methodology based on factor graphs known as PARADIGM (Pathway Recognition Algorithm using Data Integration on Genomic Models) was proposed, which integrates multiple high-throughput data sets to identify perturbed molecular pathways (Nigro et al., 2005). This method was applied to breast cancer using gene expression data, and to glioblastoma using gene expression and copy number data, to identify pathways and disease subclasses that correlate with survival. PARADIGM was recently applied to the same task using a more comprehensive set of breast cancer data in TCGA, including mRNA, copy number alterations, microRNAs, and methylation data. The method revealed disease subclasses and specific class signatures, which would not have been identified without leveraging the different data sources. Specific perturbations in immune response and interleukin signaling (IL-4, IL-6, IL-12, and IL-23) were also shown to be drivers of the classification and to have promising prognostic value. For example, patients with a gene signature that favors a high T helper 1/cytotoxic T-lymphocyte response and represses Th2-driven humoral immunity are more likely to have a better survival outcome.

Expression quantitative trait loci (eQTL), protein QTL, and metabolic QTL combine genotyping and high-throughput phenotyping of a population (Jansen et al., 2009). Genotype-phenotype network inference leverages this data and the natural variation that occurs within a population (Rockman, 2008). eQTL data on skin tumor progression in mice revealed markedly different patterns in the genetic architecture of malignant skin tumors (Quigley et al., 2011). This rich data includes genotypes and gene expression from F2 mice on benign and malignant skin tumors, as well as normal skin samples. eQTL data from a mouse model of breast cancer was used to identify *Sipa1*, a susceptibility and progression locus in both mice and humans (Crawford et al., 2007). PGM-based algorithms utilize directed graphs to approximate the network of causal relationships among phenotypes and genotypes in segregating populations, but applications to cancer data are yet to be explored (Neto et al., 2010; Hageman et al., 2011).

There has been recent progress in sparse genome-scale models for undirected graphs, with applications that include protein signaling, breast cancer gene expression, and the genetics of gene expression (Carvalho et al., 2008; Friedman et al., 2008; Edwards et al., 2010; Yoshida and West, 2010). Sparse models can be estimated when the number of variables greatly exceeds the sample size. Importantly, estimation in graphical models requires large sample sizes for accuracy. Although sparse modeling deals with the issue of many variables, sufficient sample size is still required for meaningful results.
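One common way to formalize sparse estimation of an undirected graph is the L1-penalized Gaussian likelihood, often called the graphical lasso. It is shown here for illustration under the assumption of jointly Gaussian data, and is not necessarily the formulation used in every work cited above. The symbols are assumptions of this sketch: $\hat{\Sigma}$ is the sample covariance matrix, $\Theta$ the precision (inverse covariance) matrix, and $\lambda \ge 0$ a tuning parameter.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}
\Big\{ \log\det\Theta \;-\; \operatorname{tr}\!\big(\hat{\Sigma}\,\Theta\big)
\;-\; \lambda \,\lVert \Theta \rVert_{1} \Big\}
```

Under the Gaussian assumption, a zero entry $\hat{\Theta}_{ij} = 0$ corresponds to a missing edge between nodes $i$ and $j$ (conditional independence given all other nodes), and larger values of $\lambda$ yield sparser graphs. The penalty makes the problem well posed when variables outnumber samples, but it does not remove the need for an adequate sample size.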

Graphical reasoning about biological problems underlies many approaches that are not formal PGMs. Cluster analysis is a class of techniques whose motivation lies in the concept of *modularity*, which has gained popularity more or less simultaneously in molecular biology, systems biology, developmental biology, and evolutionary biology (Wagner et al., 2007). Clustering (Gordon, 1999) can be viewed as graph partitioning, since members of the same cluster are considered to be connected in terms of whichever measure of association is adopted, and different clusters are relatively disconnected from each other. The associations between clusters may be specified in a variety of ways, and no attempt is made to specify all the links in the graph. Viewing high-throughput data through clusters and modules increases our ability to distinguish subtle signals in tumorigenesis (Segal et al., 2005). This type of analysis is often easier to interpret than traditional lists of differential expression. Clustering methods have been extensively applied to identify and classify different cancer subtypes, and associate clusters with survival (e.g., Furey et al., 2000; Guyon et al., 2002; van't Veer et al., 2002; Sørlie et al., 2003; Rich et al., 2005; De Souto et al., 2008). Weighted Correlation Network Analysis (WGCNA) was recently developed as a method for identifying co-expression modules, relating modules to one another, relating modules to external phenotypes, and identifying *hub genes* that are highly connected within the module (Langfelder and Horvath, 2008). This method was used to identify a co-expression module in glioblastoma, which was also present in breast cancer. *ASPM*, a hub gene in the module, was experimentally validated as a potential uncharacterized glioblastoma target (Horvath et al., 2006).
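The view of clustering as graph partitioning can be sketched with a deliberately simplified Python example. The expression profiles, gene names, the soft-threshold power `beta`, and the hard `cutoff` are all hypothetical. Taking connected components of a thresholded correlation graph is a simplification chosen for brevity; it is not the module detection procedure of WGCNA itself, which relies on hierarchical clustering of a topological overlap measure.

```python
import math

# Hypothetical expression profiles (6 samples per gene), invented for illustration.
EXPRESSION = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "g2": [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    "g3": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],
    "g4": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    "g5": [10.0, 2.0, 8.0, 4.0, 12.0, 6.5],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def modules(expression, beta=6, cutoff=0.5):
    """Partition genes into connected components of a thresholded correlation graph."""
    genes = sorted(expression)
    # Soft-thresholded adjacency |r|^beta, then a hard cutoff to define edges.
    neighbors = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expression[g], expression[h])) ** beta > cutoff:
                neighbors[g].add(h)
                neighbors[h].add(g)
    # Depth-first search: each connected component is one cluster.
    seen, result = set(), []
    for g in genes:
        if g in seen:
            continue
        stack, component = [g], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.append(node)
            stack.extend(neighbors[node] - seen)
        result.append(sorted(component))
    return result

print(modules(EXPRESSION))
```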

### **DISCUSSION: CHALLENGES AND OPPORTUNITIES**

Data integration remains a major fundamental challenge for the field of systems biology, which has limited our ability to take full advantage of omics data for knowledge and discovery (Kitano, 2002; Sullivan et al., 2010). Comparisons and integration *within* omics data types are complicated by a number of factors. Several different platforms are available that use different technologies and vary in coverage. Differences exist in sample quality, array processing, the organism under investigation, tissue type, and experimental conditions (e.g., diet). Integration *between* data types is an even larger challenge (**Figure 3**). It is important to understand how these different biological domains connect and give rise to a phenotype or disease. Methods that integrate between and across diverse data types are only beginning to emerge (Nigro et al., 2005). Mathematical modeling is a promising avenue for this endeavor. In Cancer Biology, data integration is of particular importance because of the complex interplay between genetics, cell signaling, and metabolic pathways.

Mathematical and statistical models are capable of integrating biological knowledge that is outside of the observational data. In a number of applications, the use of Bayesian methods that integrate *a priori* knowledge into the model has been shown to improve model behaviors and predictive output. We have described applications of BNs which incorporate *expert advice* from radiologists, which can be viewed as a model prior (Kahn et al., 1995). In metabolic modeling, flexible Bayesian priors have been used to guide the parameter estimation process. In this context, priors favor parameter estimates which respect known physiology of the system, e.g., steady-state, dynamic trends, feasible bounds on concentration levels, and fluxes (Calvetti and Somersalo, 2006; Calvetti et al., 2006). In graphical models, priors have been developed in the form of energy functions to guide network inference (Imoto et al., 2004). Priors have been used to encode known relational information from databases such as KEGG into the network inference process (Werhli and Husmeier, 2007; Mukherjee and Speed, 2008). Priors have also been used to enforce sparsity in the network structure and prevent over-fitting (Hageman et al., 2011).

Developing mathematical models which are consistent with and predictive of the true underlying biological mechanisms is a central goal of systems biology. The experimental design and perturbations have been shown to have a major influence on parameter estimation, and subsequently on the output and accuracy of the computational model (Apgar et al., 2010). Graphical model network inference can be subject to a large proportion of false positive edges (Li et al., 2010). Environmental and experimental design factors that are not accounted for in the model can further misguide models (Remington, 2009). Assessing and improving the utility of mathematical models in the context of systems biology will continue to be an active area of research.

A continuous cycle between mathematical modeling and the wet-bench is critical to move systems biology forward. As George Box famously stated, "all models are wrong, but some are useful" (Box and Draper, 1987). Sensitivity analysis should routinely be performed to assess how sensitive the model output (predictions) is to model parameters and input (data). However, this is often not done. Sensitivity analysis can also be used to guide model reductions and expansions, e.g., marginalizing over quantities that play little to no role in the system dynamics. Mathematical models can provide, via model-driven predictions and hypothesis generation, a cheap and fast catalyst for experimental advances in systems biology. On the other hand, models which are more "wrong" than "useful" can lead to the design and execution of experiments and studies which are unlikely to be successful. Unlike *in silico* studies, such experiments can waste a lot of time and money, and ultimately promote skepticism in the modeling approach.
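A local sensitivity analysis of the kind recommended above can be sketched as follows. The toy model (one species with constant production and first-order utilization), its nominal parameter values, and the function names are assumptions for illustration. The sketch estimates normalized sensitivities, i.e., the relative change in the prediction per relative change in each parameter, by central finite differences; global methods that explore the full parameter space are outside its scope.

```python
import math

def model_output(params, t=2.0):
    """Toy prediction: C(t) for dC/dt = k_in - k_out * C with C(0) = 0."""
    k_in, k_out = params["k_in"], params["k_out"]
    return (k_in / k_out) * (1.0 - math.exp(-k_out * t))

def local_sensitivity(f, params, rel_step=1e-4):
    """Normalized sensitivities d ln(output) / d ln(parameter) by central differences."""
    base = f(params)
    result = {}
    for name, value in params.items():
        up = dict(params, **{name: value * (1 + rel_step)})
        down = dict(params, **{name: value * (1 - rel_step)})
        derivative = (f(up) - f(down)) / (2 * rel_step * value)
        result[name] = derivative * value / base
    return result

nominal = {"k_in": 2.0, "k_out": 0.5}   # hypothetical nominal values
sensitivities = local_sensitivity(model_output, nominal)
for name, s in sensitivities.items():
    print(f"{name}: {s:+.3f}")
```

A parameter whose normalized sensitivity is close to zero has little influence on the prediction at the nominal point and is a candidate for fixing or removal when reducing the model.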

### **CONCLUDING REMARKS**

In summary, mathematical models of networks can describe a wide range of biological processes. We have described two complementary modeling approaches: deterministic modeling of cellular metabolism and graphical modeling, which offer different insights into biological systems. Although they have been used to drive progress in Cancer Systems Biology, they remain far from mainstream. At present, there is an overwhelming need to view cancer as a complex network in order to understand drug resistance and develop viable targets. It is also critical to better interpret and integrate data to get at the mechanisms which drive the disease, classify cancer subtypes, and predict treatment outcomes. In the coming years, we believe that mathematical and statistical models will be pivotal in advancing our understanding, and that they hold tremendous promise for the future of Cancer Systems Biology.

### **ACKNOWLEDGMENTS**

David L. Trichler received support from the Natural Sciences and Engineering Research Council of Canada (NSERC) and from MITACS (Mathematics of Information Technology and Complex Systems).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 30 March 2012; accepted: 05 June 2012; published online: 28 June 2012.*

*Citation: Blair RH, Trichler DL and Gaille DP (2012) Mathematical and statistical modeling in cancer systems biology. Front. Physiol. 3:227. doi: 10.3389/fphys.2012.00227*

*This article was submitted to Frontiers in Systems Physiology, a specialty of Frontiers in Physiology.*

*Copyright © 2012 Blair, Trichler and Gaille. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits noncommercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*


# Assessing uncertainty in model parameters based on sparse and noisy experimental data

#### *Noriko Hiroi <sup>1</sup> \*†, Maciej Swat <sup>2</sup>† and Akira Funahashi <sup>1</sup>*

*<sup>1</sup> Systems Biology Laboratory, Department of Bioscience and Informatics, Keio University, Yokohama, Japan*

*<sup>2</sup> European Molecular Biology Laboratory, European Bioinformatics Institute, Hermjakob Team: Proteomics, Cambridgeshire, UK*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Jeffrey Varner, Cornell University, USA*

*Jason Edward Shoemaker, Japan Science and Technology Agency, Japan*

#### *\*Correspondence:*

*Noriko Hiroi, Systems Biology Laboratory, Department of Biosciences and Informatics, Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yagami Building 14, Room 420 west, Yokohama, Kanagawa, Japan e-mail: hiroi@bio.keio.ac.jp*

*†These authors have contributed equally to this work.*

In parametric identification of mathematical models of biological events, experimental data are rarely sufficient to estimate the target behaviors produced by complex non-linear systems. We performed parameter fitting of a cell cycle model to experimental data as an *in silico* experiment. We calibrated model parameters with the generalized least squares method with randomized initial values and checked the local and global sensitivity of the model. Sensitivity analyses showed that parameter optimization induced less sensitivity except for the parameters related to the metabolism of the transcription factors c-Myc and E2F, which are required to overcome a restriction point (R-point). We performed bifurcation analyses with the optimized parameters and found that the bimodality was lost. This result suggests that accumulation of c-Myc and E2F induced dysfunction of the R-point. We performed a second parameter optimization based on the results of the sensitivity analyses and incorporating additional information derived from recent *in vivo* data. This optimization restored the bimodal characteristics of the model with a narrower range of hysteresis than the original. This result suggests that the optimized model can more easily go through the R-point and come back to the gap phase after once having overcome it. Two-parameter space analyses showed that the metabolism of c-Myc is transformed so that it can allow bimodal cell behavior with weak stimuli of growth factors. This result is compatible with the character of the cell line used in our experiments. At the same time, Rb, an inhibitor of E2F, can allow bimodal cell behavior with only a limited range of stimuli when it is activated, but with a wider range of stimuli when it is inactive. These results provide two insights: biologically, the two transcription factors play an essential role in malignant cells to overcome the R-point with weaker growth factor stimuli, and theoretically, sparse time-course data can be used to change a model to a biologically expected state.

**Keywords: parametric identification, generalized least squares, sensitivity analysis, Fisher information matrix, bifurcation analysis**

### **INTRODUCTION**

Parametric identification is a significant process of model building. The identification problem concerns the possibility of drawing inferences from observed samples to an underlying theoretical structure. The basic results for linear simultaneous equation systems under linear parameter constraints were found in 1950, and extensions to non-linear systems and non-linear constraints were made by Fisher (1961) and others.

Parametric identification involves several steps: (1) checking structural identifiability, to clarify practical difficulties such as multimodality and lack of practical identifiability; (2) analysing sensitivity and ranking parameters; (3) model calibration, including problem formulation, numerical solution, and global optimization of parameters; and, based on this knowledge, (4) optimal experimental design.

These processes are performed to explain observed biological phenomena, or to fill gaps between the molecular level and larger patterns. Meanwhile, we may identify the key mechanisms of a system in a model, which can allow us to predict missing components, concepts, or unobserved phenomena, and serve as a guide for further experiments.

During each division cycle, cells need to duplicate their genomes and distribute the two copies equally to the two daughter cells. The processes of DNA-duplication (S-phase) and cell division (mitosis) are separated by two gap phases (G1 and G2). During these phases, several mechanisms operate to prevent cells from continuing the cell cycle under inappropriate conditions. Normal cells can interrupt the cell cycle in the gap phases through growth inhibitory mechanisms that activate the retinoblastoma proteins (Rb) or p53 transcription factors. In cancer cells, these growth inhibitory pathways are often disrupted, leading to unscheduled proliferation (Hanahan and Weinberg, 2000).

We used Yao's 2008 model (Yao et al., 2008), which is consistent with experimental data exhibiting bimodality. The model represents the underlying mechanisms of a restriction point (R-point), which is the critical event for a mammalian cell to commit to proliferation independently of extracellular growth stimuli.

Normal cells respond to extracellular growth factors. Their absence arrests the cell cycle in the G1 phase. However, growth factors are required only until a few hours prior to the initiation of S-phase. This moment in G1 was first described in 1974 by Pardee (1974) and is named the R-point. It was clarified later that cells that pass the R-point can progress to S-phase independently of mitogens (Sherr and Roberts, 2004). Importantly, Pardee found that the R-point was defective in cancer cell lines. In addition, cancer cells were much more resistant to the inhibition of protein synthesis, which is supposed to be required for the R-point, suggesting that the required R-point factors are either stabilized in cancer cells or not necessary to progress the cell cycle (Campisi et al., 1982). For example, when the activity of the Rb protein is inhibited, the machinery of the R-point is disrupted and the cell lines are transformed into malignant lines.

This model correctly reconstructs, by its structure, the most fundamental behavior of the molecular network system of the mammalian cell cycle, such as bimodality. The molecular mechanism that this model represents is also significant for controlling the switching among different physiological cellular states: from normal cell proliferation to malignancy, or to differentiation and cell death. These switching mechanisms between normal proliferation and other states are the key to tumourigenesis, the variation in leukocyte production, and so on. The missing property of this model is that it has never been fitted to time-course data of molecules. There exist other models that represent cell cycle mechanisms; however, many of them have not yet been tested with high-resolution experimental data to follow the dynamics of the system. This is a difficulty when using mathematical models, even if they have good potential to predict important insights.

The model calibration problem consists of finding the model parameters that minimize the distance between model predictions and the experimental data. There exist several strategies for model calibration. One is maximum likelihood estimation, in which a probabilistic distribution of the noise is considered, but without considering any uncertainty in the parameters. Another is Bayesian estimation, which introduces information about a prior probabilistic distribution of the parameters and noise.
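As an illustrative aside on how these strategies relate to least squares, consider the textbook case of $N$ independent measurements $y_{m,i}$ with Gaussian noise of constant variance $\sigma^2$. This noise model is an assumption made here for illustration only; it is not a statement about the data analysed in this study. The negative log-likelihood of the model predictions $y_i(\theta)$ is then

$$-\ln L(\theta) = \frac{N}{2}\ln\left(2\pi\sigma^{2}\right) + \frac{1}{2\sigma^{2}}\sum_{i=1}^{N}\left(y_{i}(\theta) - y_{m,i}\right)^{2}$$

Because the first term and the factor $1/(2\sigma^{2})$ do not depend on $\theta$, maximizing the likelihood under this assumption is equivalent to minimizing the sum of squared residuals.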

We applied generalized least squares for our parameter optimization, which requires almost no prior information (Balsa-Canto et al., 2008). Prior to and after optimization, we performed both local sensitivity analysis (LSA) and global sensitivity analysis (GSA) (Rodriguez-Fernandez and Banga, 2010). LSA is usually performed to measure how sensitive the model is to small changes in the original parameter values that are first given. On the other hand, GSA is performed to measure how sensitive the model is to changes in the parameters over the full range of plausible values. The objective of performing the sensitivity analyses was to rank the parameters in order of importance for observation, then use the rank to assist in fixing parameters to improve practical identifiability.

In order to find necessary additional information through experiments, analysing the parameter sensitivity and checking the global ranking and identifiability are needed (Balsa-Canto and Banga, 2011). We used these results to design several rounds of parameter optimization. The objective of the ranking was to assess the importance of individual parameters. Several criteria have been suggested to locally rank parameters (Balsa-Canto and Banga, 2011). Relative local parametric sensitivities are computed for a number of *n*<sub>lhs</sub> samples using the Latin Hypercube Sampling approach within parameter bounds to generalize it to a global rank (Balsa-Canto and Banga, 2011).
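The Latin Hypercube Sampling step mentioned above can be sketched as follows. This is a minimal, generic illustration written for this text, not the AMIGO implementation; the function name `latin_hypercube` and the two parameter bounds are hypothetical.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points inside the given (lower, upper) bounds.

    Each parameter range is split into n_samples equal strata and every
    stratum is used exactly once per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for lower, upper in bounds:
        width = (upper - lower) / n_samples
        # One uniformly drawn value inside each stratum of this parameter.
        column = [lower + (k + rng.random()) * width for k in range(n_samples)]
        # Shuffle so that strata are paired at random across parameters.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: one row per sample, one column per parameter.
    return [list(point) for point in zip(*columns)]


# Hypothetical bounds for two parameters (illustration only).
samples = latin_hypercube(5, [(0.1, 10.0), (0.0, 2.0)])
for point in samples:
    print(point)
```

Local sensitivities would then be evaluated at each sampled point and aggregated into a global rank.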

We performed bifurcation analysis to understand how the parameter calibration affected the behavior of the model (Ermentrout, 2002). Many numerical models, when applied to real biological systems, involve non-linearities that make possible the model's chaotic behavior and oscillation. At the same time, many models are difficult to solve analytically because of their complex structure. Numerical solutions have an advantage in such cases in that they can be used to perform further analyses with those models. The cell cycle model we chose shows oscillation as one of the characteristics of this model. Bifurcation analysis allowed us to test how the characteristics of the systems depend on the parameters. Two-parameter curves show us a range of parameters that may produce multiple states.

Here, we describe all the above investigation results and discuss the potential of parameter fitting to a sparse dataset to improve model behavior when representing physiological conditions. Finally, we discuss how to make further improvements with additional experiments and simulations.

### **METHODOLOGY**

### **MODEL AND DATA**

The model we used for our analyses was originally published by Yao et al. (2008) and was analyzed following the procedures listed below. A diagram of the reconstructed model is shown in **Figure 1**, and the differential equation set is shown in the Appendix. The experimental data, which we used for the parameter fitting, were produced as described in the Experimental Methods.

### **MODEL RECONSTRUCTION**

We reproduced Yao's 2008 model with CellDesigner (Funahashi et al., 2008). The Yao 2008 model is available at Biomodels.net (Chelliah et al., 2013) (no. 318). We imported the Systems Biology Markup Language (SBML) (Hucka et al., 2004) file (BIOMD0000000318.xml) into CellDesigner, and then recreated it as a reaction network. All the kinetic laws, parameters, and annotations (RDF) from Biomodels.net were kept in the model.

We modified the reaction network so as to be close to that described in Yao's study (Yao et al., 2008). The model consists of 7 ODEs (Appendix), thus there are 7 species (proteins) in our version of the reaction network (**Figure 1**). Nevertheless, there are only 5 proteins in Yao's network as shown in Yao's Figure 1 (Yao et al., 2008). We assume that this happened because they had omitted two of the reaction species in their figure to focus on the activation-inhibition process of the network to simplify the diagram; as a result, inactive proteins are not shown in their figure. We included these inactive reaction species to rebuild their model correctly.

In our reaction network, the above 5 proteins in Yao's Figure 1 (Myc, E2F, Rb, CycD, and CycE) are shown as "Active" proteins (which have dashed rectangles around the proteins), and the other 2 "Inactive" proteins (phosphorylated Rb and Rb-E2F complex) are required to express the original mathematical model (to be 7 ODEs). Highlighted reactions (colored in green, red, and black) in the model are mapped to the reactions in Yao's original figure. We confirmed that our modified model generates the same simulation results as the original BIOMD0000000318.xml.

**FIGURE 1 | Reconstructed diagram of Yao's 2008 model (Yao et al., 2008) in CellDesigner (Funahashi et al., 2008).** Each square indicates a protein, and rectangles with dotted lines indicate activated forms of those proteins. The whole diagram is included inside a compartment, which represents a cell, with double yellow lines. White circles with a crossing line indicate a reactant source, and gray circles with a crossing line indicate waste. All edges correspond to the fluxes from a reaction species to the others.

### **ANALYSIS METHODS**

We used the Matlab toolbox Advanced Model Identification using Global Optimization (AMIGO) (Balsa-Canto and Banga, 2011), which includes options for local and global sensitivity analyses, local and global ranking of parameters, parameter estimation, and Fisher Information Matrix evaluation. XPPAUT (Ermentrout, 2002) was used for the basic simulation and bifurcation analysis of the model. In the following sections, we briefly describe each analysis.

### *Parameter optimization*

We performed model calibration by generalized least squares because the method does not require any prior information of the model. The generalized least squares is described as:

$$J(\boldsymbol{\theta}) = \sum_{\varepsilon=1}^{n_{\varepsilon}} \sum_{O=1}^{n_{O}^{\varepsilon}} \left( \mathbf{y}^{\varepsilon,O}(\boldsymbol{\theta}) - \mathbf{y}_{m}^{\varepsilon,O} \right)^{T} \mathbf{Q}^{\varepsilon,O} \left( \mathbf{y}^{\varepsilon,O}(\boldsymbol{\theta}) - \mathbf{y}_{m}^{\varepsilon,O} \right)$$

where *J* is the quadratic cost function and **Q** is a weighting matrix. In our case, we used "standard least squares" with constant variance. Briefly, this is encoded as

inputs.PEsol.PEcost\_type="lsq"; % "lsq" (weighted least squares default) | "llk" (log likelihood) | "user\_PEcost"

inputs.PEsol.llk\_type="homo"; % to be defined for llk function, "homo" | "homo\_var" | "hetero"

where "lsq" indicates the weighted least squares function. For the cases where no information about the experimental error is available, "homo" assumes homoscedastic noise with known constant variance.

The θ that gives the minimum *J*(θ) is the least squares estimator. This method can provide the best estimate for a linear model. **Q**<sup>ε,O</sup> is a non-negative definite symmetric weighting matrix. The weighting coefficients ω<sub>s</sub><sup>ε,O</sup>, *s* = 1, ..., *n*<sub>s</sub><sup>ε,O</sup>, located on the diagonal of the matrix are positive or zero and fixed *a priori*. Basically, ω<sub>s</sub> = 1 assigns the same level of importance to all data; ω<sub>s</sub> = 0 eliminates a datum because it is deemed not relevant; and ω<sub>s</sub> = max(*y*<sub>m</sub><sup>ε,O</sup>)<sup>2</sup>, the square of the maximum experimental datum for the observable O and the experiment ε, reduces the effect of having observations of different orders of magnitude. We used the objective value to assess whether the parameter optimization improved the fit of the model to our experimental data. It is also frequently referred to as the residual standard error; if the value is exactly 0, the model fits the data perfectly.
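To make the role of the diagonal weights concrete, the following minimal sketch evaluates a weighted least squares cost for one experiment-observable pair. It is an illustration written for this text and not AMIGO code; the function name `gls_cost` and all numbers are hypothetical.

```python
def gls_cost(predictions, measurements, weights):
    """Weighted least squares cost with a diagonal weighting matrix.

    Each argument is a list with one entry per (experiment, observable)
    pair; every entry is a list of values at the sampling times.
    """
    total = 0.0
    for y, ym, w in zip(predictions, measurements, weights):
        # r^T Q r reduces to sum(w_s * r_s**2) when Q is diagonal.
        total += sum(ws * (ys - yms) ** 2 for ys, yms, ws in zip(y, ym, w))
    return total


# Hypothetical data: one observable sampled at three time points.
predicted = [[1.0, 2.0, 3.0]]
measured = [[1.0, 2.5, 2.0]]
weights = [[1.0, 1.0, 0.0]]  # the third datum is excluded (weight 0)

objective = gls_cost(predicted, measured, weights)
print(objective)
```

With the weights above, only the second time point contributes to the objective, because the first residual is zero and the third datum carries zero weight.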

### *Local Sensitivity Analysis (LSA)*

Local (Relative) Sensitivity Analysis (LSA) was performed with AMIGO for the case of (a), with original parameter settings of Yao's model, and (b), optimized parameters with our experimental data, to rank the parameters in order of importance for the observable variables.

### *Rank parameters based on LSA*

The parameters were ordered according to the value of *S*<sub>p</sub><sup>ε,O</sup>. We used the R programming language to produce the graphs of LSA results (R Development Core Team, 2008).
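The idea of ranking parameters by relative local sensitivity can be sketched with central finite differences. This is a generic illustration under stated assumptions: the exponential-decay `model`, its parameter values, and the squared-sum ranking criterion are hypothetical stand-ins, and the definition used inside AMIGO may differ in detail.

```python
import math


def model(params, t):
    # Hypothetical one-observable model: y(t) = a * exp(-k * t).
    return params["a"] * math.exp(-params["k"] * t)


def relative_sensitivity(model, params, name, t, rel_step=1e-6):
    """Return (p / y) * dy/dp, estimated by a central finite difference."""
    h = params[name] * rel_step
    up = dict(params)
    up[name] += h
    down = dict(params)
    down[name] -= h
    dy_dp = (model(up, t) - model(down, t)) / (2.0 * h)
    return dy_dp * params[name] / model(params, t)


params = {"a": 2.0, "k": 0.5}
times = [1.0, 2.0, 4.0]

# Rank parameters by the summed squared relative sensitivity over time.
scores = {
    name: sum(relative_sensitivity(model, params, name, t) ** 2 for t in times)
    for name in params
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

For this toy model the relative sensitivity to `a` is 1 at every time point, while the sensitivity to `k` equals -k·t and therefore grows in magnitude with time.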

### *Global Sensitivity Analysis (GSA)*

Global Sensitivity Analysis (GSA) was performed to measure how sensitive the observables are to changes in the parameters over the full range of plausible values: (a) with default values of Yao's original model, and (b) with optimized parameters based on experiments. We assessed the importance of individual parameters and also ranked parameters based on the results of GSA, the criteria of which were originally suggested by Brun et al. (2001), but were extended to the formula shown below by Balsa-Canto and Banga (2011). The result of parameter ranking based on GSA is indicated by the order of decreasing δ<sup>msqr</sup>, which is best suited to serve as a ranking criterion (Balsa-Canto and Banga, 2011).

δ<sup>msqr</sup> is defined as:

$$\delta_{p}^{\rm msqr} = \frac{1}{n_{\rm lhs}\, n_{\varepsilon}\, n_{O}\, n_{S}} \sqrt{\sum_{{\rm lhs}=1}^{n_{\rm lhs}} \sum_{\varepsilon=1}^{n_{\varepsilon}} \sum_{O=1}^{n_{O}} \sum_{S=1}^{n_{S}} \left( \mathbf{S}_{p}^{\varepsilon,O}\left(t_{S}^{\varepsilon,O}\right) \right)^{2}}$$
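The ranking criterion can be computed directly once the relative sensitivities of one parameter have been collected over all samples, experiments, observables, and sampling times. The sketch below assumes a rectangular nested list, so that the number of stored values equals the product of the four counts in the prefactor; the function name `delta_msqr` and the input values are hypothetical.

```python
import math


def delta_msqr(sens):
    """Root of the summed squared sensitivities, divided by their count.

    sens[l][e][o][s] is the relative sensitivity of one parameter for
    sample l, experiment e, observable o and sampling time s.
    """
    flat = [
        value
        for per_sample in sens
        for per_experiment in per_sample
        for per_observable in per_experiment
        for value in per_observable
    ]
    return math.sqrt(sum(value * value for value in flat)) / len(flat)


# Hypothetical input: 1 sample, 1 experiment, 1 observable, 2 time points.
example = [[[[3.0, 4.0]]]]
print(delta_msqr(example))
```

Parameters would then be sorted by decreasing value of this quantity to obtain the global rank.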

We used the R programming language to produce the graphs of GSA results (R Development Core Team, 2008).

### *Bifurcation analysis*

We performed bifurcation analysis of Yao's model with (a) default and (b) optimized parameters using XPPAUT. Bifurcation analysis is based on the parametric dependence of dynamic systems encoded as differential equations. This approach is called the continuation method. Its name is derived from the fact that the number and type of steady states can vary as a function of one or more parameters. Typically, one starts with a stable steady state, varies a particular parameter in very small increments, and calculates the type of the steady state at the next point of parameter space. The parameter we used here was the stimulus, S. For the two-dimensional bifurcation plots, we scanned S vs. a number of other parameters. We let XPPAUT scan the region around their default or their optimized values starting at a low stable steady state. We defined the range from 0.1 to 10 times their starting values for each parameter to test, and between 0.0 and 1.5–2.5 for the stimulus, S.
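The dependence of the number and type of steady states on a stimulus can be illustrated with a deliberately simple one-variable system. The sketch below is not Yao's model and does not use XPPAUT or a true continuation algorithm: it scans a hypothetical toy equation, dx/dt = s + βx²/(1 + x²) − x, for steady states at a few stimulus values and classifies each by the sign of the local slope.

```python
def rate(x, s, beta=2.0):
    # Hypothetical toy system: basal input s, positive feedback, linear decay.
    return s + beta * x * x / (1.0 + x * x) - x


def steady_states(s, x_max=5.0, n_grid=5000):
    """Find roots of rate(x, s) on [0, x_max] and classify their stability."""
    states = []
    step = x_max / n_grid
    for i in range(n_grid):
        lo, hi = i * step, (i + 1) * step
        if rate(lo, s) * rate(hi, s) < 0:
            # Refine the bracketed root by bisection.
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if rate(lo, s) * rate(mid, s) <= 0:
                    hi = mid
                else:
                    lo = mid
            root = 0.5 * (lo + hi)
            # A negative slope of the rate means the steady state is stable.
            slope = (rate(root + 1e-6, s) - rate(root - 1e-6, s)) / 2e-6
            states.append((root, "stable" if slope < 0 else "unstable"))
    return states


for stimulus in [0.02, 0.05, 0.10, 0.20]:
    print(stimulus, steady_states(stimulus))
```

In this toy system a low and a high stable state coexist with an unstable state between them for small stimuli, and only the high state remains once the stimulus passes the upper saddle-node point, which is the kind of hysteresis window examined with the stimulus S in the analyses described above.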

### **EXPERIMENTAL METHODS**

### *Cell culture and synchronization*

3Y1 rat embryonic fibroblasts were cultured in 5% CO<sub>2</sub> at 37°C in Dulbecco's modified Eagle's medium (DMEM) supplemented with 10% fetal calf serum (FCS) (Hiroi et al., 2006). Cell synchronization was performed by the thymidine double block (Hiroi et al., 1999). Exponentially dividing cells were incubated at 37°C for 18 h in medium containing 0.56 mM 2′-deoxythymidine. Then the cells were washed with fresh DMEM-10% FCS without 2′-deoxythymidine and recultured for 15 h in drug-free medium. The cells were synchronized at the next G1/S boundary by incubating them for a further 15 h in medium containing 0.56 mM 2′-deoxythymidine. After the removal of the second thymidine block, cells were harvested at the indicated times and subjected to flow cytometry.

### *DNA flow cytometry*

DNA content was determined by flow cytometry. 5 × 10<sup>5</sup> cells were washed once in phosphate buffered saline (PBS) and fixed in 70% ethanol for 30 min on ice. The cells were centrifuged at 400 × *g* for 5 min, and the pellet was incubated at 37°C for 20 min in 500 μl of PBS containing 0.1 mg/ml RNase A. The cells were then pelleted and stained with 100 μl of 25 μg/ml propidium iodide in PBS. Finally, the stained cells were suspended in 0.1% BSA/PBS and analyzed using a flow cytometer (Beckman-Coulter). The data were acquired and analyzed by the provided computer program (Beckman-Coulter, WinCycle). A sequence of single-parameter DNA histograms was analyzed to determine the proportions of cells in each phase.

### *Western blot detection*

Western blot analysis was performed as described (Hiroi et al., 2002). In preparation for western blotting, 5 × 10<sup>5</sup> cells were lysed in 100 μl of radioimmunoprecipitation (RIPA) buffer (150 mM NaCl, 1% NP-40, 0.1% sodium dodecyl sulfate (SDS), 50 mM Tris-HCl (pH 7.5), 0.1 mM Na-orthovanadate, 0.1 mM NaF, 1 mM dithiothreitol (DTT), 1 mM phenylmethylsulfonyl fluoride, 1 μg/ml pepstatin, 1 μg/ml leupeptin, and 1 μg/ml aprotinin). After a 10 min incubation on ice, lysed cells were centrifuged at 20 000 × *g* for 10 min at 4°C. After adjustment of the protein concentration, the supernatants were used for western blotting. The proteins or control peptide for each target protein in SDS loading buffer (2% SDS, 10% glycerol, 60 mM Tris-HCl, 100 mM DTT, and 0.001% bromophenol blue) were boiled for 5 min, separated by SDS-polyacrylamide gel electrophoresis (16% polyacrylamide gels), and blotted onto Immobilon-P<sup>SQ</sup> membranes (Merck Millipore, Billerica, MA). Sample transfer was confirmed with gel staining (Coomassie brilliant blue; CBB) and a secondary-layered backup membrane. The filters were blocked with 5% skim milk in Tris Buffered Saline with Tween-20 (TBS-T) (150 mM NaCl, 20 mM Tris-HCl (pH 7.6), 0.1% Tween-20) for 100 min and incubated with primary antibodies (diluted 1:1000 to 1:2000 with 5% skim milk in TBS-T) for 1 h at room temperature. The filters were then washed, incubated for 1 h at room temperature with the secondary antibody (sheep anti-mouse or donkey anti-rabbit) conjugated with horseradish peroxidase (Amersham Biosciences, Piscataway, NJ), and washed with TBS-T. Immunoblotted bands were detected by using the ECL system (Amersham Biosciences, Piscataway, NJ) with the same exposure time for all uses of a particular antibody.

### **RESULTS**

### **MODEL CALIBRATION; THE FIRST ROUND OF PARAMETER FITTING TO EXPERIMENTAL DATA**

We performed model calibration with the generalized least squares method using a multi-start solver, which mimics Monte-Carlo sampling of the initial parameter guesses.
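The multi-start idea can be sketched as follows: a local optimizer is launched from randomly drawn initial guesses and the best result is kept. This is a minimal illustration and not the solver used in this study; the compass search, the exponential-decay model, the synthetic noise-free data, and the bounds are all hypothetical stand-ins.

```python
import math
import random


def cost(params, data):
    # Sum of squared residuals for the hypothetical model y = a * exp(-k * t).
    a, k = params
    return sum((a * math.exp(-k * t) - y) ** 2 for t, y in data)


def compass_search(f, start, step=0.5, tol=1e-8, max_iter=10000):
    """Simple derivative-free local search along the coordinate axes."""
    x, fx = list(start), f(start)
    iterations = 0
    while step > tol and iterations < max_iter:
        iterations += 1
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5  # shrink the step when no direction improves
    return x, fx


def multi_start(f, bounds, n_starts=10, seed=1):
    """Run the local search from random initial guesses; keep the best."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        start = [rng.uniform(lower, upper) for lower, upper in bounds]
        candidate = compass_search(f, start)
        if best is None or candidate[1] < best[1]:
            best = candidate
    return best


# Synthetic, noise-free data generated with a = 2.0 and k = 0.5.
data = [(t, 2.0 * math.exp(-0.5 * t)) for t in (0.0, 1.0, 2.0, 4.0, 8.0)]
best_params, best_cost = multi_start(lambda p: cost(p, data), [(0.1, 5.0), (0.01, 3.0)])
print(best_params, best_cost)
```

Randomizing the starting points reduces the chance that a single unlucky initial guess determines the reported optimum, which matters when the cost surface has several local minima.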

For this study, we used the protein amounts of cyclin D and cyclin E at each phase of the cell cycle. Additionally, we used the protein amount of total Rb (Supplemental Figure 1). The parameter fitting was performed for 12 parameters of 3 reaction species (cyclin D, cyclin E, and total Rb in nuclei, which equals the sum of hypo- and hyper-phosphorylated Rb).

We chose only a subset of the parameters for optimization for three reasons: (1) Yao's original paper indicated that part of the model parameters came from experiments, so we decided to keep those original values; (2) the other 12 parameters were estimated via numerical tests, so we used these parameters for the fitting to our experimental data; and (3) using only a subset of the parameters for fitting was intended to reduce error in the process of parameter estimation.

The original parameter set is shown in **Table 1**, middle column, and the results of optimization of the parameter values are shown in **Table 1**, right-most column. The time-course of each molecule with the original (A) and new parameter sets after the first round of parameter fitting (B) are shown in **Figure 2**. The optimized parameters produced curves closer to the experimental data than the simulation results with the original parameter set. We then performed local and global sensitivity analyses to estimate how the fitting of these 12 parameters affected the sensitivity of the model.

### **LOCAL SENSITIVITY ANALYSIS (LSA)**

We performed LSA with published parameter values (**Figure 3A**, blue line) and with the 1st set of optimized parameters (**Figure 3A**, red line), and calculated the ratio between default and optimized in order to visualize the changes in local sensitivity of parameters (**Figure 3B**). LSA was performed for all 24 parameters in the model. The sensitivity analyses showed that the parameter optimization to the time-course data induced less sensitivity except for the parameters related to metabolism of transcription factors c-Myc (dM, kM, and kkM) and E2F (dE).

**Table 1 | The original and 1st sets after parameter optimization.**

*A minus sign means the same value as the original.*

### **GLOBAL SENSITIVITY ANALYSIS (GSA) OF OBSERVABLES**

Next, we performed GSA with the original and optimized parameters. We compared the sensitivities of 12 identified parameters and newly optimized parameters (**Figure 4**).

The result showed that the optimized parameters were less sensitive, except for one parameter related to c-Myc activity. These two kinds of parameter sensitivity analyses suggested a specific role for the transcription factors compared with the other reaction species in the model, the cyclins.

Next, we performed bifurcation analyses with the original parameter set and the 1st optimization parameter set to investigate the effect of parameter fitting to the model behaviors.

### **FIRST BIFURCATION ANALYSIS**

We performed bifurcation analyses to investigate how parameter optimization using time-course data changed the dynamical characteristics of the model. The result with the original parameter set showed two bifurcation points, the so-called saddle nodes where the stable and unstable branches (blue and red, respectively) meet (**Figure 5A**). Bistability and hysteresis can be recognized in the model behaviors. On the other hand, the 1st set of optimized parameters showed a transcritical bifurcation, i.e., a stable steady state becomes unstable and vice versa (**Figure 5B**). This means that the bimodality had been lost after the parameter optimization. This result further suggests that the key molecules to overcome the R-point, which are components of the model, seem to accumulate in the cell; theoretically, with the optimized parameter values, cells can no longer stop the accumulation and convert to a malignant condition. Even if such conditions could actually be induced in a malignant cell, the cell line we used maintains contact inhibition and does not proliferate in an anchorage-independent manner.

Next, we performed a second parameter optimization by reconsidering the optimization target based on the results of our own sensitivity analyses and knowledge about *in vivo* biochemical reactions, and examined whether the newly optimized parameter set would rescue the model bimodality.

New biochemical insights were found by Aoki et al. (2011), where they showed that in the *in vivo* phosphorylation process, a target molecule that has two possible phosphorylation residues must have a different phosphorylation process than that *in vitro*. Based on this knowledge, we selected the parameters kP1 and kP2, which relate to Rb phosphorylation. At the same time, we excluded 4 parameters (KCE, kkCD, kRE, and KS) because of their low sensitivities in the results of both LSA and GSA. We aimed by this exclusion to produce a parameter set that had less sensitivity.

### **NEW ROUNDS OF PARAMETER OPTIMIZATION AND SENSITIVITY ANALYSES**

We included the results of sensitivity analyses and performed a 2nd parameter optimization. The optimized parameters are indicated in **Table 2**, and the fitting results are shown in **Figure 6**. We performed sensitivity analyses with these 2nd sets of optimization parameters (**Figure 7**). Both LSA and GSA showed less sensitivity in total than the 1st set of optimized parameters. We used this 2nd set of optimized parameters for further bifurcation analyses.

### **SECOND BIFURCATION ANALYSIS**

We performed a second bifurcation analysis with the newly optimized set of parameters (**Figure 8**). The 2nd set of optimized parameters showed bistability with a narrower range of hysteresis (**Figure 8B**). This result suggests that the sensitivity changed less than the original, but the model behavior changed to be more sensitive to the change of the extracellular stimulus level (S).

To investigate the bistable properties of the optimized model in more detail, we performed a two-parameter space analysis (**Figures 8C–K**). These results showed that the Rb and c-Myc active-inactive state changes could happen with relatively small amounts of extracellular stimuli (**Figures 8C,E,I**). These changes may affect the behavior of the two key cyclins, cyclin D and cyclin E. Cyclin D is independent of the activity of E2F, and cyclin E is dependent on the activity of E2F. Cyclin D is required at an earlier stage of the cell cycle than cyclin E. Together, these results suggest that by fitting the model to a malignant cell, the model behaves such that cyclin D levels can easily accumulate with a small amount of extracellular stimuli, but once cyclin E starts to accumulate, there is no mechanism to stop the cell cycle. This could mean that the R-point does not work properly in the cell line we used.

**FIGURE 2 |** ... result, cross: experimental result); the lower panel shows the phosphorylated (brown line) and dephosphorylated (green line) forms and their sum (black line) from the simulation data, with the experimental result (black cross). The three species were fitted to ... parameters are: "kRE", "kkE", "kkM", "kCDS", "kR", "KS", "kkCE", "KE", "KCE", "degRP", "kkCD", and "kb". Optimized parameters are shown in **Table 1**, right-most column. The objective value for the fit in **(B)** is 1.18.

This raises the question as to why, after parameter optimization, the model behaved more sensitively to the growth factor stimuli than in the original condition, even though the parameters were optimized into less sensitive conditions. We designed and performed another parameter optimization to check if this alteration of model behavior was correlated with the sensitivity.

#### **BISTABILITY INDEPENDENT OF GLOBAL SENSITIVITY**

We performed another parameter optimization in order to address whether parameter sensitivity and the bimodality of this model are causally related. We optimized low-sensitivity parameters based on the sensitivity analysis results of the original parameter set (Supplemental Figure 3; dM, KM, kkM, dE, kRE, kR, dR, degRP, dCE, KCE, kkCE, kkCD, kCDS, KS). **Figure 9** shows the time-course of the 3 fitted species, and the optimized parameter values are listed in **Table 3**. The third optimization process made the objective value smaller than the first-round result (objective value of the first round parameter fitting: 1.18; objective value of the third round parameter fitting: 0.67). Even though the fit of the simulated curves to the experimental data was improved, the results of LSA indicated that we could not increase sensitivity at any parameter among the 24 (**Figure 10A**). On the other hand, GSA results showed that some parameters are more sensitive compared to the original parameters (3 parameters among 9 comparable parameters), and the 1st set of optimized parameters (7 parameters among 8 parameters) (**Figure 10B**). We performed bifurcation analyses with this parameter set; however, we did not see bistability of this model with the third set of optimized parameters. This result suggests that model bistability does not depend on the global sensitivity of parameters.

#### **DISCUSSION**

We showed our results of model fitting to sparse time-course data. Generally, even if the data can cover only some of the variables, parameter optimization can change the model behavior to be different than the original. In our case, the original model indicated a

**FIGURE 4 | Comparison of global parameter rank of the original parameter set with the 1st optimized parameter set.** The blue line indicates the result of local sensitivity analysis with the original parameter set, and the red line indicates the result with the 1st optimized parameter set. (optimized → opmitized).

healthy proliferating mechanism, in which case the R-point should operate strictly. On the other hand, cancer cells are believed to lack proper R-point mechanisms; as a result, a cell can pass the R-point with a small amount of growth factors. Our results show that at least some cancer cell-like properties can be produced via parameter optimization to time-course data of malignant cell lines (**Figure 8**).

We tested if the bistability of the model is correlated with the sensitivity of the parameters, because we aimed to reduce the parameter sensitivities by optimization to make the model behavior robust against parameter changes; however, the range of hysteresis had been reduced via parameter optimization, and as a result, the bistability of the model became unstable with a small change of extracellular stimuli (S, **Figure 8**). Our results did not suggest that the bistability of this model is dependent on the parameter sensitivity. Moreover, our results, which suggest the significance of the transcription factors and different behaviors of cyclin D and cyclin E, may indicate that the bistability of the cell cycle machinery could depend more on the strict context of the activation processes of these molecules.
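What a reduced hysteresis range means can be shown with a deliberately simple sketch. The one-variable positive-feedback rate law below is a toy stand-in, not the Yao model, and its parameter `b`, the grid sizes, and the function names are all hypothetical. The sketch counts stable steady states for each stimulus value `s` and reports the window of `s` in which two stable states coexist.

```python
def rate(x, s, b=4.0):
    """Toy rate law: constant input s, Hill-type autoactivation, linear decay."""
    return s + b * x * x / (1.0 + x * x) - x

def stable_steady_states(s, x_max=10.0, n=20000):
    """Find stable steady states as positive-to-negative sign changes of the rate."""
    states = []
    dx = x_max / n
    prev = rate(0.0, s)
    for i in range(1, n + 1):
        x = i * dx
        cur = rate(x, s)
        if prev > 0.0 and cur <= 0.0:
            states.append(x - 0.5 * dx)   # midpoint of the bracketing interval
        prev = cur
    return states

def bistable_range(s_values):
    """Return the stimulus values for which two stable states coexist."""
    return [s for s in s_values if len(stable_steady_states(s)) == 2]

s_grid = [0.005 * i for i in range(1, 41)]   # stimulus values 0.005 ... 0.2
window = bistable_range(s_grid)
```

In this toy system the bistable window is narrow, so a small change in `s` is enough to leave it; this is the kind of fragility described for the optimized parameter set.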

The parameters for the second optimization were chosen based on the results and the hypothesis of Aoki et al. (2011). Our reasoning is that, if we accept their hypothesis, the *in vivo*-specific double phosphorylation process is caused by intracellular crowding and is independent of specific molecular binding, such as anchoring proteins for MAPK. The hypothesis should then hold generally for *in vivo* double phosphorylation of a single substrate. Therefore, we re-optimized the parameters of the double phosphorylation processes of Rb. On the other hand, the Rb protein has many other phosphorylation sites (Rubin et al., 1998); more than 10 have been counted. The parameter values may differ for each phosphorylation reaction. However, we expect that this difference would not affect the critical behaviors of the model, such as bistability, because we have assumed that the multi-phosphorylation of a single substrate is a linear system rather than a system that shows switch-like, non-linear behavior, based on the results of Aoki et al. (2011). In this case, we may contract these multiple reactions into fewer steps as follows. When the first phosphorylation step, the double phosphorylation step, and any further steps are all linear processes, these reaction schemes are characteristically the same as a signal cascade that simply activates the next reaction species

#### **Table 2 | Optimization results.**


*Newly optimized 1: normal bounds; newly optimized 2: the values are those estimated with smaller bounds (increasingly enlarged where necessary) used for estimation; newly optimized 3: "Km" included.*

sequentially. We may describe this type of signal cascade using only the first and the last species, connected by a single activation reaction. The multiple-phosphorylation case is the same as the sequentially activating cascade if the system is essentially linear. We may describe the whole reaction process using the first and the last sites, so that it appears as a double-phosphorylation reaction. We cannot exclude the possibility that the reaction step actually includes more than two phosphorylations, in which case the parameter value may be rescaled to some other value. However, this change will not have a strong impact on the bistable behavior of the entire model.
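The contraction argument can be made explicit with a short worked example. The notation ($x_i$, $k_i$, $d_i$) is introduced here only for illustration and is not part of the model; the sketch assumes that every step is first order (linear) and considers the steady state for a fixed input $x_0$.

$$
\frac{dx_i}{dt} = k_i x_{i-1} - d_i x_i \quad (i = 1, \dots, n)
\quad\Longrightarrow\quad
x_i^{*} = \frac{k_i}{d_i}\, x_{i-1}^{*}
\quad\Longrightarrow\quad
x_n^{*} = \left( \prod_{i=1}^{n} \frac{k_i}{d_i} \right) x_0 .
$$

Under these assumptions the steady-state input-output relation of the $n$-step chain equals that of a single step with effective gain $\prod_i k_i / d_i$. Additional linear phosphorylation steps therefore rescale a parameter value but do not by themselves introduce switch-like behavior.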

There could be another, more specific reason why the model properties change via parameter optimization. One possible reason for the change in bifurcation behavior and its consequences is a difference in the cell synchronization method used for the fitting materials. By comparison with Yao's Supplemental Figure 2, however, the synchronization level of our sample seems the same as or better than that of their cells (Supplemental Figure 1); therefore, this may not be the reason why weak bistability is produced. This means that we cannot simply conclude that the cellular synchronization condition affected the behavior of the optimized model. On the other hand, the timing of synchronization seems to differ between Yao's experimental data and ours, and this could affect the bistable property. The cells we used showed a quicker cell cycle than those in Yao's experiments. This is consistent with the results

**FIGURE 8 | Bifurcation analyses with the 2nd set of optimized parameters. (A,B)** the results of bifurcation analysis with the original parameter set and the 2nd optimized set. **(C–K)** Two parameter space analyses. All x-axes indicate *S* values. The y-axis of each graph indicates **(C)** degRP.



of the bifurcation analyses, which showed a smaller jump and a narrower hysteresis between the two states; in cell cycle terms, the jump corresponds to overcoming a cell cycle checkpoint, in this case the R-point, and moving to the next phase. A loose restriction at the R-point could result in a short cycle of cellular proliferation.

We found three different types of parameter conditions with respect to model bistability. The first is the original (default) condition from Yao's work, which produces clear bistability. The second is the 2nd round parameter set in this paper, or the parameter set for Supplemental Figure 4, which can produce a narrow range of bistability. The last comprises the parameter sets that produced the best fits to our time-course data of cyclin D (3rd round of this paper) or cyclin E (Supplemental Figure 5); however, neither of these parameter sets could produce bistability. Among our limited results, the following four parameters showed straightforward trends as conditions for reproducing bistability of the model. kRE supports bistability only when its value lies between 180 and 194; values below or above this range cannot produce bistability. Similarly, kCDS must be less than 4.926, kkCE less than 1.1414, and KCE less than 1.0793 to reproduce bistability of the model. These parameters affect almost all of the time courses of molecular concentrations except that of c-Myc ([MC]). This may be due to the characteristics of our material cells, the rat fibroblast line 3Y1. This cell line does not express c-Myc before receiving the depletion signal of growth factor in the culture medium (Tsuneoka et al., 2003). We need to investigate both the theoretical properties of the model and the biological data to make them consistent with each other.
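The four reported trends can be written down as a simple screening check for candidate parameter sets. This is only a sketch: the function name, the inclusive treatment of the kRE bounds, and the candidate values are assumptions, and passing the check does not guarantee bistability, since the trends come from a limited set of results.

```python
# Conditions reported in the text for reproducing bistability.
KRE_RANGE = (180.0, 194.0)
UPPER_LIMITS = {"kCDS": 4.926, "kkCE": 1.1414, "KCE": 1.0793}

def meets_reported_conditions(params):
    """Check a parameter set against the four trends observed in this study."""
    low, high = KRE_RANGE
    if not (low <= params["kRE"] <= high):   # bounds assumed inclusive
        return False
    return all(params[name] < limit for name, limit in UPPER_LIMITS.items())

# Hypothetical candidate values, chosen only to exercise the check.
candidate = {"kRE": 185.0, "kCDS": 0.5, "kkCE": 0.2, "KCE": 0.9}
passes = meets_reported_conditions(candidate)
```

A check of this kind could be used to discard parameter sets before running a full bifurcation analysis on the remaining candidates.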

These results indicate that even sparse and noisy experimental data can be used to improve a mathematical model by fitting to those data. In the case of Yao's model and our experiments, the parameter optimization allowed the model to adapt

to physiological (cancer cell) conditions, even though the experimental data did not include enough information to identify the whole parameter set, but instead suggested one relevant set of parameters to reduce the sensitivity against changes and to maintain bistability.

When we need to identify the whole parameter set, we should add more experimental data for other molecules, or perform more optimization with a different set of initial conditions. Partial evidence for the potential of changing initial conditions was shown in our several rounds of parameter optimization (**Figures 2**, **7**, **9**). We could produce better fitting to the experimental data by performing several rounds of parameter optimization; however, at the same time, the new parameter set changed the model behavior fundamentally (**Figures 5**, **8**), and the possible causes may involve changes in the dynamics of molecules that lack experimental evidence. This means that providing experimental data for the molecules for which no data are yet available for fitting would improve parameter optimization.

In this study, we did not perform a practical identifiability analysis to consider whether the model unknowns can be uniquely estimated under the given experimental conditions. The results of a practical identifiability analysis may be helpful to assess the reliability of parameter estimates and to compare possible experimental designs. Such analysis is especially important for improving experimental design. To perform this analysis, we need to be careful with noise. Fortunately, however, a lack of practical identifiability is not critical for the solvability of the estimation problem. Adequate global optimization solvers can be employed to deal with the presence of suboptimal solutions.
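A minimal sketch can illustrate what a lack of identifiability looks like in practice. The toy readout below is hypothetical and deliberately extreme: only the product of its two parameters affects the output, so the data cannot determine either parameter on its own. The profile fixes one parameter, optimizes the other on a coarse grid, and records the best objective value; the model, grid, and names are assumptions for illustration.

```python
import math

TIMES = [0.0, 0.5, 1.0, 2.0, 4.0]   # hypothetical sampling times

def simulate(a, b):
    """Toy readout in which only the product a * b affects the output."""
    return [a * b * (1.0 - math.exp(-t)) for t in TIMES]

def objective(a, b, data):
    """Sum of squared residuals between simulation and data."""
    return sum((y - d) ** 2 for y, d in zip(simulate(a, b), data))

data = simulate(2.0, 3.0)   # synthetic, noise-free "measurements"

def profile(a_values, data):
    """For each fixed a, find the best b on a grid and record the objective."""
    b_grid = [0.01 * i for i in range(1, 2001)]   # 0.01 ... 20.0
    return [(a, min(objective(a, b, data) for b in b_grid)) for a in a_values]

flat = profile([1.0, 2.0, 4.0], data)
```

The profile is flat: every fixed value of `a` reaches essentially the same minimum. With noisy data a similarly flat profile would signal that a parameter cannot be estimated reliably from the available measurements, even though an optimizer still returns a solution.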

In total, our results showed that optimizing parameters by using experimental data is useful to get the model closer to physiological conditions, even if experiments have not yet fully shown the effect on the targeting system. At the same time, we need enough resolution from experiments to provide good identifiability for the model parameters.

In the future, we will perform Optimal Experimental Design (OED) to determine a dynamic scheme of the measurements that generates the richest information in order to estimate parameters with greater precision. Providing measurements that maximize the quantity and quality of the information obtained from the experiments while minimizing the experimental burden is the desired goal in connecting practical experimental information with mathematical models of molecular mechanisms.

### **ACKNOWLEDGMENTS**

We are grateful to Prof. Hiroaki Kitano (The Systems Biology Institute, Tokyo, Japan) for allowing us to use the experimental data that we produced while working under his supervision.

### **SUPPLEMENTARY MATERIAL**

The Supplementary Material for this article can be found online at: http://www.frontiersin.org/journal/10.3389/fphys.2014.00128/abstract

### **REFERENCES**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 15 September 2013; accepted: 14 March 2014; published online: 04 April 2014.*

*Citation: Hiroi N, Swat M and Funahashi A (2014) Assessing uncertainty in model parameters based on sparse and noisy experimental data. Front. Physiol. 5:128. doi: 10.3389/fphys.2014.00128*

*This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology.*

*Copyright © 2014 Hiroi, Swat and Funahashi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

### **APPENDIX**

### **ODE EQUATIONS**

The following is the full ODE system for the Yao 2008 model. S stands for the system's forcing function, in the form of the serum concentration, which is held constant for the whole duration of the experiment/simulation at a value of either 0.5 or 3%.


# Phosphoproteomics-based systems analysis of signal transduction networks

### *Hiroko Kozuka-Hata, Shinya Tasaki and Masaaki Oyama\**

*Medical Proteomics Laboratory, Institute of Medical Science, University of Tokyo, Minato-ku, Tokyo, Japan*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Bhawana Agarwal, Medical College of Wisconsin, USA Jeffrey Varner, Cornell University, USA*

*\*Correspondence:*

*Masaaki Oyama, Medical Proteomics Laboratory, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan. e-mail: moyama@ims.u-tokyo.ac.jp*

Signal transduction systems coordinate complex cellular information to regulate biological events such as cell proliferation and differentiation. Although the accumulating evidence on widespread association of signaling molecules has revealed the essential contribution of phosphorylation-dependent interaction networks to cellular regulation, their dynamic behavior is mostly yet to be analyzed. Recent technological advances in mass spectrometry-based quantitative proteomics have enabled us to describe the comprehensive status of phosphorylated molecules in a time-resolved manner. Computational analyses based on the phosphoproteome dynamics accelerate generation of novel methodologies for mathematical analysis of cellular signaling. Phosphoproteomics-based numerical modeling can be used to evaluate regulatory network elements from a statistical point of view. Integration with transcriptome dynamics also uncovers regulatory hubs at the transcriptional level. These omics-based computational methodologies, which were first applied to representative signaling systems such as the epidermal growth factor receptor pathway, have now opened the way for systems analysis of signaling networks involved in immune response and cancer.

**Keywords: signal transduction, phosphoproteomics, quantitative proteomics, computational modeling, systems biology**

### **INTRODUCTION**

Signal transduction networks are known to regulate complex biological events in orchestration with subsequent transcriptional regulation (Hunter, 2000; Schlessinger, 2000). Previous in-depth analyses on cell signaling under a variety of experimental conditions have revealed many of the key molecules and related events that result in each biological effect. Regarding the intensively studied signaling systems such as the epidermal growth factor (EGF) receptor pathway, the accumulated experimental evidence has clearly demonstrated the complexity of the interaction network involved in the signaling (Oda et al., 2005; Jones et al., 2006). As phosphorylation-dependent protein interaction networks play a major role in transmitting signals, a comprehensive and fine description of their status would contribute substantially toward understanding the regulatory mechanisms at the system level. Recent proteomics technology based on high-resolution mass spectrometry (MS) has enabled us to quantitatively describe the activation dynamics on phosphorylated signaling molecules in a comprehensive and unbiased manner (Blagoev et al., 2004; Zhang et al., 2005; Olsen et al., 2006; Oyama et al., 2009). Computational systems analysis based on the phosphoproteome dynamics data paves the way to theoretical approaches for defining regulatory principles that govern complicated signaling processes. Some statistical methodologies including mathematical modeling (Tasaki et al., 2006, 2010), Bayesian network (Bose et al., 2006; Guha et al., 2008), or partial least square regression (Wolf-Yadlin et al., 2006; Kumar et al., 2007) have already been applied to EGF signaling. An integrated approach based on both phosphoproteomic and transcriptomic data has also revealed a global view of cellular

regulation at the transcriptional level (Oyama et al., 2011). In this article, we introduce the recent progress of proteomics-driven computational analyses applied to the signaling behavior of representative biological pathways and the potential impact on the system-level analyses of heterogeneous signaling networks related to immune response and cancer.

### **EMERGENCE OF HIGH-THROUGHPUT PHOSPHOPROTEOMICS TECHNOLOGY FOR LARGE-SCALE IDENTIFICATION AND QUANTIFICATION OF CELLULAR PHOSPHORYLATED MOLECULES**

Recent advances in liquid chromatography–tandem mass spectrometry (LC–MS/MS) measurement technology have greatly improved the throughput and sensitivity of protein measurements. We can now identify thousands of proteins in a single study (Brunner et al., 2007; de Godoy et al., 2008). In order to efficiently describe the status of phosphorylated molecules, a variety of biochemical methodologies have been developed for their enrichment. Immobilized metal affinity chromatography (IMAC; Stensballe et al., 2001; Ficarro et al., 2002), strong cation exchange (SCX) chromatography (Ballif et al., 2004; Beausoleil et al., 2004), and metal oxide chromatography (MOC; Pinkse et al., 2004; Larsen et al., 2005) were intensively evaluated as core analytical methodologies in previous reports. For targeting tyrosine phosphorylation, anti-phosphotyrosine antibodies were applied to efficiently purify the corresponding molecules (Rush et al., 2005). Through these sophisticated enrichment methods, current shotgun proteomics technology based on high-resolution LC–MS/MS has enabled the detection of thousands of phosphorylated molecules

from representative cell lines such as human HeLa cells (**Figure 1**; Olsen et al., 2006, 2010).

Another important advance in MS-based systems analysis is development of protein/peptide labeling strategies for quantitative proteomics (**Figure 1**). Several methodologies for *in vivo*/*in vitro* labeling have been established for relative quantification of the activation status of signaling molecules. The representative *in vivo* protein labeling methodology termed stable isotope labeling by amino acids in cell culture (SILAC) can be conducted by incorporating distinguishable stable isotopes into specific amino acid residues such as lysine and arginine during cell culture (Ong et al., 2002, 2003). Another approach to introduce differential labels *in vitro* is chemical tagging of specific amino acid residues such as cysteine. The isotope-coded affinity tag (ICAT), which consists of a cysteine-directed reactive group, a linker with stable isotope signatures, and a biotin tag, is applied to purify labeled peptides by biotin–avidin affinity (Gygi et al., 1999; Han et al., 2001). As for amine-directed tagging, the isobaric tag for relative and absolute quantitation (iTRAQ) enables comparative quantification of four or eight samples in a single analysis (Ross et al., 2004).
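Relative quantification with stable isotope labels reduces, at its simplest, to comparing the intensities of differently labeled forms of the same peptide. The sketch below assumes a three-channel design with channels named `light`, `medium`, and `heavy`; the channel names, the intensities, and the function name are hypothetical and are not taken from any of the cited studies.

```python
import math

def log2_ratios(intensities, reference="light"):
    """Return log2(channel / reference) for each labeled channel of one peptide."""
    ref = intensities[reference]
    return {label: math.log2(value / ref)
            for label, value in intensities.items() if label != reference}

# Hypothetical MS intensities for one phosphopeptide in three labeled channels.
peptide = {"light": 1.0e6, "medium": 2.0e6, "heavy": 8.0e6}
ratios = log2_ratios(peptide)
```

A log2 ratio of 0 means no change relative to the reference condition, and each unit corresponds to a two-fold change. Real pipelines typically add normalization and aggregate over several peptides per site, which this sketch omits.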

By combining these technologies, time-resolved activation profiles of ligand-induced phosphoproteome were depicted in a quantitative manner (**Figure 2**). The original approach to describe phosphotyrosine-dependent signaling dynamics led to the identification of 81 effectors in human HeLa cells upon EGF stimulation (Blagoev et al., 2004). The global phosphoserine/threonine/tyrosine-related proteome analysis for the EGF signaling system in the same cell line yielded a network-wide view of the dynamic behavior of 6,600 phosphorylation sites on 2,244 proteins (Olsen et al., 2006).

In a recent study, a highly time-resolved description of EGF/EGFR signaling was measured in human epithelial A431 cells (Oyama et al., 2009). The quantitative activation data on the EGF-regulated tyrosine-phosphoproteome were measured at 10 time points after EGF stimulation (0, 0.5, 1, 2, 5, 10, 15, 20, 25, and 30 min), generating a detailed view of their multi-phase network dynamics. In this study, temporal perturbation of the signaling dynamics was also conducted with a kinase inhibitor to clearly distinguish between sensitive and robust pathways to this treatment. This approach showed that phosphoproteomics-based time-resolved description of the network dynamics functioned as an analytical basis for evaluating temporal perturbation effects in relation to specific signaling interactions, leading us to obtain a system-level view of the regulatory relationships in signaling dynamics.
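One simple way to compare a perturbed time course with its control on an uneven sampling grid is to compare areas under the curves. The sketch below uses the ten sampling times listed in the text; the area-based criterion, the 0.5 threshold, and all phosphorylation values are assumptions for illustration and do not reproduce the analysis of Oyama et al. (2009).

```python
TIME_MIN = [0, 0.5, 1, 2, 5, 10, 15, 20, 25, 30]   # sampling times (min)

def auc(values, times=TIME_MIN):
    """Trapezoidal area under a time course sampled at uneven intervals."""
    return sum(0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
               for i in range(len(times) - 1))

def classify(control, treated, threshold=0.5):
    """Label a site 'sensitive' if the inhibitor removes more than
    `threshold` of the control area, otherwise 'robust'."""
    loss = 1.0 - auc(treated) / auc(control)
    return "sensitive" if loss > threshold else "robust"

# Hypothetical relative phosphorylation levels (arbitrary units).
control = [0.0, 2.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.5, 1.0, 1.0]
blocked = [0.0, 0.5, 1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
mild = [0.0, 2.0, 4.0, 4.5, 3.5, 3.0, 2.0, 1.5, 1.0, 1.0]

labels = {"blocked": classify(control, blocked), "mild": classify(control, mild)}
```

Because the early time points are densely sampled and the late ones are sparse, weighting each interval by its length keeps the late phase from being underrepresented.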

### **COMPUTATIONAL MODELING OF SIGNAL TRANSDUCTION NETWORKS BASED ON QUANTITATIVE PHOSPHOPROTEOME DATA**

Although phosphoproteomics-based temporal description of signaling networks provides system-level information on dynamic regulation of signal transduction via phosphorylation/dephosphorylation, the most important challenge for elucidating the mechanistic aspects of signal transduction is the

establishment of statistical methodologies for performing computational modeling with increasing species, states, and reactions over the signaling network. Recently, several computational frameworks have been developed for analyzing flux-based signaling information on quantitative phosphoproteomics data (**Figure 3**). In the initial approach, self-organizing maps were applied to identify EGF signaling modules based on time-resolved description of 78 tyrosine phosphorylation sites on 58 proteins in human mammary epithelial 184A1 cells (Zhang et al., 2005). The cells with varying human ErbB2 (HER2) expression levels were further analyzed to characterize HER2-mediated signaling effects on cell behavior (Wolf-Yadlin et al., 2006). Partial least squares regression (PLSR) was applied to estimate the phosphotyrosine clusters exhibiting self-similar temporal activation profiles, leading to identification of the signals that were strongly correlated with cell migration and proliferation and could function as a "network gage" of cell fate control (Wolf-Yadlin et al., 2006; Kumar et al., 2007).
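The notion of phosphorylation sites with self-similar temporal profiles can be sketched with a much simpler tool than those cited. The example below groups time courses by Pearson correlation; it is neither a self-organizing map nor PLSR, and the site names, values, and the 0.9 cutoff are placeholders chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally sampled time courses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def group_similar(profiles, cutoff=0.9):
    """Greedy grouping: a site joins the first group whose seed profile it matches."""
    groups = []
    for name, values in profiles.items():
        for group in groups:
            if pearson(values, profiles[group[0]]) >= cutoff:
                group.append(name)
                break
        else:
            groups.append([name])   # no match found: start a new group
    return groups

# Hypothetical phosphorylation time courses (site names are placeholders).
profiles = {
    "site_A": [0.0, 1.0, 3.0, 2.0, 1.0],
    "site_B": [0.1, 2.1, 6.0, 4.2, 2.0],   # same shape as site_A, larger amplitude
    "site_C": [0.0, 0.5, 1.0, 2.0, 3.0],   # sustained increase
}
groups = group_similar(profiles)
```

Correlation ignores amplitude, so a transient response and its scaled copy fall into the same group while a sustained response is kept apart. The greedy assignment depends on the order of the sites; a proper clustering method would avoid that.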

Bayesian network modeling based on multiple sets of quantitative phosphoproteome data could generate probabilistic networks that represented core aspects of the models with a directed graph of influence on protein phosphorylation. In combination with the literature-based protein–protein interaction data on the EGFR/ErbB signaling, this statistical approach not only recapitulated known portions of the signaling pathways but also inferred novel relationships between the related molecules (Bose et al., 2006; Guha et al., 2008). In a recent study, a computational framework based on data assimilation was also developed for analyzing mutated EGFR signaling through phosphoproteomics-driven numerical modeling (Tasaki et al., 2010). The hybrid functional Petri net with extension (HFPNe) is a computational modeling architecture which can deal with discrete biological events as well as continuous ones and enables us to analyze temporal data on biological entities such as phosphorylated signaling molecules within the data assimilation framework. The HFPNe-based computational modeling of aberrant EGFR signaling led to reduction of the factors responsible for mutational effect to several alterations in the reaction parameters and provided a mechanistic description of the disorders of their cell signaling networks at the system level.

Phosphoproteome dynamics data can be integrated with the transcriptome dynamics to analyze the regulatory mechanisms more systematically. In a very recent study, time-resolved phosphoproteome and transcriptome data on 17β-estradiol (E2) and heregulin (HRG)-induced signal-transcription programs were quantitatively analyzed to elucidate regulatory pathways in breast cancer signaling (Oyama et al., 2011). Reconstruction of protein interaction networks based on the phosphoproteome data shed light on the activated signaling molecules over the network, while statistical evaluation of transcription factor-binding site motif significance for the entire gene expression data led us to focus on the core transcriptional regulators. Functional association of these factors using pathway databases revealed ligand-dependent

signal-transcription regulatory programs in both wild-type and drug-resistant breast cancer MCF-7 cells, leading us to extract the pathways activated in drug-resistant cells.

### **FUTURE PROSPECTS**

Recent advances in proteomics technology have presented us with a system-wide view of phosphorylation-dependent signaling network dynamics in a quantitative manner. Mathematical analysis of phosphoproteomics-based networks will lead to a better understanding of the critical factors controlling network behavior and provide a computational platform to explore potential drug targets for specific disease conditions and theoretically estimate the effect of the corresponding drugs on a network-wide scale prior to clinical application. As signaling network structures depend on cellular context (Morandell et al., 2008), cell-specific signaling network architectures need to be described independently using phosphoproteomics to characterize the behavior of each signaling system.

Although this emerging technology has been applied to only a limited fraction of signaling networks including the EGFR pathway, further accumulation and integration of phosphoproteome data on heterogeneous immune and cancer signaling networks should accelerate elucidation of general and condition-specific principles that govern signaling network behavior and pave the way to understanding complex cellular responses from a systems perspective.

#### **ACKNOWLEDGMENTS**

We gratefully acknowledge our colleagues at the Institute of Medical Science, University of Tokyo for helpful discussions and comments. This work was supported by Grants-in-Aid for Scientific Research (C) and Scientific Research on Innovative Areas from Japan Society for the Promotion of Science (JSPS) and The Ministry of Education, Culture, Sports, Science and Technology (MEXT).

### **REFERENCES**

Ballif, B. A., Villén, J., Beausoleil, S. A., Schwartz, D., and Gygi, S. P. (2004). Phosphoproteomic analysis of the developing mouse brain. *Mol. Cell. Proteomics* 3, 1093–1101.

Beausoleil, S. A., Jedrychowski, M., Schwartz, D., Elias, J. E., Villén, J., Li, J., Cohn, M. A., Cantley, L. C., and Gygi, S. P. (2004). Large-scale characterization of HeLa cell nuclear phosphoproteins. *Proc. Natl. Acad. Sci. U.S.A.* 101, 12130–12135.

Blagoev, B., Ong, S. E., Kratchmarova, I., and Mann, M. (2004). Temporal analysis of phosphotyrosine-dependent signaling networks by quantitative proteomics. *Nat. Biotechnol.* 22, 1139–1145.


spectrometry. *Nat. Biotechnol.* 19, 946–951.


13C-substituted arginine in stable isotope labeling by amino acids in cell culture (SILAC). *J. Proteome Res.* 2, 173–181.


Inoue, J., Yamamoto, T., Miyano, S., Sugano, S., and Oyama, M. (2010). Phosphoproteomics-based modeling defines the regulatory mechanism underlying aberrant EGFR signaling. *PLoS ONE* 5, e13926. doi:10.1371/journal.pone.0013926


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 20 August 2011; accepted: 13 December 2011; published online: 03 January 2012.*

*Citation: Kozuka-Hata H, Tasaki S and Oyama M (2012) Phosphoproteomics-based systems analysis of signal transduction networks. Front. Physio. 2:113. doi: 10.3389/fphys.2011.00113*

*This article was submitted to Frontiers in Systems Physiology, a specialty of Frontiers in Physiology.*

*Copyright © 2012 Kozuka-Hata, Tasaki and Oyama. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits noncommercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*

# Why do CD8+ T cells become indifferent to tumors: a dynamic modeling approach

### *Colin Campbell<sup>1†</sup>, Ranran Zhang<sup>2†</sup>, Jeremy S. Haley<sup>3</sup>, Xin Liu<sup>4</sup>, Thomas Loughran<sup>4</sup>, Todd D. Schell<sup>3</sup>, Réka Albert<sup>1</sup> and Juilee Thakar<sup>1</sup>\**

*<sup>1</sup> Department of Physics, The Pennsylvania State University, University Park, PA, USA*

*<sup>2</sup> Duke-NUS Graduate Medical School Singapore, Singapore*

*<sup>3</sup> Department of Microbiology and Immunology, The Pennsylvania State University College of Medicine, Hershey, PA, USA*

*<sup>4</sup> Penn State Hershey Cancer Institute, The Pennsylvania State University, College of Medicine, Hershey, PA, USA*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Gabor Balazsi, The University of Texas MD Anderson Cancer Center, USA Thomas Dandekar, University of Wuerzburg, Germany*

#### *\*Correspondence:*

*Juilee Thakar, Department of Pathology, Yale University School of Medicine, 300 George Street, Suite 505, New Haven, CT 06511, USA. e-mail: juilee.thakar@yale.edu, jthakar@phys.psu.edu*

†*Colin Campbell and Ranran Zhang are joint first authors.*

CD8+ T cells have the potential to influence the outcome of cancer pathogenesis, including complete tumor eradication or selection of malignant tumor escape variants. The Simian virus 40 large T-antigen (Tag) oncoprotein promotes tumor formation in Tag-transgenic mice and also provides multiple target determinants (sites) for responding CD8+ T cells in C57BL/6 (*H-2b*) mice. To understand the *in vivo* quantitative dynamics of CD8+ T cells after encountering Tag, we constructed a dynamic model from *in vivo*-generated data to simulate the interactions between Tag-expressing cells and CD8+ T cells in distinct scenarios including immunization of wild-type C57BL/6 mice and of Tag-transgenic mice that develop various tumors. In these scenarios the model successfully reproduces the dynamics of both the Tag-expressing cells and antigen-specific CD8+ T cell responses. The model predicts that the tolerance of the site-specific T cells is dependent on their apoptosis rates and that the net growth of CD8+ T cells is altered in transgenic mice. We experimentally validate both predictions. Our results indicate that site-specific CD8+ T cells have tissue-specific apoptosis rates affecting their tolerance to the tumor antigen. Moreover, the model highlights differences in apoptosis rates that contribute to compromised CD8+ T cell responses and tumor progression, knowledge of which is essential for development of cancer immunotherapy.

**Keywords: CD8+ T cells, T-antigen, tumor, dynamic model, apoptosis and proliferation rates**

### **INTRODUCTION**

Tumors are masses of host cells containing both genetically unstable cancer cells and supporting host cells, including cells of the immune system. Tumor progression causes destructive pathogenesis within the host and ultimately death. Tumor antigens, particularly those that are unique to the tumor, can elicit an adaptive immune response (Qin and Blankenstein, 2000; Patel and Chiplunkar, 2009; Xu et al., 2009; Behboudi et al., 2010). In particular, CD8+ T cells (TCD8s) can eliminate continuously arising nascent transformed cells, inhibit carcinogenesis, and maintain cellular homeostasis under normal conditions; this process is known as immunosurveillance (Dunn et al., 2002; Schreiber et al., 2004). Tumors, nevertheless, can escape immunosurveillance through both antigenic loss and the promotion of immunosuppression, which leads to their progression. The immune response to cancer is often studied either by tumor implantation or by inducing autochthonous tumor formation in specific tissues in mice. Tumor development can be induced in transgenic mice by expressing oncoproteins under tissue-specific promoters. Transgenic mice which develop autochthonous tumors are especially interesting since the tumor antigens are often self antigens derived from non-mutated cellular proteins and tumor formation occurs over an extended period of time, reproducing some of the immunological roadblocks which limit effective immunotherapy.

Dynamic models of tumor–immune interactions have provided insights into the processes leading to immune response failure during tumor progression. Such models have been applied to study the effect of immunotherapeutic approaches (Day et al., 2006; Castiglione and Piccoli, 2007; Kirschner and Tsygvintsev, 2009) and to characterize the various stages of the immune response to infection and cancer, with particular focus on TCD8s (De Boer et al., 2003; Bocharov et al., 2004; Antia et al., 2005; de Pillis et al., 2005). These models highlight the importance of a variety of features of the immune response to tumors, including the density of tumor antigen, the duration of the interaction between MHCI-peptide complexes and the T cell receptor (TCR), TCD8 activation rates, immunological memory, and recruitment of precursor cells. Our study provides a unique perspective relative to previous studies, in that we focus on a comparative analysis of T cell responses in tumor-bearing versus wild-type (WT) mice.

The Simian virus 40 (SV40) large T-antigen (Tag) is a potent virus-encoded oncoprotein that can transform a variety of cell types (Ahuja et al., 2005). The oncogenic activity of Tag stems from its ability to inactivate tumor suppressor proteins (Rb and p53) as well as to initiate cell cycle progression (Butel and Lednicky, 1999). Tag can induce responses by MHC-I-restricted TCD8s, as is observed for other tumor antigens. Four unique Tag determinants recognized by TCD8s have been defined in C57BL/6 mice (sites I, II/III, IV, and V; Mylin et al., 2000; Tevethia and Schell, 2001). The TCD8 response to these four determinants forms a quantitative hierarchy in which site IV, for which the most T cells accumulate, is immunodominant, followed by subdominant responses to sites I and II/III. The response to site V, however, is immunorecessive, as responding TCD8s are only detected following immunization with Tag variants lacking the three dominant determinants, or after site V-specific immunization (Tanaka et al., 1989; Fu et al., 1998). The expression of Tag as a self-antigen within Tag-transgenic mice can lead to TCD8 unresponsiveness by mechanisms promoting both central and peripheral tolerance (Tevethia and Schell, 2001).

We construct a dynamic model describing tumor progression, elimination of tumor cells, TCD8 expansion, and decay in the context of the TCD8 response to SV40 Tag. We developed this model to describe the TCD8 response in both WT C57BL/6 mice responding to immunization, where Tag-expressing cells are eliminated, and in mice that express Tag as a transgene, leading to the development of autochthonous tumors and TCD8 tolerance. For the purposes of this model we define tolerance as the absence of a functional T cell response in the presence of tumor antigen. In transgenic mice, site-specific TCD8s become tolerant at different time points (Theobald et al., 1997; Morgan et al., 1998; Colella et al., 2000; Nugent et al., 2000; Cordaro et al., 2002; Otahal et al., 2006; Fujimura et al., 2010) and this characteristic behavior is not seen when the WT mice encounter the Tag as a foreign antigen. Thus, the model parameterized to reproduce this observation gives insights into the characteristics of TCD8s that are changed during tumor development.

Our model quantitatively reproduces the TCD8 response to immunization with Tag-transformed cells in WT mice (Pretell et al., 1979; Mylin et al., 2000) and the qualitative behavior of the TCD8 response in mice bearing Tag-induced pancreatic tumors (Otahal et al., 2006), osteosarcoma (Schell et al., 2000), or brain tumors (Schell et al., 1999). The model reveals strong constraints on the proliferation rates and decay rates of TCD8s during tumor formation. Additionally, the model gives insight into how the activation, proliferation, and apoptosis rates of TCD8s impact the expansion and contraction phase during the site-specific immune response in normal mice, as well as during tissue-specific responses and development of tolerance. Our results indicate that though the inherent characteristics of the site-specific T cell clones are different, the overall TCD8 response dynamics are surprisingly similar when encountering antigen in different tissues. We predict and experimentally validate inequalities in the activation and decay rates of the TCD8 responding to unique determinants. We also theoretically predict the rate constants leading to tumor formation, the apoptosis rates of different TCD8 clones, the peaks of TCD8 activity in various tumor models and the mechanism of tolerance.

### **MATERIALS AND METHODS**

### **DYNAMIC MODEL**

Our model describes the growth of tumor by modeling the Tag-expressing tumor/malignant cells (*M*) and their removal by site I-, II/III-, IV-, and V-specific TCD8s (*Ti*; **Figure 1**). The basic model has five ordinary differential equations and assumes that the cells form well-mixed populations. Since the dynamics after tumor clearance are not considered in the current study, the memory T cell state is not modeled. The site-specific TCD8s are initially activated against the tumor cells and are subsequently suppressed by the increasing tumor size in addition to their apoptosis. Tag-expressing cells proliferate exponentially at rate *r*, which was estimated from the initial growth phase of the tumors (see e.g., Mallet and De Pillis, 2006). The four site-specific TCD8s are assumed to kill tumor cells (*M*) at the same rate *b*; however, *b* is modulated by a Michaelis–Menten function such that the rate of killing of tumor cells stabilizes when the tumor size increases above β (**Figure A1** in Appendix).

Tag-expressing cells present four determinants I, II/III, IV, and V that are recognized by MHCI molecules. MHCI-peptide complexes are then recognized by TCR, which leads to the differentiation of naive cells and subsequently the recruitment of TCD8s to the tumor site. In the model, Tag-expressing cells induce a proportion *ni* of competent site-specific TCD8s at a rate proportional to *c*. Based on our recent experimental results (T. Schell, unpublished) a 0.3/1/4 ratio for site V/I/IV TCD8 activation from naïve T cells was used to determine *ni*. Since the experimental observations could be reproduced by using the same value of the proliferation rate *c*, it was kept the same for all site-specific TCD8s. Thus the product *ni*·*c* represents the activation rate of TCD8s from precursor cells and *c* represents the proliferation rate of TCD8s. TCD8s undergo natural death at rate *wi*, which is different for the four determinants. The immunogenicity of the determinants is decided by the site-specific activation and decay rates. Tumor cells alter the microenvironment, leading to the suppression of the TCD8 response (Ganss and Hanahan, 1998; Gajewski et al., 2006). This suppression is modeled as a negative modulation of the proliferation and activation rates in response to a higher number of tumor cells. In the case of proliferation, the effect is modeled with a Michaelis–Menten function which sets proliferation to 0 for *M* = 0 and saturates at *c* for *M* >> σ. The negative modulation of activation rates is modeled in the opposite way, using a "repressive" Michaelis–Menten function: for *M* << α*i*, activation occurs at a constant rate *ni*; for *M* >> α*i*, it asymptotically approaches 0. In the absence of spatial compartments, inhibition of the activation of new TCD8s represents the suppression of the recruitment of the TCD8s to the tumor site, which can occur through diffusible cytokines and chemokines produced by the tumor cells.

In Tag-transgenic mice both central (Faas et al., 1987; Schell et al., 1999; Colella et al., 2000; Zheng et al., 2002) and peripheral tolerance (Ye et al., 1994; Schell et al., 2000; Cordaro et al., 2002; Otahal et al., 2006) are observed in response to tumors. Tolerance is modeled by the absence of functional TCD8s even in the presence of the Tag-expressing cells. In central tolerance, self-reactive T cells are deleted during development in the thymus and there is no recruitment to the tumor site, which can be modeled by setting the value of *ni* to 0. During peripheral tolerance, self-reactive T cells that escape to the periphery are maintained in a state of unresponsiveness or are deleted following activation. The gradual process of peripheral tolerance is simulated by the tumor-induced suppression of TCD8s and by retraining the parameter values (explained in Section "Results").

Thus the dynamics of tumor cells (*M*) and TCD8s specific for site *i* (*Ti*) are given by

$$\begin{aligned} \frac{dM}{dt} &= rM - \frac{M}{M+\beta} \sum_{i} bT_i\\ \frac{dT_i}{dt} &= c \frac{M}{M+\sigma} \left( n_i \left( 1 - \frac{M}{M+\alpha_i} \right) + T_i \right) - w_i T_i, \quad i \in \{ \text{I, II/III, IV, V} \} \end{aligned}$$

Note that though a constant source of site-specific precursor cells (constant *ni*) was used in the model, their effective number is not unlimited. In the WT model precursor cells stop differentiating after the clearance of Tag-expressing cells, and in the tumor models activation of new TCD8s is inhibited by large tumors (so that ≈15 TCD8s are activated for 1000 tumor cells) and has a minimal effect after the first few days.
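The model equations above can be explored numerically. The following is a minimal sketch, not the implementation used in this study: the function names `rhs` and `simulate`, the dictionary layout of the parameters, the forward Euler scheme, and all numerical values are assumptions chosen for illustration. It requires β > 0 so that the killing term is defined at *M* = 0.

```python
def rhs(M, T, p):
    """Return (dM/dt, {site: dT_i/dt}) for tumor size M and TCD8 counts T."""
    # Killing term: saturates once M is large compared with beta.
    kill = M / (M + p["beta"]) * p["b"] * sum(T.values())
    dM = p["r"] * M - kill
    # Tumor-dependent drive of activation and proliferation (saturates at c).
    drive = p["c"] * M / (M + p["sigma"])
    dT = {}
    for site, Ti in T.items():
        # Activation of new TCD8s, repressed by large tumors via alpha_i.
        activation = p["n"][site] * (1.0 - M / (M + p["alpha"][site]))
        dT[site] = drive * (activation + Ti) - p["w"][site] * Ti
    return dM, dT


def simulate(M0, T0, p, days, dt=0.001):
    """Integrate with forward Euler (an assumed scheme); return the final (M, T)."""
    M, T = float(M0), dict(T0)
    for _ in range(int(round(days / dt))):
        dM, dT = rhs(M, T, p)
        M = max(M + dt * dM, 0.0)  # clamp: cell numbers cannot be negative
        T = {s: max(T[s] + dt * dT[s], 0.0) for s in T}
    return M, T


# Illustrative values only (assumptions, not the fitted parameters of the paper).
params = {
    "r": 0.1, "b": 0.5, "beta": 100.0, "c": 1.0, "sigma": 100.0,
    "n": {"I": 1.0, "IV": 4.0},
    "alpha": {"I": 1e100, "IV": 1e100},
    "w": {"I": 0.04, "IV": 0.02},
}
M_end, T_end = simulate(M0=1000.0, T0={"I": 0.0, "IV": 0.0}, p=params, days=30)
print(M_end, T_end)
```

Setting an entry of `n` to 0 would correspond to the central tolerance case described above, in which no site-specific TCD8s are recruited.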

### **SIMULATED IMMUNIZATIONS**

Immunization of transgenic mice was modeled, similarly to Kirschner and Panetta (1998), by introducing a variable *si* that follows the clearance dynamics of Tag-transformed cells injected in WT mice by obeying the relation *dsi*/*dt* = −γ*si*. In the presence of immunization, the *Ti* equation becomes

$$\begin{aligned} \frac{dT_i}{dt} &= \left[ (c + s_i) \frac{M}{M + \sigma} \right] \left[ n_i \left( 1 - \frac{M}{M + \alpha_i} \right) + T_i \right] - w_i T_i, \quad i \in \{ \text{I, II/III, IV, V} \} \end{aligned}$$
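Because the immunization variable obeys a linear decay law, it has a closed form. In the expression below, the initial value at the time of immunization, written here as $s_i(0)$, is a symbol introduced for illustration:

$$s_i(t) = s_i(0)\, e^{-\gamma t}$$

Under this reading, the effective rate $c + s_i(t)$ is transiently elevated after immunization and relaxes back to $c$ on a time scale of $1/\gamma$.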

### **NULLCLINE ANALYSIS**

The nullcline analysis (**Figures A2B–E** in Appendix) is performed to study the effect of parameter values on the temporal trajectories of the tumor cells and site-specific TCD8s (**Figures 2B–E**). The nullcline analysis finds the equilibrium of the system when *M* and *Ti* are in a steady state (*dM/dt* = *dT/dt* = 0). Considering a single type of site-specific TCD8 for simplicity the equations become:

$$\begin{aligned} \frac{dM}{dt} &= 0 = rM - \frac{M}{M+\beta}bT\\ \frac{dT}{dt} &= 0 = c\frac{M}{M+\sigma}\left(n - n\frac{M}{M+\alpha} + T\right) - wT \end{aligned}$$

Solving these equations in terms of *M* and *T* yields

$$\begin{aligned} M &= \frac{b}{r}T - \beta\\ T &= \frac{c n \alpha M}{\left(M + \alpha\right)\left(M(w - c) + \sigma w\right)} \end{aligned}$$

For the biologically accepted range of β (>0) in our model, *M* = *T* = 0 is an unstable steady state. That is, for small numbers of tumor cells (*M*) and TCD8s (*T*), the number of tumor cells will always increase in the tumor model. Additional biologically significant steady states are discussed in Section "Results."
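The two nullcline expressions can be checked numerically by substituting points on each curve back into the single-site equations. The sketch below is illustrative only: the function names and all parameter values are assumptions, not values fitted in this study.

```python
def m_nullcline(T, b, r, beta):
    """Tumor size M on the dM/dt = 0 nullcline (M != 0 branch) for a given T."""
    return b / r * T - beta


def t_nullcline(M, c, n, alpha, sigma, w):
    """TCD8 count T on the dT/dt = 0 nullcline for a given M."""
    return c * n * alpha * M / ((M + alpha) * (M * (w - c) + sigma * w))


def derivatives(M, T, r, b, beta, c, n, alpha, sigma, w):
    """Single-site model right-hand side, used to check the nullclines."""
    dM = r * M - M / (M + beta) * b * T
    dT = c * M / (M + sigma) * (n - n * M / (M + alpha) + T) - w * T
    return dM, dT


# Illustrative parameter values (assumptions, not fitted values).
par = dict(r=0.1, b=0.5, beta=100.0, c=0.5, n=2.0,
           alpha=100.0, sigma=50.0, w=1.0)

T_on_curve = t_nullcline(100.0, par["c"], par["n"], par["alpha"],
                         par["sigma"], par["w"])
M_on_curve = m_nullcline(30.0, par["b"], par["r"], par["beta"])

# Both residuals should vanish up to floating-point error.
print(T_on_curve, derivatives(100.0, T_on_curve, **par)[1])
print(M_on_curve, derivatives(M_on_curve, 30.0, **par)[0])
```

One property that follows directly from the *T* nullcline: when *c* > *w*, its denominator vanishes at *M* = σ*w*/(*c* − *w*), so the curve has a vertical asymptote there, whereas for *w* > *c* it remains finite for all *M* ≥ 0.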

### **SUMMARY OF PUBLISHED EXPERIMENTAL DATA**

We used published experimental data corresponding to TCD8s derived from the spleen of WT C57BL/6 mice and three Tag-transgenic mouse strains that develop distinct tumors. Data corresponding to splenic TCD8 responses were used since they are available for all models and are representative of the systemic response to the antigen. In the case of brain tumors, T cell accumulation at the tumor site is correlated with spleen dynamics with a time lag (Ryan and Schell, 2006).

Tag-specific TCD8s are measured by MHC tetramer staining after they encounter antigen; this measurement correlates with the number of TCD8s that produce IFNγ upon peptide-specific *in vitro* stimulation (Mylin et al., 2000). The published data and our own were obtained from two independent sets of experiments that were quantitatively different. *In vivo* model systems are often difficult to standardize. Hence, to account for lab-specific differences and to align the data, we scaled the data from Mylin et al. (2000) by a multiplicative factor that maximized the agreement.

The number of endogenous TCD8s in Tag-transgenic mice can be very low and difficult to detect. Hence induction of detectable TCD8 responses can be stimulated by immunization of mice with Tag-transformed cells expressing full-length WT Tag or Tag variants in which specific determinants have been eliminated by mutagenesis. If immunization is not sufficient to induce the T cell response, due to deletion of T cell precursors during T cell development, then splenocytes from WT mice, which contain naïve TCD8s as well as other immune cell subsets, are injected (adoptively transferred) into the Tag-transgenic mice. In such experiments, saturating amounts of splenocytes were given to achieve maximal T cell response.

### **CHEMICALS AND REAGENTS**

All chemical reagents were purchased from Sigma-Aldrich (St Louis, MO, USA). RPMI-1640 with Glutamax and fetal bovine serum (FBS) were purchased from Invitrogen (Carlsbad, CA, USA). Benzonase® Nuclease was purchased from EMD Chemicals (San Diego, CA, USA). Annexin V Apoptosis Detection Kit, rat anti-mouse CD16/CD32 Fc block, Cytofix/Cytoperm, PermWash, fluorescein isothiocyanate (FITC), phycoerythrin (PE), allophycocyanin (APC) or APC-Cy7-labeled anti-mouse CD8α, PE-labeled anti-CD90.1, and FITC-labeled anti-5-bromo-2-deoxyuridine (BrdU) antibody were purchased from BD Biosciences (San Jose, CA, USA).

### **ANIMALS**

Male and female C57BL/6 (*H-2b*) mice (4–6 weeks old) were purchased from The Jackson Laboratory (Bar Harbor, Maine) and routinely used between the ages of 7 and 12 weeks. SV11 (*H-2b*) mice express full-length SV40 Tag under the control of the SV40 promoter (Brinster et al., 1984). Line SV11 mice were maintained by breeding hemizygous Tag transgene+ males with C57BL/6J females, and transgene-positive animals were identified as previously described (Schell et al., 1999). TCR-IV transgenic mice expressing the TCRα and TCRβ chains specific for Tag site IV have been previously described (Tatum et al., 2008) and were maintained by breeding transgene-positive males with B6.PL-*Thy1a*/CyJ females. All mice were maintained in the animal facility at the Pennsylvania State University College of Medicine, Hershey, PA, USA and experiments were performed under guidelines approved by the Institutional Animal Care and Use Committee.

### **IMMUNIZATION AND IDENTIFICATION OF SITE-SPECIFIC TCD8s**

The B6/WT-19 cell line was derived previously by transformation of B6 mouse embryo fibroblasts with WT SV40 strain VA45-54 (Pretell et al., 1979; Tevethia et al., 1980). Production and characterization of the Db/Tag site I (Db/I), Kb/Tag site IV (Kb/IV), Db/influenza virus (Flu) nucleoprotein (NP) 366–374 (Db/Flu), and Kb/HSV gB 498–505 (Kb/gB) PE-conjugated tetramers were described previously (Mylin et al., 2000). For immunization, 5 × 10<sup>7</sup> live B6/WT-19 cells were injected by the intraperitoneal route. For adoptive transfer, SV11 or transgene-negative mice were injected intravenously with lymphocytes derived from TCR-IV transgenic mice containing 5 × 10<sup>5</sup> clonotypic site IV-specific TCD8s. For tetramer staining, mouse spleens were harvested at the indicated time points post immunization and processed to single cell suspensions as previously described (Schell et al., 1999). Erythrocyte-depleted splenocytes were washed twice in PBS–FBS [PBS supplemented with 2% (vol/vol) FBS], resuspended at 2 × 10<sup>7</sup> cells/ml in PBS–FBS, and incubated with rat anti-mouse CD16/CD32 (33 mg/ml) for 15 min on ice. Following incubation, cells were washed once in PBS–FBS and resuspended in fluorescence-activated cell sorter (FACS) buffer [PBS–FBS supplemented with 0.1% (wt/vol) sodium azide]. Aliquots containing 2 × 10<sup>6</sup> cells were prepared and the appropriate MHC tetramer plus anti-mouse CD8α antibody were added. Alternatively, TCR-IV transgenic T cells were identified according to their surface expression of CD90.1. In this case, cells were incubated with anti-CD90.1 antibody at room temperature for 15 min as well as anti-CD8 and MHC tetramer to minimize non-specific staining. Proliferation and apoptosis analysis focused on the population of CD8+, Tetramer+, and CD90.1+ cells. Cells were then resuspended in FACS buffer and kept on ice or processed for the apoptosis assay prior to flow cytometry.

### **APOPTOSIS ASSAY**

A total of 2 × 10<sup>6</sup> erythrocyte-depleted and MHC tetramer- and/or anti-CD90.1-stained cells were incubated with conjugated Annexin V and 7-AAD (1:100 dilution) in 100μl 1× Annexin V staining buffer for 15 min at room temperature in the dark. Cells were immediately assessed by flow cytometry (BD FACSCalibur or FACSCanto). At least 10000 events were collected in the live cell gate and analyzed for Annexin V and 7-AAD staining. Annexin V negative, 7-AAD positive cells were considered non-viable and excluded from further analysis.

### *IN VIVO* **5-BROMO-2-DEOXYURIDINE INCORPORATION ASSAY**

Mice received a 1-mg dose of 1 mg/ml BrdU solution (diluted in PBS) by intraperitoneal injection 3 h before sacrifice at the indicated times post immunization. Splenocytes were stained for BrdU incorporation using a modified staining protocol (BD Biosciences). Briefly, 2 × 10<sup>6</sup> splenocytes were stained with MHC tetramers and anti-mouse CD8α as described above. Cells were then resuspended in 100μl of Cytofix/Cytoperm (Becton Dickinson) and incubated for 30 min at room temperature. The cells were washed once with 1× PermWash, resuspended again in 100μl of Cytofix/Cytoperm and incubated for 10 min at room temperature. Cells were washed again and resuspended in 100μl Cytofix/Cytoperm and incubated for 5 min at room temperature. After one wash, cells were incubated at 37˚C for 1 h with 20 U Benzonase nuclease in 100μl DPBS with 1 mM MgCl<sub>2</sub> and washed once. Cells were then stained with 5μl of FITC-labeled anti-BrdU antibody (eBioscience) in 40μl 1× PermWash for 20 min at room temperature. Cells were washed and then fixed with 2% paraformaldehyde in PBS and analyzed by flow cytometry as above.

### **STATISTICAL TESTS**

We performed the Welch two-sample *t*-test to assess whether the ratio of the percentage of site IV-specific TCD8s proliferating in WT to the similar percentage in SV11 mice is significantly greater than the similar ratio of the apoptotic cells. Three data points were taken from the WT mice and four data points were taken from SV11 mice. To construct the two groups for the statistical test we used the percentage of proliferating and apoptotic cells in all combinations in which WT values were in the numerator.
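The construction of the two groups and the test statistic can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in this study: the percentages are hypothetical placeholders, the helper names `all_ratios` and `welch_t` are invented for the example, and only the *t* statistic and the Welch–Satterthwaite degrees of freedom are computed (obtaining a *P* value additionally requires the *t* distribution).

```python
from itertools import product
from math import sqrt
from statistics import mean, variance


def all_ratios(wt, sv11):
    """Every WT/SV11 ratio, with the WT value in the numerator."""
    return [w / s for w, s in product(wt, sv11)]


def welch_t(a, b):
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical percentages, for illustration only (three WT and four SV11 mice).
prolif_wt, prolif_sv11 = [30.0, 34.0, 28.0], [12.0, 15.0, 10.0, 14.0]
apop_wt, apop_sv11 = [8.0, 9.0, 7.5], [9.0, 11.0, 10.0, 8.5]

p_ratios = all_ratios(prolif_wt, prolif_sv11)  # 3 x 4 = 12 proliferation ratios
a_ratios = all_ratios(apop_wt, apop_sv11)      # 3 x 4 = 12 apoptosis ratios
t_stat, dof = welch_t(p_ratios, a_ratios)
print(len(p_ratios), t_stat, dof)
```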

### **RESULTS**

### **OVERVIEW OF TUMOR GROWTH (***M***) AND TCD8 DYNAMICS**

To study the characteristics of TCD8 modulation during tumor development we developed a dynamic model of the interactions between tumor cells (*M*) and TCD8s (*Ti*) elicited in response to the four unique determinants of Tag. In this section we discuss the repertoire of dynamical behaviors that emerged from the model by describing the site IV-specific TCD8s in response to the growing tumor (refer to Materials and Methods and Appendix for details). Tag-expressing cells proliferate exponentially at rate *r* and are killed by TCD8s at a rate that grows with tumor size until the tumor size becomes large compared to β (**Figure A1** in Appendix), after which the killing rate saturates at *b*. Site IV-specific TCD8s are activated against the tumor cells and proliferate, but their differentiation is subsequently suppressed by the increasing tumor size, and they also undergo natural death at rate *w*IV. Proliferation and activation of TCD8s are suppressed by large tumors, and the suppression is modeled by Michaelis–Menten functions parameterized by σ and α*i*, respectively.

A repertoire of dynamical behaviors emerged from the model including clearance of Tag-expressing cells, as in WT mice, or tumor formation, as in Tag-transgenic mice. If Tag-expressing cells are cleared, the T cell response contracts (**Figure 2A**). Tag-expressing cells and T cells can also reach homeostasis (**Figure 2B**), as is observed when mice are immunized prior to the development of Tag-induced pancreatic tumors (Otahal et al., 2006). In some cases, the numbers of TCD8s and Tag-expressing cells fluctuate for extended periods of time before reaching a steady state (**Figure 2C**). If the TCD8 response is incapable of controlling the proliferating Tag-expressing cells, they increase exponentially, and TCD8s either undergo tolerance, becoming unresponsive (**Figure 2D**), or themselves expand continuously (**Figure 2E**).

To understand the effect of the parameter values on the dynamic behavior we performed a nullcline analysis (refer to Materials and Methods and Appendix for details). The nullcline analysis provides the long-term outcome resulting from the trajectories of change in the concentrations of Tag-expressing cells and TCD8s. The clearance scenario (**Figure 2A**; **Figure A2A** in Appendix) shows increasing initial trajectories of Tag-expressing cells and TCD8s. The increasing efficiency of TCD8-mediated killing of tumor cells leads first to a slowing down of the increase, and later to a decrease in tumor cell numbers (modeled by the value of *M*). When the number of Tag-expressing cells is below a certain threshold value given by the nullcline analysis, TCD8s begin decreasing and eventually Tag-expressing cells and TCD8s are depleted. In the clearance scenario the TCD8 response is maximized by setting the Michaelis–Menten constant α*i*, which models the effect of tumor size on the recruitment of new TCD8s, higher than the number of Tag-expressing cells. Though lower values of α*i* are used in the tumor models, increasing α*i* cannot clear the tumor cells because it stimulates the differentiation of effector cells from naïve T cells, which is a linear process and has a limited role in the control of exponential tumor growth (**Figure A1** in Appendix).

The nullcline analysis identified the salient parameters that drive the system from one behavior type to another (**Figures 2B–E**). Unlike in the clearance scenario in which tumor cells are cleared, larger values of the Michaelis–Menten constants β and/or σ, which model the effect of tumor size on the TCD8 activity, launch a protective TCD8 response characterized by steady-state-like behavior (**Figure 2B**; **Figure A2B** in Appendix). Further increases in these Michaelis–Menten constants (β and/or σ) lead to extended oscillations of TCD8s and tumor cells (**Figures 2C** and **3**; **Figure A2C** in Appendix). As expected, decreased tumor growth (lower *r*) and increased TCD8-mediated killing (*b*) (**Figure 2E**; **Figure A2E** in Appendix) push the system toward tumor clearance. The extreme cases when *r* < 0 are not relevant in the systems that lead to tumor formation. In the following sections we consider the dynamics of all the determinants, in which case site-specific TCD8 apoptosis rates and inhibition by tumor cells lead to the characteristic TCD8 response against each determinant.

Though various parameters can affect the fate of the tumors and consequently the TCD8 response, only the net growth of TCD8s, described by the difference between the rate of proliferation (*c*) and the rate of cell death (*wi*), decides the tolerance behavior (unresponsiveness of the TCD8s). A decrease in the net growth moves the system from a clearance state (if *c* − *wi* is positive and large) observed in WT mice to a tumor state observed in transgenic mice (if *c* − *wi* is negative; see **Figure 2D**; **Figure A2D** in Appendix). Thus, to reproduce the tolerance of TCD8s observed in Tag-transgenic mice, *wi* is assumed to be greater than *c* in all tumor models. Our parameter analysis (**Figure A3** in Appendix) indicates a large deviation from the experimental results when the above condition on the net growth of TCD8s is not implemented.
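The role of the net growth can be seen from the *Ti* equation in Materials and Methods in the large-tumor limit. The following is a sketch of that limit, assuming *M* ≫ σ and *M* ≫ α*i*; the reference time $t_0$ is a symbol introduced here for illustration:

$$\frac{dT_i}{dt} = c\,\frac{M}{M+\sigma}\left(n_i\,\frac{\alpha_i}{M+\alpha_i} + T_i\right) - w_i T_i \;\approx\; (c - w_i)\,T_i \quad\Rightarrow\quad T_i(t) \approx T_i(t_0)\, e^{(c - w_i)(t - t_0)}$$

In this approximation the activation term no longer contributes, so the TCD8 population decays when *c* < *wi* and grows when *c* > *wi*, regardless of the other parameters.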

#### **TCD8 NET GROWTH AND TOLERANCE**

The TCD8 response to Tag in WT mice and in transgenic mice bearing tumors provides two unique immunological environments which are modeled by assuming positive and negative net growth (*c* − *wi*) of TCD8s in WT and transgenic mice respectively. The fact that the TCD8 response is strong in WT mice but is undetectable in transgenic mice (Schell et al., 2000; Otahal et al., 2006) lends support to this assumption. The early expression of antigen in Tag-transgenic mice might lead to a lower rate of expansion (*c*) of TCD8s, allowing tumors to grow. Indeed, it is experimentally observed that high amounts of antigen reduce the proliferation of TCD8s during tolerance (Ganss and Hanahan, 1998).

To validate our hypothesis of the positive versus negative net growth of TCD8s in WT versus transgenic mice we experimentally assessed proliferation and apoptosis of TCD8s in both WT mice and Tag-transgenic mice bearing brain tumors (Schell et al., 1999). It is difficult to obtain the absolute value of the rate of proliferation (*c*) and the rate of apoptosis (*wi*) through experimentation, which makes a direct comparison of the experimental values inappropriate. However, we noticed that if the net growth is positive in WT (*w*WT < *c*WT) and negative in transgenic mice (*w*Tumor > *c*Tumor), it implies that for a specific TCD8 clone, the ratio of proliferation rates in WT to transgenic mice is greater than the similar ratio of apoptosis rates [(*c*WT/*c*Tumor) > (*w*WT/*w*Tumor)]. The converse implication is satisfied if the rates of T cell proliferation are comparable in WT and tumor-bearing mice. To test this relationship, we utilized TCD8s from the TCR transgenic mouse line TCR-IV, in which 90% of the TCD8s were specific for site IV (Tatum et al., 2008). Splenocytes isolated from TCR-IV transgenic mice were adoptively transferred into groups of three or four WT or SV11 mice before immunization with Tag-expressing cells. The percentage of proliferating and apoptotic TCR-IV cells recovered from the recipients was assessed through *in vivo* BrdU incorporation assay (**Figure 3B**) and Annexin V apoptosis assay (**Figure 3C**) on days 3 and 4 after immunization, respectively. These data were subsequently used to estimate the ratio of proliferation (P ratio: *c*WT/*c*Tumor) and apoptosis (A ratio: *w*WT/*w*Tumor) rates of site IV-specific TCD8s in WT to transgenic mice. As shown in **Figure 3D**, the value of the proliferation ratio is higher than the apoptosis ratio for both time points, and the difference is statistically significant [*P* < 0.032 (*t* = 2.07) at day 3 and *P* < 0.0001 (*t* = 4.57) at day 4 after immunization]. Moreover, different trends of change in the frequency of TCR-IV cells in WT and brain tumor-bearing transgenic mice between days 3 and 4, with a slight increase in TCR-IV T cells in WT and a slight decrease in TCR-IV T cells in transgenic mice (**Figure 3A**), suggest that the rate of apoptosis is higher than the rate of proliferation in the presence of tumors. The results suggest that the net growth in TCD8s is negative in transgenic mice, while positive in WT mice, assuming that the proliferation rate of the TCD8 clones in WT and transgenic mice is in a comparable range. Next, we discuss the dynamics of site-specific TCD8s in WT and transgenic mice.

**Figure 3** (caption, partially recovered): […] significant, there was an increase in the percentage of site IV-specific TCD8s in WT mice, but a decrease in SV11 mice. Site IV-specific TCD8s were defined as CD8+, Tetramer IV+ (indicating T cells specific for epitope IV), and CD90.1+ (indicating T cells derived from the TCR-IV mouse line). **(B)** The percentage of proliferating site IV-specific TCD8s in total CD8+ T cells of WT and SV11 mice after immunization as assessed by […] proliferation ratio) and the same ratio of percentages of apoptotic site IV-specific TCD8s (A ratio: apoptosis ratio) for both time points. Each circle indicates one possible ratio between WT data and SV11 data at the same time point, either proliferation ratio (•) or apoptosis ratio (◦). Proliferation ratio is higher than the apoptosis ratio at day 3 (\**P* < 0.032, *t* = 2.07) and day 4 (\*\**P* < 0.0001, *t* = 4.57).
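The inequality between the proliferation and apoptosis ratios used in this section follows in two steps from the assumed signs of the net growth. A short derivation, assuming all four rates are positive:

$$c_{\mathrm{WT}} > w_{\mathrm{WT}},\quad c_{\mathrm{Tumor}} < w_{\mathrm{Tumor}} \;\;\Rightarrow\;\; \frac{c_{\mathrm{WT}}}{c_{\mathrm{Tumor}}} > \frac{w_{\mathrm{WT}}}{c_{\mathrm{Tumor}}} > \frac{w_{\mathrm{WT}}}{w_{\mathrm{Tumor}}}$$

The first inequality replaces the numerator by a smaller value and the second replaces the denominator by a larger one. In the other direction, if the proliferation rates are equal in the two settings, a P ratio of 1 exceeding the A ratio gives $w_{\mathrm{WT}} < w_{\mathrm{Tumor}}$, that is, a smaller net growth $c - w$ in tumor-bearing mice.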

### **RESPONSE OF WT MICE TO IMMUNIZATION WITH TAG-TRANSFORMED CELLS**

We used our own experimental data (closed symbols in **Figure 4**) and the data from Mylin et al. (2000; open symbols in **Figure 4**) to model the behavior of the TCD8 response to the four H-2b-restricted Tag determinants following immunization with Tag-transformed cells in WT mice. Mice were immunized with 5 × 10<sup>7</sup> SV40 Tag-transformed cells and the site-specific TCD8 response was analyzed by staining with site-specific MHC tetramers at the indicated time points. The simulations of the TCD8 activity were then fit to the experimental data (**Figure 4**).

**Figure 4** shows that TCD8s specific for site IV are highest in number, followed by TCD8s specific for site I and then TCD8s specific for site II/III. The model can reproduce the observation that the addition of excess site I-specific precursor cells reverses the hierarchy so that the number of TCD8s specific to site I is higher than that of site IV-specific TCD8s (Tatum et al., 2010). This result indicates that the dominance hierarchy, at least among the three most dominant determinants, is impacted by the precursor frequency. However, the undetectable numbers of site V-specific TCD8s are explained not by a lower precursor frequency but by their having the smallest net growth upon exposure to the antigen. These cells become detectable upon immunization with mutated Tag expressing only the site V determinant because of the availability of tumor cells to drive their proliferation. For the fit shown in **Figure 4**, at the peak of the response the hierarchy of the Tag site-specific TCD8 response depends on the site-specific activation of new TCD8s and the apoptosis rates. The proliferation rate (*c*) is kept the same for all site-specific TCD8s since it was not required to be different to reproduce the observed data in **Figure 4**.

**Figure 4** (caption): The response of site I-, II/III-, IV-, and V-specific TCD8s is shown by the solid, dashed, dash-dot, and dotted lines, respectively. The gray arrows represent the terms in the mathematical model dominating the dynamics. The data from Mylin et al. (2000; empty symbols) and the current study (filled symbols) representing site I (squares), II/III (triangles), and IV (diamonds) specific TCD8s are shown. The data from Mylin et al. (2000) are scaled by a multiplicative factor to minimize the variation between the two experiments. The initial conditions were TCD8s = 0 and Tag-expressing cells (non-proliferating) = 106. The parameter values are *n* = 5 × 0.3/1/4 proportion for site V/I/IV respectively, *b* = 0.50/day, *c* = 1.08/day, *w*I = 0.04/day, *w*II/III = 0.09/day, *w*IV = 0.02/day, *w*V = 1.02/day, α*i* = 1E100 cells.

The TCD8 dynamics can be characterized by an initial linear conversion of naïve cells into activated TCD8s in response to tumor cells, followed by exponential growth of the TCD8s. As a result, the number of tumor cells rapidly decreases toward 0 (not shown). In simulations, tumor cells are cleared around day 10, which coincides with the experimentally observed time after which few effector TCD8s are proliferating. When tumor cell numbers become negligible, the contraction phase of TCD8s begins, which is dominated by exponential decay (at the rate *wi*). In this phase, the relationships between the rates of apoptosis (*wi*) of site-specific TCD8s can be estimated.
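The way the contraction phase constrains the apoptosis rates can be made explicit. As a sketch, assuming *M* ≈ 0 so that only the decay term of the *Ti* equation remains, and writing $t_0$ for the (illustrative) start of the contraction phase:

$$\frac{dT_i}{dt} \approx -w_i T_i \;\;\Rightarrow\;\; T_i(t) = T_i(t_0)\, e^{-w_i (t - t_0)}, \qquad \frac{T_{\mathrm{I}}(t)}{T_{\mathrm{IV}}(t)} = \frac{T_{\mathrm{I}}(t_0)}{T_{\mathrm{IV}}(t_0)}\, e^{-(w_{\mathrm{I}} - w_{\mathrm{IV}})(t - t_0)}$$

Under this approximation, a ratio of site I- to site IV-specific TCD8s that falls over time corresponds to $w_{\mathrm{I}} > w_{\mathrm{IV}}$, without requiring the absolute value of either rate.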

TCD8s targeting the dominant site IV and subdominant site I make up 80% of the Tag-specific TCD8 response. The model predicts that the only way to reproduce the highest and most prolonged site IV-specific TCD8 response is to have a lower apoptosis rate for site IV-specific TCD8s compared to site I-specific TCD8s (*w*IV < *w*I). To test this novel prediction of site-specific apoptosis rates, we immunized groups of three WT mice by intraperitoneal injection of C57BL/6-derived Tag-transformed cells. The percentage of apoptotic and proliferating TCD8s specific for site I and IV was assessed 9 and 14 days after immunization as estimations of the respective rates of apoptosis and proliferation. In addition, the percentage of site I-specific and site IV-specific TCD8s in splenocytes was assessed 7, 9, 14, and 23 days after immunization. Since the model estimates the relationship between the two rates of apoptosis when the activation of the TCD8s is minimal, we tested our prediction after day 14 when no significant proliferation was observed in the experiments (**Figure 5A**). As shown in **Figure 5B**, the percentage of apoptosis for site I-specific TCD8s was significantly higher than for site IV-specific TCD8s (*P* < 0.012). A differential cell death rate was also reflected in **Figure 5C**, which showed the percentage of site-specific TCD8s in splenocytes. While the percentage of site IV-specific TCD8s remained consistent after day 9, there was a drop in the percentage of site I-specific TCD8s between day 9 and day 14. Taken together, these results suggest that site I-specific TCD8s undergo cell death at a higher average rate compared to site IV-specific TCD8s, explaining the prolonged high-level accumulation of site IV-specific T cells. We note that the same rates of proliferation for site I- and IV-specific TCD8s are used in the model, a parsimonious assumption which is also supported by **Figure 5A**. However, day 9 is at the end of the expansion phase and many parameters in addition to the proliferation rates may play a role in explaining the observations (**Figures 5A,B**).
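The role of the inequality *w*IV < *w*I during contraction can be illustrated with a minimal sketch. It assumes pure exponential decay from equal starting counts; the rate values, the starting count, and the function name `contraction` are hypothetical choices for illustration, not fitted parameters of the model.

```python
import math

def contraction(start_count, w, days):
    """TCD8 count after `days` of pure exponential decay at per-day rate w."""
    return start_count * math.exp(-w * days)

# Hypothetical values, chosen only to satisfy w_IV < w_I.
w_I, w_IV = 0.30, 0.10
start = 1000.0  # assumed equal counts at the onset of contraction

for day in (0, 5, 9):
    t_I = contraction(start, w_I, day)
    t_IV = contraction(start, w_IV, day)
    # Under pure decay the ratio grows as exp((w_I - w_IV) * day).
    print(f"day {day}: site I = {t_I:.1f}, site IV = {t_IV:.1f}, "
          f"ratio IV/I = {t_IV / t_I:.2f}")
```

Under these assumptions the site IV-specific population outlasts the site I-specific one, and the ratio between them depends only on the difference of the two rates, which is why an inequality can be tested without knowing either rate exactly.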

### **CD8+ T CELL RESPONSE IN TRANSGENIC MICE**

The three Tag-transgenic systems that autonomously develop tumors vary in the lifespan of the mice, metastasis of the tumors, responsiveness to immune therapies and tolerance of TCD8s. The computer simulations were run to cover the duration of the life span for each transgenic mouse line. The simulated tumor phenotype reproduces the exponential tumor growth and TCD8 expansion during the early period of antigen expression, followed by an unresponsiveness of TCD8s. The tolerance onset time is different for site-specific TCD8s in different transgenic mouse lines. This information was used to parameterize the apoptosis rates (**Table 1**; **Figure A3A** in Appendix) and all the other parameters were kept the same across different tumor models. **Figures 6A,B** reproduce the sequential loss (tolerance onset) of site-specific TCD8s in pancreatic tumor (Otahal et al., 2006) and osteosarcoma (Schell et al., 2000) models, respectively. Tissue-specific death rates for TCD8s against each site (*wi*) reproduce the correct tolerance onset in different transgenic mouse lines. We observed interesting regularities between the rates of apoptosis. For example, the apoptosis rate of site I-specific TCD8s is always greater than that of site IV-specific TCD8s as seen in the WT model. Thus while the specific values of the parameters can be different, the unavoidable similarities pointed out by our dynamic model can improve our understanding of the tumor growth and TCD8 response.

Changes in the other parameters including the growth rate of tumor cells (*r*) and the rate of TCD8 mediated killing of tumor cells (*b*) were not necessary to reproduce any of the observations, though they could affect the modeled tumor size. The unresponsiveness of TCD8s targeting particular determinants was confirmed by the absence of TCD8 activation after simulating *in silico* immunization with Tag-expressing cells (modeled by *si*; **Figure 7**). In the brain tumor model (Schell et al., 1999) TCD8s specific for the three most dominant Tag sites undergo central tolerance; hence we only see the TCD8s specific for the immunorecessive site V which remain responsive throughout the lifespan of these transgenic mice. However, the simulations indicate that the activity of these TCD8s is lower than in WT mice (**Figure 6C**).

Next, we used our model to predict the peak of TCD8 accumulation, since tumor treatment is most effective when active TCD8s are high. The tumor as well as WT models predict that site I TCD8s peak earlier than site IV TCD8s (Mylin et al., 2000). In osteosarcomas, site I-specific TCD8s reach higher numbers than in pancreatic tumors (**Figure 6B**). The peaks in TCD8 accumulation in the osteosarcoma model occur at later time points (at 18 days for *T* CD8I and 24 days for *T* CD8IV) as compared to the peaks of site I-specific TCD8s (4th day) and site IV-specific TCD8s (7th day) in the pancreatic model (compare **Figures 6A,B**). The brain tumor model (**Figure 6C**) predicts an earlier peak of site V-specific TCD8s compared to the osteosarcoma model (days 23 and 30, respectively) since TCD8s targeting the dominant sites are absent, resulting in increased availability of antigen. Thus our model can detect the differences in the timing of the peaks in different models.

**FIGURE 5 | Site I- and site IV-specific TCD8s have different apoptosis kinetics after immunization. (A)** Percentage of proliferating cells assessed by BrdU proliferation assay and **(B)** Annexin V positive cells indicating apoptotic cells among site I (filled) and site IV (empty) specific TCD8s. The percentage of site I-specific TCD8s undergoing apoptosis is significantly higher than that of site IV-specific TCD8s (\**P* < 0.012). **(C)** Percentage of site I-specific TCD8s (-) and site IV-specific TCD8s () among total splenocytes 7, 9, 14, and 23 days after immunization. Compared with site IV-specific TCD8s, there is a drop in the percentage of site I-specific TCD8s between days 9 and 14.

#### **Table 1 | Model parameters and description.**

In conclusion, the tumor models give insight into tissue specificity, for example revealing that osteosarcomas elicit a stronger response than pancreatic tumors. All tumors inhibit activation of naïve cells and reduce the net growth of TCD8 cells, but differences in the apoptosis rates of the recruited TCD8s are a critical factor in determining a tissue-specific response.

**(***b* **= 3.5, β = 100) and subjected to** *in silico* **site IV immunizations of equal strength on (A) day 35 and (B) day 180.** Insets **(A,B)** show tumor cells on a log scale. Site I-specific TCD8s are shown with a dashed line and site IV-specific TCD8s with a solid line (TII/III and TV have negligible values). The early immunization results in a steady state of tumor cells and site IV-specific TCD8s, whereas the later immunization fails to significantly influence tumor growth.

#### **TUMOR CONTROL**

Tumors can be controlled by various immune therapies including immunization with tumor antigen and the adoptive transfer of immune cells (Hersey, 2010; Moschella et al., 2010; van den Broek et al., 2010). To gain insight into the mechanisms of tumor control we simulated a known case of control of pancreatic tumors upon immunization (Otahal et al., 2006). In this study early immunization on day 35 can prevent pancreatic tumor appearance whereas immunization after day 180 cannot prevent tumor formation. Our simulations reproduce this behavior. Simulated immunization leads to a sharp increase in site IV-specific TCD8s above the endogenous levels, followed by a decrease that correlates with a drop in tumor cells. Experimentally it is expected that the TCD8 response upon immunization surpasses the response generated against the endogenous tumor. Thus the model could simulate the effect of early and late immunization in case of pancreatic tumors.

However, in transgenic mice Tag-expressing cells persist in lower numbers in the case of early immunization, so that the pancreatic functions are not disrupted. To reproduce this observation, the rate of TCD8 proliferation, the maximal rate of TCD8-mediated killing of tumor cells, and the threshold number of tumor cells that inhibit killing have to be increased. After increasing these parameters the dose of immunization could be decreased and the control of tumors with early immunization (**Figure 7A**) but not late immunization (**Figure 7B**) was reproduced in the pancreatic tumors (**Figure 7**, inset). Thus our model suggests that exposure to external antigen in transgenic mice facilitates the detection of tumor cells by TCD8s and the immunogenicity of Tag.

### **DISCUSSION**

The current work sheds light upon mechanisms that determine TCD8 tolerance onset and the characteristics leading to the site-specific TCD8 response. Both the dynamic model and the experimental data show that the apoptosis rates of TCD8 clones are context dependent and that the response in WT C57BL/6 mice is quantitatively stronger than the response in Tag-transgenic mice. This weaker response in Tag-transgenic mice aids the establishment of tolerance in the presence of progressing tumors. We hypothesize that the weaker response is driven by the immediate encounter of the peripheral T cells with Tag expressed on the tumor, leading to undetectable TCD8 numbers in the transgenic mice. **Figure 3** supports this hypothesis by showing a higher apoptotic population in tumor-bearing mice. In fact, in SV11 mice transferred cells encounter tumor antigen earlier than Tag from the cells used for immunization, suggesting that an early encounter with the endogenous tumor antigen also limits response to the exogenous antigen (Ryan and Schell, 2006). In the case of natural tumors, it is possible that the net growth of TCD8s varies over the course of tumor progression. The accumulation of responsive TCD8s can be enhanced by using anti-CD40 or anti-CTLA-4 antibodies (Otahal et al., 2007; Ryan et al., 2008). The tissue-specific differences in the peaks of the TCD8 response will affect the time at which immune therapy will be most effective. Differential expression of negative regulators of the receptors such as PD-L1 and Tim3 ligand within tumors of distinct tissues might lead to tissue-specific apoptosis.

Though TCD8 tolerance is often observed, particularly in the setting of cancer or transgene expression, the mechanisms leading to tolerance are not clear. Factors that have received considerable attention are the density of peptide/MHC-I complexes, the affinity of TCR for peptide/MHC complexes and the avidity of the interaction between T cells and antigen-presenting cells (Abbas et al., 2004). These factors affect the activation of TCD8s and their effects on the TCD8 response are modeled indirectly by assuming high-dose inhibition, antigen-induced cell death (Kabelitz et al., 1993), and immune suppression. While we also include saturation of TCD8s in response to large tumors (Graw and Regoes, 2009), we compare the rate of activation and apoptosis in distinct mice in which tolerance is either observed or not. Although experimentally it is difficult to measure the densities of peptide-MHCI complexes and affinities of TCR, our approach allows us to establish relationships between different parameters and test them experimentally. Moreover, our model suggests that the differentiation of naïve T cells is also affected during tolerance. Various lines of evidence, such as the disruption of MHCI-peptide and TCR complex (Nagaraj et al., 2007), TCD8 exhaustion (Moskophidis et al., 1993) and tumor-induced TCD8 suppressive microenvironment (Lee et al., 1999; Khong and Restifo, 2002), support the inhibition of the differentiation from naïve T cells during tolerance. Studies of the TCD8 response to the Tag determinants suggest that tolerance is related to immunodominance since TCD8s specific for the immunorecessive determinant (site V) are the least sensitive to tolerance even in the highly tolerogenic brain tumor model (Schell et al., 2000). In this context, it is interesting that subdominant site I-specific TCD8s undergo tolerance earlier than immunodominant site IV-specific TCD8s, suggesting that higher levels of site I/MHC versus site IV/MHC complexes may be achieved *in vivo* during tumor progression. We assume that the dependence of the T cell response on the number of tumor cells is described by Michaelis–Menten kinetics, which is a commonly used functional form to model a saturating response at high doses in biological systems. The support for such behavior comes from observations in chronic infections and cancers which limit the activation of the immune responses even when the antigen is not cleared (Kabelitz et al., 1993; Wigginton and Kirschner, 2001). We note that the Michaelis–Menten function is a special case of a Hill function, which may also be a good choice but has an additional unknown parameter as compared to the Michaelis–Menten function.
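The relationship between the two functional forms can be made concrete with a short sketch. The function and argument names (`hill`, `michaelis_menten`, `vmax`, `k`, `h`) and the numerical values are illustrative assumptions and are not the symbols or parameters used in the model.

```python
def michaelis_menten(x, vmax, k):
    """Saturating response: half-maximal at x == k, approaching vmax for large x."""
    return vmax * x / (k + x)

def hill(x, vmax, k, h):
    """Hill function; the exponent h is the additional parameter."""
    return vmax * x**h / (k**h + x**h)

# Illustrative values only.
vmax, k = 2.0, 100.0
for tumor_cells in (10.0, 100.0, 1e4):
    mm = michaelis_menten(tumor_cells, vmax, k)
    h1 = hill(tumor_cells, vmax, k, 1.0)  # coincides with Michaelis-Menten
    h2 = hill(tumor_cells, vmax, k, 2.0)  # sharper, sigmoidal switch
    print(f"x = {tumor_cells:g}: MM = {mm:.3f}, Hill(h=1) = {h1:.3f}, Hill(h=2) = {h2:.3f}")
```

With the exponent fixed at 1 the two forms coincide, so choosing Michaelis–Menten amounts to fixing one parameter rather than estimating it from sparse data.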

Simulations in WT mice suggest that the differential activities of site-specific TCD8s are driven by the site-specific rates of apoptosis. The rate of apoptosis was estimated in the contraction phase when Tag-expressing cells are cleared. At earlier time points the percentages observed in the experiments (**Figures 3C** and **6B**) are the outcome of the dynamics modeled by the rate of activation (*n*), proliferation (*c*), and apoptosis (*wi*). In the absence of experimental estimates the rate constants used in our model represent an average rate of apoptosis over time. Hence we validate model-predicted inequalities rather than attempting to estimate the exact values of apoptosis rates. Though free decay is a commonly made assumption in dynamic models, antigen concentrations and the duration of antigen exposure can affect the apoptosis rate of the TCD8 cells, which can be included in a future extension of the current model (Porter and Harty, 2006). In contrast, the accurate prediction of the rate of proliferation (*c*) was not possible (**Figure 4**) due to the sparse data (Trinchieri et al., 1976) and various parameters affecting the expansion phase including TCD8 apoptosis (*wi*; **Figure 5B**), differentiation from naïve T cells (*ni*), and the proliferation of TCD8s (*c*). The use of similar proliferation rates for TCD8s specific to site I and IV is supported by the experimental data (**Figure 5A**). We maintain the same rate of proliferation for all site-specific TCD8s, a simplification which we believe is valid for the dominant sites. The rules may be different for site V since the addition of more site V-specific precursors does not overcome the weak response to site V (Otahal et al., 2005). However, estimating the true proliferation rate of site V-specific TCD8s is challenging since these cells remain below the threshold of detection following immunization of WT mice with Tag. Overall the data support a mechanism in which differences in the rate of apoptosis explain the prolonged high-level accumulation of site IV-specific TCD8s relative to TCD8s responding to the subdominant determinants.

The current model is an outcome of a step-wise process to reproduce the TCD8 response in WT and Tag-transgenic mice. As a first step we modeled all the immune processes that are hypothesized to be important in TCD8 activation. Next, we reduced the model based on what is required to reproduce the known experimental observations. For example, separately modeling pMHC complexes on the antigen-presenting cells did not lead to an improvement of the model but increased the number of unknown parameters (Handel and Antia, 2008). We do not explicitly include naïve TCD8s in the model, assuming that they are not a rate-limiting factor in the tumor-induced T cell interactions. However, in the biologically accepted range of parameters (**Figure A3** in Appendix) the effective value of the rate of differentiation of precursor cells is never unlimited in our model. Many mathematical models have assumed a constant number of precursor cells and in those models estimates of the effective rate of thymic production vary within 25% of the *nic* values used in our model. The mathematical formulations modeling TCD8 response to tumors generally incorporate higher values of the source for TCD8s compared to the models of TCD8 response to viruses. This could be because the immune response is measured following immunization, which usually inflates the response. We also note that the value of the rate of proliferation is in the range of estimations by other studies (De Boer et al., 2003; Bocharov et al., 2004).

In the current model we did not explore the possibility of introducing competition between TCD8 clones because recent results indicate that in WT mice competition between the endogenous TCD8s responding to sites I and IV does not play a significant role in limiting the magnitude of the T cell response (Tatum et al., 2010). We did previously observe that the presence of the dominant Tag determinants can limit the response to site V (Mylin et al., 2000). The absence of a detectable endogenous site V-specific TCD8 response upon immunization with Tag-expressing cells makes it difficult to make assumptions about the interactions between physiological levels of TCD8s specific for dominant and recessive epitopes. Though we cannot rule out the possibility of competitive interactions, non-competitive interactions mediated by weak engagement of TCR with site V-MHCI complexes can also drive the immunorecessive response as modeled in the current study. In conclusion, the mathematical model presented here is one of the few attempts to characterize *in vivo* TCD8 responses to known autochthonous tumors and it systematically analyzes the expansion and contraction phases during the TCD8 response to a known tumor antigen.

In the future, this model could be expanded by including competitive interactions between site-specific clones for different antigens and by separately modeling the reactivation of memory T cells in response to the antigen (Camus and Galon, 2010). Modeling of naïve T cells as a separate entity will also allow us to study the effects of adoptive transfers, which are currently under clinical investigation for several cancer types. One could also incorporate immune cells such as T regulatory cells, which are implicated in inducing tolerance, and model the recovery of responsive TCD8 cells (Sharabi and Ghera, 2010). Moreover, the case of uncontrolled T cell and Tag-expressing cell growth is similar to an autoimmune response and, though it is not a focus of the current study, the model can be used to study the relationship between tumor and autoimmunity since tolerance (Schuetz et al., 2010) and dysregulation of immune responses (Reeves et al., 2009) are implicated in both diseases. While opportunities exist to build on this basic model, relevant *in vivo* data are needed to inform and parameterize the expansion. Moreover, standardization of experimental techniques will be useful since the observations are not only affected by personal and lab-specific factors but also by the mouse strains used. We show here that our model not only provides novel predictions that can be experimentally validated but also gives important insights based on sparse data. Models such as ours will be increasingly developed and used to provide novel predictions and biological understanding of the complex interaction between the immune system and cancer.

### **ACKNOWLEDGMENTS**

This work was partially funded by research grant R01-CA-025000 from the National Cancer Institute/National Institutes of Health (to Todd D. Schell). We would like to thank Alan M. Watson and Dr. Angela M. Tatum for their advice, Aijun Liao for her assistance with experiments, and Nate Sheaffer and Dr. David Stanford for help with acquisition and analysis of flow cytometry data. Juilee Thakar is thankful to the Cancer Research Institute for a postdoctoral fellowship.

Ethics statement: All animal studies were performed under active protocols approved by the Pennsylvania State University Institutional Animal Care and Use Committee.

#### **REFERENCES**


Transgenic mice harboring SV40 Tantigen genes develop characteristic brain tumors. *Cell* 37, 367.


*Theor. Biol.* 247, 723–732.


recognition of antigen is a mechanism of CD8+ T cell tolerance in cancer. *Nat. Med.* 13, 828–835.


against an immunorecessive epitope. *J. Immunol.* 177, 255–267.


*Proc. Natl. Acad. Sci. U.S.A.* 91, 3916–3920.

Zheng, X., Gao, J. X., Zhang, H., Geiger, T. L., Liu, Y., and Zheng, P. (2002). Clonal deletion of simian virus 40 large T antigen-specific T cells in the transgenic adenocarcinoma of mouse prostate mice: an important role for clonal deletion in shaping the repertoire of T cells specific for antigens overexpressed in solid tumors. *J. Immunol.* 169, 4761–4769.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 23 March 2011; accepted: 18 June 2011; published online: 11 July 2011. Citation: Campbell C, Zhang R, Haley JS, Liu X, Loughran T, Schell TD, Albert R and Thakar J (2011) Why do CD8*+ *T cells become indifferent to tumors: a dynamic modeling approach. Front. Physio. 2:32. doi: 10.3389/fphys.2011.00032*

*This article was submitted to Frontiers in Systems Physiology, a specialty of Frontiers in Physiology.*

*Copyright © 2011 Campbell, Zhang, Haley, Liu, Loughran, Schell, Albert and Thakar. This is an open-access article subject to a non-exclusive license between the authors and Frontiers Media SA, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and other Frontiers conditions are complied with.*

### **APPENDIX**

### **THE EFFECT OF VARIATION OF THE PARAMETERS ON THE RATE OF KILLING OF TUMOR CELLS BY TCD8s**

*Nullcline Analysis*

In this section we elaborate upon the biologically relevant dynamical regimes exhibited by the model. While we show the evolution of the number of tumor cells and TCD8 cells as a function of time in **Figure A2**, we here show the two variables plotted against one another. This method allows for straightforward identification of the system dynamics for any possible number of TCD8s and tumor cells.

The directional field, which is represented by gray arrows in the lower panels of **Figure A2**, indicates the evolution of tumor cells and TCD8s with time. For instance, when the numbers of tumor cells and TCD8s are in an area of phase space where the directional field is increasing in both dimensions, the numbers of Tag-expressing cells and TCD8s will both increase for a small advancement in time, and their next position in phase space will decide the change in both variables for the next advancement in time. The trajectory is shown by the solid black line. Broken black lines (the nullclines) show the boundary between areas in phase space that are increasing or decreasing with respect to TCD8s (dash-dotted line) or tumor cells (dashed line). The intersections of the nullclines correspond to steady states of the system. Zero tumor cells and TCD8s is always a steady state; for the parameters investigated here, the nullclines typically admit at most one additional steady state (see main text for the mathematical form of the nullclines).

The clearance scenario (**Figure A2A**) shows the system's starting point in a location where the directional field is increasing in both dimensions. The trajectory crosses the tumor cell nullcline, whereupon the number of tumor cells begins decreasing. Once it drops below the TCD8 nullcline, the number of TCD8s also begins decreasing; the system eventually reaches a steady state located very close to 0 tumor cells and TCD8s. The different parameters used in **Figures 2B–E** alter the properties of the directional field and nullclines, and so influence the evolution of the system despite the fact that they start in the same location in phase space.
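The phase-space reasoning above can be sketched for a toy two-variable system. The equations below are an assumption made purely for illustration (exponential tumor growth with saturating TCD8 killing, and tumor-dependent TCD8 expansion with constant apoptosis); they are not the equations of the model, whose nullclines are given in the main text. All function names and parameter values, including `kappa`, are hypothetical.

```python
def derivatives(C, T, r=0.5, b=1.0, beta=100.0, c=1.0, kappa=50.0, w=0.2):
    """Toy rates of change for tumor cells C and TCD8s T (illustrative only)."""
    dC = C * (r - b * T / (beta + C))      # growth minus saturating killing
    dT = T * (c * C / (kappa + C) - w)     # saturating expansion minus apoptosis
    return dC, dT

def tumor_nullcline(C, r=0.5, b=1.0, beta=100.0):
    """T value at which dC/dt = 0 for a given C > 0."""
    return r * (beta + C) / b

def tcd8_nullcline(c=1.0, kappa=50.0, w=0.2):
    """C value at which dT/dt = 0 for any T > 0 (requires c > w)."""
    return w * kappa / (c - w)

def direction(C, T):
    """Signs of the directional field at a point in phase space."""
    dC, dT = derivatives(C, T)
    return ("+" if dC > 0 else "-", "+" if dT > 0 else "-")

# The intersection of the two nullclines is a steady state of the toy system.
C_star = tcd8_nullcline()
T_star = tumor_nullcline(C_star)
print("steady state:", C_star, T_star, derivatives(C_star, T_star))

for point in ((1.0, 1.0), (50.0, 10.0), (50.0, 200.0)):
    print(point, "direction (tumor, TCD8):", direction(*point))
```

In this toy system the origin is always a steady state and the nullclines intersect at one additional point, which mirrors the qualitative structure described above without reproducing the model's actual dynamics.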

#### *Parameter values and their sensitivity*

In the model, the peak of the TCD8 accumulation corresponds to the system crossing the TCD8 nullcline in phase space (see main text). The main parameters that drive the response of the TCD8s are *c* and *wi*. We modify these parameters and quantify the difference of the ensuing dynamics from the original via an average percent deviation (APD).

There are a maximum of 18 parameters in the model. For simplicity, and in the absence of tissue-specific information to constrain the values, the parameters are assumed to have the same value in different tissues. However, site-specific activation of naïve cells and apoptosis rates were implemented to reproduce the different onset times of tolerance of site-specific TCD8s. To study the effect of parameters we systematically varied the parameter values from 50 to 200% of their initial values [i.e., *p*<sub>*i*</sub> = *kp*<sub>0</sub>, *k* ∈ (0.5, 2)]. We calculated the percent deviation due to a given parameter modification for variable *V* at time τ as

$$d_{V(t=\tau)} = \left| \frac{V(p = p_0, t = \tau) - V(p = p_i, t = \tau)}{V(p = p_0, t = \tau)} \right|$$

Averaging over all time points gives an APD for variable *V* and deviation *k* for parameter *p*.
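The deviation measure can be sketched in a few lines. The function names and the stand-in trajectory below are hypothetical; in practice the two series would come from runs of the model with the reference and the modified parameter value. As in the formula, the result is a fraction of the reference value (multiplying by 100 would express it as a percentage, which is an assumption about the intended scaling).

```python
def percent_deviation(v_ref, v_mod):
    """|V(p0, t) - V(p_i, t)| / |V(p0, t)| at one time point (v_ref must be nonzero)."""
    return abs((v_ref - v_mod) / v_ref)

def average_percent_deviation(ref_series, mod_series):
    """Mean of the pointwise deviations over all time points."""
    if len(ref_series) != len(mod_series):
        raise ValueError("series must cover the same time points")
    deviations = [percent_deviation(r, m) for r, m in zip(ref_series, mod_series)]
    return sum(deviations) / len(deviations)

def toy_trajectory(w, start=100.0, days=5):
    """Hypothetical stand-in for a model run: geometric decay at rate w per day."""
    return [start * (1.0 - w) ** day for day in range(days)]

w0 = 0.2  # illustrative reference value of a single parameter
reference = toy_trajectory(w0)
for k in (0.5, 1.0, 2.0):  # multiplicative factors spanning 50 to 200%
    apd = average_percent_deviation(reference, toy_trajectory(k * w0))
    print(f"k = {k}: APD = {apd:.3f}")
```

Because each deviation is normalized by the reference value, time points at which the reference variable is close to zero dominate the average, which is worth keeping in mind when interpreting large APDs.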

This measure identifies at a glance the sensitivity of the system to variations in particular parameters. We show as an illustrative example the APDs for the pancreatic, osteosarcoma, and WT cases (**Figures A3B–D**). Reducing *w*IV in a tumor case serves to bolster the immune response; *T*IV effectively chases after IC more vigorously before giving up, until eventually it is able to clear the tumor entirely. In the pancreatic case (**Figure A3B**), *w*IV is not decreased enough to reach the clearance regime. In the osteosarcoma case, the full transition is seen. In the case of brain tumors, *T*IV plays a minimal role; varying *w*IV has comparatively little effect on the system dynamics.


cases are shown on the *x*-axis for the wild-type (black squares), pancreatic tumor (black diamonds), osteosarcoma (white squares), and brain tumor (white diamonds) models. *n*<sub>I</sub> fixes the value of *n*<sub>IV</sub> (= 4*n*<sub>I</sub>) and *n*<sub>V</sub> (= 0.3*n*<sub>I</sub>), and the effect of the multiplicative factor (*x*-axes) on the variables is shown as an average percent deviation (APD) in the pancreatic, osteosarcoma, and wild-type cases (*y*-axes). Very small APDs (<10<sup>−9</sup>) are not shown.

### *Kunihiko Kaneko\**

*Research Center for Complex Systems Biology, University of Tokyo, Tokyo, Japan*

#### *Edited by:*

*Kumar Selvarajoo, Keio University, Japan*

#### *Reviewed by:*

*Kumar Selvarajoo, Keio University, Japan*

*Masa Tsuchiya, Keio University, Japan*

#### *\*Correspondence:*

*Kunihiko Kaneko, Research Center for Complex Systems Biology, University of Tokyo, Meguro, Komaba, Tokyo 153-8902, Japan. e-mail: kaneko@complex.c.u-tokyo.ac.jp*

Interviewed by Kumar Selvarajoo and Masa Tsuchiya at Institute for Advanced Biosciences, Keio University, Japan, I discuss my approach to biology, what I call complex systems biology. The approach is constructive in nature, and is based on dynamical systems theory and statistical physics. It is intended to understand universal characteristics of life systems: generic adaptation under noise, differentiation from stem cells in interacting cells, robustness and plasticity in evolution, and so forth. Current status and future directions in systems biology in Japan are also discussed.

**Keywords: plasticity, adaptation, robustness, stem cell, noise**

### **Q1: PROFESSOR KANEKO, CAN YOU INTRODUCE YOURSELF AND YOUR RESEARCH?**

I have been a physicist, and although I am more and more involved in biology now, I think my approach is still very much that of a physicist. I started my graduate studies in the field of non-equilibrium phenomena in terms of stochastic processes, and then worked on chaos, deterministic dynamics that produce irregular, "unpredictable" behavior (Kaneko, 1986). Then, my study shifted to chaos in space and time, having many degrees of freedom. I introduced the "coupled map model," which proved to be a powerful tool that allows one to study the properties of dynamical systems with many degrees of freedom. Key concepts derived from it, such as collective dynamics and chaotic itinerancy, have had impacts on a variety of fields, ranging from turbulence in fluid dynamics to neural activities in the brain. A book, "Complex Systems: Chaos and Beyond" (Kaneko and Tsuda, 2000), which I wrote together with Ichiro Tsuda, conveyed, to a certain extent I believe, the "Japanese own taste for complex systems" (as reviewed in Nature; Shlesinger, 2001) to scientists abroad.

Based on these studies and concepts, I proposed "Complex Systems Biology" around 1994, to unveil universal properties in a life system. It is not easy to judge whether some features in present organisms are chance or necessity, as they were shaped as a result of one-time evolution on this Earth. It is not certain whether these features would appear again if the tape of life were replayed. To unveil universal, essential features in life, it is then ideal to construct some basic process of life (such as reproduction, adaptation, differentiation, and so forth) and examine generic features therein. This is the constructive approach that Tetsuya Yomo at Osaka University and I proposed in the mid 1990s. The earlier works, including collaborative studies with experimental biologists Yomo and Makoto Asashima at University of Tokyo, are described in the book "Life: An Introduction to Complex Systems" (Kaneko, 2006). During these years I have served as a director of the Center-of-Excellence Project "Search for the Logic of Life as a Complex System" (1999–2004) and the ERATO project "Kaneko Complex Systems Biology" (2004–2010), and am head of the Center for Complex Systems Biology at University of Tokyo.

### **Q2: WHEN AND HOW DID YOU BECOME INTERESTED IN BIOLOGICAL RESEARCH?**

From the beginning of my graduate studies in 1979 I was interested in "what life is." My intention was to understand its universal characteristics, and what distinguishes life from non-living matter. So I hoped to understand what life is, theoretically, in terms of physics. At that time, Prigogine's "dissipative structure" was popular among statistical physicists, in which the ultimate goal would be to understand life as a spatiotemporal pattern possible in a far-from-equilibrium state. However, there was a large gap between such studies in physico-chemical systems and life systems. So I could not start biological research seriously until 1992, when I first met Tetsuya Yomo at a meeting organized by Professor Yuzuru Fushimi. I was then working on "globally coupled maps," in which simple identical dynamic elements interact with every other element in the same way. I found that even though these elements are identical, their behaviors start to differ from each other with time and then form a few groups within which the behaviors are identical but the behaviors of elements belonging to different groups are distinct (Kaneko, 1990). As this "differentiation" occurs across elements sharing the identical "rule," I had thought that this might be similar to the differentiation of cells that share the identical gene. This similarity, however, had remained at a "metaphorical" level. At that time Tetsuya discovered that bacteria sharing the same gene differentiated into active and inactive types, even in a well-mixed culture (Ko et al., 1994), and was seeking a mechanistic interpretation for it. I explained to him how my simple elements of coupled maps differentiated, with a remark on "why not bacteria that have more complex dynamics within?". So, we started our collaboration. This study on prototypical cell differentiation was also theoretically interesting, as the number of elements ("cells") changes through division and death, an aspect which had not been studied in physics before. Since then, we have continued our collaboration to unveil basic logic in cell reproduction, heredity, adaptation, development, and evolution.
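The globally coupled map mentioned above can be sketched as follows, using its commonly cited form in which every element is updated by the same logistic map and coupled to the mean over all elements. The parameter values, the element count, and the helper names are illustrative assumptions; whether and how many groups form depends on the nonlinearity `a` and the coupling `eps`, and no particular outcome is asserted here.

```python
import random

def logistic(x, a):
    """Local rule shared by every element: f(x) = 1 - a * x^2."""
    return 1.0 - a * x * x

def gcm_step(xs, a, eps):
    """One update: each element mixes its own f(x) with the mean of f over all elements."""
    fx = [logistic(x, a) for x in xs]
    mean_field = sum(fx) / len(fx)
    return [(1.0 - eps) * f + eps * mean_field for f in fx]

def run(n=10, a=1.8, eps=0.3, steps=1000, seed=0):
    """Iterate n identical elements from random initial states in [-1, 1]."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        xs = gcm_step(xs, a, eps)
    return xs

def count_clusters(xs, tol=1e-6):
    """Number of groups of elements whose sorted values are separated by more than tol."""
    count, last = 0, None
    for x in sorted(xs):
        if last is None or x - last > tol:
            count += 1
        last = x
    return count

final_states = run()
print("number of groups after iteration:", count_clusters(final_states))
```

Every element obeys the identical rule, so any grouping that appears in the final states comes from the dynamics and the initial conditions rather than from differences between the elements.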

### **Q3: CAN YOU SHARE INITIAL EXPERIENCES WORKING WITH BIOLOGISTS?**

I think I was quite lucky with this initial experience. Tetsuya Yomo has always been interested in general (universal) aspects, and does not like explanations in which a life system is special and finely designed through evolution. We always try to understand a characteristic property of a life system as a general consequence of a system that grows autonomously. So we have had a common picture, and have enjoyed the collaboration. This, however, would be an atypical experience, since his way of thinking is far from that of traditional biologists. By the way, my way of thinking is probably outside that of standard physicists. In the beginning we each mistook the other's way of thinking for that of a typical biologist and physicist, but that was wrong. Biologists generally are not so much interested in universal properties or minimal models. For example, they often ask me "why don't you add this and that process to make your model more realistic," while I make an effort to reduce the complicatedness in the model. In physics, there are several abstract models that do not fit the details of nature but are essential to understand universal features and unveil general laws, say the Carnot cycle, the ideal gas, the Ising model for phase transitions, and so forth. In this sense, I believe that learning how physics has succeeded in extracting universal laws in nature is essential to establish theories for a life system.

### **Q4: PLEASE HIGHLIGHT YOUR MAJOR FINDINGS IN A SIMPLE WAY**

Our "complex systems biology" is distinguishable from the so-called "systems biology" developed in recent years (Kaneko, 2006). By the word "complex," we do not mean "complicated." Systems that consist of many elements, e.g., molecules within a cell or cells within an organism, exhibit homeostasis. Such systems should be constrained so that consistency between each element and the whole system is maintained, while keeping the reproduction of both the elements and the system. Indeed, we found that one can derive widely applicable rules in the dynamics of gene expression and cellular organization during adaptation to the environment, reproduction, cell differentiation, and evolution. To list a few examples:

(i) Cell reproduction: in a cell that reproduces itself, all molecules are replicated while keeping the composition to some degree. We found that this constraint on consistent reproduction leads to a universal law on the statistical distribution of the abundances of each protein, as well as their fluctuations around their average values across cells (Furusawa and Kaneko, 2003). This law gives a criterion for a steady state of cells, while the fluctuations over cells are important in adaptation and evolution, as will be discussed below.


### **Q5: WHY DO YOU FEEL STRONGLY FOR THE EXISTENCE OF GOVERNING RULES IN LIVING SYSTEMS?**

Of course there is no logical reasoning to demonstrate the existence of universal laws in a living system, represented by a few variables. However, trained biologists have intuition on activity, plasticity, and stability that make things "lively." They somehow characterize "liveliness" by compressing detailed information in such a living system. Probably, they have some concept of liveliness or biological activity, which does not necessarily require a huge number of parameters, but is represented by a few. This suggests that there exists some underlying logic in life that is represented in terms of a few variables. So far we do not know such variables explicitly, though. This situation gives me the impression that we are in the time just before "thermodynamics" was established: we had a sense of "hot" or "cold" but had not yet established the quantitative concept of temperature. Later we reached the concepts of temperature and entropy, from which we reached the universal laws of thermodynamics. We have a sense of biological activity, plasticity, and robustness, but have not reached a proper mathematical formulation yet. Anyway, the living state is a very common form of things (at least on this Earth and probably in the Universe, I hope), and as a genuine physicist, I find it natural to expect the existence of universal laws that govern such a state.

### **Q6: NOISE IN BIOLOGY HAS RECEIVED SIGNIFICANT ATTENTION IN RECENT YEARS. WHAT ARE YOUR EXPERIENCES ON THIS?**

At the end of the 1990s, when we proposed the "isologous-diversification" theory for cell differentiation, one key issue was that the amplification of noise leads to a robust cell distribution through cell–cell interaction (Kaneko and Yomo, 1994, 1999). Our view was summarized as "noise-amplification leads to a noise-tolerant cell society," as a small variation in protein concentration is amplified in the irregular oscillation I mentioned earlier in Q5.

By looking at the data on the distribution of fluorescence of proteins in bacterial cells obtained by flow cytometry experiments by Yomo's group around 2000, we soon recognized that the cell-to-cell variance is quite large, and furthermore that the distribution of fluorescence (or protein concentration) does not obey a Gaussian (normal) distribution, but its logarithm does. We have shown that this is a necessary outcome of a multiplicative stochastic process. This is common in catalytic reactions, as the rate equation of a chemical reaction, in general, has a multiplicative form between substrate and catalyst, and their concentrations fluctuate (Furusawa et al., 2005). The Gaussian distribution of the logarithm of the concentration implies that cell-to-cell variation by noise sometimes ranges over an order of magnitude.
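The link between a multiplicative stochastic process and a Gaussian distribution of the logarithm can be checked with a small simulation. The sketch below is a toy illustration, not the model of Furusawa et al. (2005): it assumes each cell's concentration is multiplied at every step by an independent random factor drawn uniformly around 1. The function names and the values of `n_cells`, `n_steps`, and `spread` are hypothetical. Because the logarithm of the concentration is then a sum of many independent terms, it is expected to be close to Gaussian, while the concentration itself is strongly skewed.

```python
import math
import random


def simulate_cells(n_cells=2000, n_steps=100, spread=0.1, seed=0):
    """Toy multiplicative process: each step rescales the concentration randomly."""
    rng = random.Random(seed)
    concentrations = []
    for _ in range(n_cells):
        x = 1.0
        for _ in range(n_steps):
            # Assumed noise model: a random factor in [1 - spread, 1 + spread].
            x *= rng.uniform(1.0 - spread, 1.0 + spread)
        concentrations.append(x)
    return concentrations


def skewness(values):
    """Sample skewness: about 0 for a symmetric (e.g., Gaussian) distribution."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5


if __name__ == "__main__":
    cells = simulate_cells()
    logs = [math.log(x) for x in cells]
    print("skewness of concentration:", round(skewness(cells), 3))
    print("skewness of log concentration:", round(skewness(logs), 3))
    print("max/min ratio across cells:", round(max(cells) / min(cells), 1))
```

A clearly positive skewness for the raw concentration together with a skewness near zero for its logarithm is the signature described in the text. The max/min ratio gives a rough sense of how widely individual cells can differ under this assumed noise level.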

We then became more interested in the relevance of such fluctuations to adaptation and evolution. During evolution, individuals with higher fitness are selected. Developmental dynamics that give rise to such individuals are continuously bombarded by noise in signal transduction, transcription, or translation. Since this generally perturbs the optimal phenotype, most studies focus on how developmental systems reduce or eliminate such disturbances. However, considering the recent observations of large noise in gene expression, it is natural to ask whether there is any positive role that noise plays in biological organization and evolution. By combining statistical-physics-based theory and evolutionary experiments in the lab, we have demonstrated that there is a positive correlation between noise and the rate of evolution. In other words, developmental robustness to noise facilitates robustness against mutation [see also Q4 (iv)]. This adds a new dimension to the classic problem of nature versus nurture, as it suggests a strong relationship between phenotypic variation by mutation and that by developmental noise (Sato et al., 2003; Kaneko, 2007, 2011a).

As for adaptation at a single-cell level, Kashiwagi, Urabe, Yomo, Furusawa, and I proposed a generic process for it, by noting that the stability of growing cells against intrinsic noise is higher than that of non-growing cells, and that there is a general tendency for a cell to be switched to a state with higher growth by noise in gene expression (Kashiwagi et al., 2006; Furusawa and Kaneko, 2008). This leads to "natural adaptation" of any cell even without the use of a specific signal transduction network. It will be relevant to understanding why bacteria, for example, can adapt to a huge variety of different environments which they probably have not met before.

### **Q7: CAN YOUR FINDINGS BE VALIDATED EXPERIMENTALLY? IF NOT, WHY AND WILL THIS CHANGE IN THE FUTURE?**

When Chikara Furusawa and I proposed in 1998 that irregular oscillation in gene expression provides stemness, many did not believe in the existence of such oscillation. This is because, at that time, gene expression was measured as an average over many cells. As long as the oscillation is not synchronized, oscillation in gene expression, if it existed, would be averaged out and could not be observed. Now, one can measure protein expression in a single cell by imaging techniques. Indeed, 2 years ago, Kobayashi et al. (2009; Kageyama's group) found oscillation in Hes1 protein expression in embryonic stem cells. To our great pleasure, this oscillation disappeared in differentiated cells, consistent with our theory.
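The averaging-out argument can be made concrete with a short sketch. It is a deliberately simplified stand-in: each cell is assumed to oscillate as a regular sinusoid with its own random phase, whereas the oscillation discussed in the theory is irregular. The function names and parameter values are hypothetical. The point is only that the population average of many unsynchronized oscillators is nearly flat, while a single cell (or a synchronized population) shows the full amplitude.

```python
import math
import random


def population_average(phases, t, omega=1.0):
    """Mean expression over cells at time t; each cell is sin(omega*t + phase)."""
    return sum(math.sin(omega * t + p) for p in phases) / len(phases)


def peak_amplitude(phases, n_samples=100, omega=1.0):
    """Largest absolute value of the population average over one period."""
    period = 2.0 * math.pi / omega
    return max(
        abs(population_average(phases, k * period / n_samples, omega))
        for k in range(n_samples)
    )


if __name__ == "__main__":
    rng = random.Random(0)
    # Unsynchronized population: phases spread uniformly over one cycle.
    unsynchronized = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(1000)]
    print("single cell:", round(peak_amplitude([0.0]), 3))
    print("population average:", round(peak_amplitude(unsynchronized), 3))
```

For N independent random phases, the amplitude of the average shrinks roughly as 1/√N, which is why bulk measurements over many cells could miss an oscillation that single-cell imaging reveals.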

The studies on adaptation and evolution I mentioned started from experiments, and in this sense, the theory is formulated to be consistent with experimental findings. The theory, in turn, can then predict something more, which should be confirmed experimentally. Besides such confirmation, the experiments later challenge theorists with new findings. Ideally, theory and experiments progress hand in hand, in the form of an expanding spiral.

For example, the evolution study started from the analysis of an experiment in Yomo's group, and thus in the beginning we proposed a theory to be consistent with experiments, i.e., the proportionality between evolution speed and the isogenic fluctuation of phenotype by noise. Then our theory and simulations made a new prediction: the proportionality between this isogenic fluctuation and the genetic variance, that is, the fluctuation due to genetic variation. Now it is time to check this relationship experimentally, which is ongoing, and I am looking forward to hearing a positive report soon.

As mentioned, the study of "natural adaptation" stemmed from an experiment by Kashiwagi et al. (2006). By embedding an artificial gene network into bacteria, we demonstrated that *E. coli* are able to adapt to an optimal-growth state without the need for a specific induction mechanism. The theory we proposed accordingly is general, so it is now time to carry out an experiment to demonstrate that this natural adaptation indeed works in natural conditions or in a higher organism.

### **Q8: DO YOU THINK JAPAN WILL OPEN DOORS AND INVITE INTERNATIONAL SCIENTISTS TO DO SYSTEMS BIOLOGY RESEARCH IN THE FUTURE?**

Generally, the answer is yes. Especially in research institutes, the doors have already been opened, and this will be further accelerated. As for universities, we have to decide whether to start giving lectures regularly in English. Considering the decrease in the younger population, we need to accept more immigration, and at some point we will probably have to decide on it. To some degree this will be good, but I also have some concerns. Giving lectures and thinking in one's own language in science may be important to cultivate creativity. As the world becomes "Americanized," the originality in each culture may decline, which may also suppress the development of original ideas in science. In fact, the originality in scientific activity was quite high when Japan was more isolated; to give some examples, Yukawa's meson theory, linear response theory in statistical physics, and so forth. As the world is homogenized, the frequency of original breakthroughs in science seems to have declined. Of course, whether we can expect a true breakthrough in systems biology (as we experienced in the emergence of thermodynamics, quantum mechanics, general relativity, Darwinian evolution theory, and so forth) is another question.

I may be one of the minority who expect that a real theoretical breakthrough will take place in biology, and for it, I believe that thinking differently under an appropriate level of isolation is important.

### **Q9: WHAT ARE YOUR FUTURE ASPIRATIONS?**


also understand the condition to recover multipotency in a cell, and characterize a cancer cell.

### **Q10: YOUR ADVICE TO NON-BIOLOGISTS WHO CONSIDER APPLYING THEIR SKILLS IN BIOLOGY**

I have been interested in what life is, and to answer it I first need to know the universal features of non-living systems, for which physics is important. So far, I think mathematics is useful for understanding nature reasonably, and as for the application of mathematics to natural science, physics has been the most successful. So, I recommend studying seriously what life is, in terms of physics and mathematics, but this may be my biased viewpoint.

### **REFERENCES**

expression dynamics endows stem cells with robust differentiation potential. *PLoS ONE* 6, e27232. doi:10.1371/journal.pone.0027232

**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Received: 16 November 2011; accepted: 18 November 2011; published online: 05 December 2011.*

*Citation: Kaneko K (2011) The challenges facing systemic approaches in biology: an interview with Kunihiko Kaneko. Front. Physio. 2:93. doi: 10.3389/fphys.2011.00093*

*This article was submitted to Frontiers in Systems Physiology, a specialty of Frontiers in Physiology.*

*Copyright © 2011 Kaneko. This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.*